**Meet the editor**

Boris Cogan received the Master of Science (MSc) degree in Automatic Control Systems from the Moscow Institute of Physics and Technology (State University) in 1972. In 1986 he received the Doctor of Philosophy (PhD) degree in Computer Science from the Academy of Sciences of the Soviet Union (AS of USSR). He later received the title of Senior Research Fellow from the AS of USSR and that of Associate Professor (Docent) from the Russian Ministry of Higher Education (RM of HE, 1990). In 1993 he received the Doctor of Science degree from the AS of USSR. From 1994 he was a professor at the Far-Eastern State University (FESU); in 1996 he received the lifetime title of Professor from the RM of HE. He is the author of more than 150 publications, including a book, two edited books and an IEEE standard. Prior to becoming a Senior Lecturer at the Faculty of Computing, London Metropolitan University, in 2001, he was Head of the Software Engineering Laboratory at the Institute for Automation and Control Processes and Professor of Software Engineering at FESU, Vladivostok, Russia.

### Contents

- **Preface**
- **Part 1 Systems Engineering Practice**
- **Part 2 New Systems Engineering Theories**

### Preface

The book "*Systems Engineering: Practice and Theory*" is a collection of articles written by developers and researchers from all around the globe. Most of them present methodologies for individual Systems Engineering processes; others consider issues in adjacent knowledge areas and sub-areas that contribute significantly to systems development, operation and maintenance. Case studies include aircraft, spacecraft and space systems development, post-hoc analysis of data collected during the operation of large systems, etc. The collection addresses important issues related to the 'bottlenecks' of Systems Engineering, such as the complexity, reliability and safety of different kinds of systems; the creation, operation and maintenance of services; system-human communication; and the management tasks carried out during system projects. This book is for people who are interested in the present state of the Systems Engineering knowledge area and for systems engineers involved in its different activities. Some articles may be a valuable source for university lecturers and students; most of the case studies can be used directly in Systems Engineering courses as illustrative material.

> **Prof Boris Cogan, MSc, PhD, DSc, Senior Research Fellow,**  Faculty of Computing, London Metropolitan University, UK

## **Introductory Chapter**

### **A Few Words About Systems Engineering**

#### Boris Cogan

*Faculty of Computing, London Metropolitan University, UK* 

#### **1. Introduction**

First of all, why 'Practice and Theory' and not vice versa, which would probably be more traditional? We will see later on…

*Systems Engineering* is a relatively young area of knowledge. Its main elements developed significantly during World War II (mainly for aircraft development and maintenance), because project managers found that considering product components as 'separate entities' with their own attributes provides additional views beyond those available for the individual elements. Nowadays it is generally accepted that any 'complex engineering product', a 'system', is analysed and viewed as a hierarchy of layers of 'simpler sub-systems'. (This refers only to the way systems analysis is done and has nothing in common with the architecture of a system.)

After WWII, numerous military applications, spacecraft of every kind, nuclear power stations, etc. (i.e. products with higher reliability and safety requirements) required Systems Engineering to be separated into a branch of engineering with its own methods, techniques, tools, rules, etc. that distinguished it from other engineering knowledge areas. It is generally considered that the evolution of Systems Engineering began during the late 1950s [INCOSE Handbook].

Since the late 1960s, Systems Engineering standards have been recognised as a separate group, and the corresponding ISO and IEEE standards have been labelled as 'Systems Engineering' ones. (It is worth noting that nowadays more and more Systems Engineering standards are combined with Software Engineering ones, because modern systems are 'software-intensive' or 'software-based', Systems Engineering processes and Software Engineering processes are similar, and practically no technical system exists without a massive amount of software.) However, some 'older' IEEE standards are still assigned to the 'Information Technology' category.

In 1990 the *International Council on Systems Engineering* (INCOSE) was founded as a not-for-profit membership organisation to develop and disseminate the interdisciplinary principles and practices that enable the realisation of successful systems [INCOSE]. The organisation declared as its mission: '*Share, promote and advance the best of systems engineering from across the globe for the benefit of humanity and the planet*'.

Older and newer ISO and IEEE standards and INCOSE materials may give slightly different definitions of what Systems Engineering is; in essence, however, they are about the same thing. '*Systems Engineering: An interdisciplinary approach and means to enable the realisation of successful systems*' [INCOSE Handbook]. We then need to clarify what kinds of systems are meant: '*System: An interacting combination of elements to accomplish a defined objective. These include hardware, software, firmware, people, information, techniques, facilities, services, and other support elements*' [INCOSE Handbook]. What actually matters is that the system under consideration is 'engineered'. As the same INCOSE materials put it: '*The term "Systems Engineering" is only used with respect to the act of engineering a specific system*'. It is neither good nor bad; it is just as it is.

#### **2. My route in systems engineering**

My own acquaintance with Systems Engineering took place in the second half of the 1960s at the *Institute of Control Sciences* (ICS), the Soviet Union's leading research institution in the automatic control knowledge area. The Institute's main sub-areas of research were automatic control theory and the development of elements for automatic equipment [ICS]. My university department (actually, of the Moscow Institute of Physics and Technology – MIPT) was situated at the ICS because, according to the mode of education at MIPT, during the last four years (out of six) students were (and still are) taught by active leading researchers in the corresponding knowledge areas, and the students worked with the researchers at their workplace, not vice versa [MIPT]. Just one example: during one semester of my fifth year, one day's module (lectures) was taught by a Vice President of the International Federation of Automatic Control – IFAC [IFAC], the next module's lectures were delivered by a leading Soviet specialist in the control issues of air and ballistic missile defence, and the following one was given by a leading specialist in air-to-air missile control. It was an extremely wonderful time! Of course, no military terms were used in the lectures; they were about automatic control, not military applications.

During the fourth to sixth years, each MIPT student worked at an ICS laboratory as a researcher. The sixth year was completely dedicated to the Final Project. In line with its mission, ICS was involved in all 'big' military and civil Systems Engineering projects of the Soviet Union, solving the control problems of those systems. Because of the severe secrecy, students usually did not know what their projects were for. We developed 'theories' and functioning prototypes. Whether they were used in real systems or not, we, of course, had no idea (at least, officially). However, meetings with specialists from various system development organisations allowed us (judging by their interest or lack of it) to understand our contribution to Systems Engineering. I was involved (together, of course, with staff researchers of ICS) in developing electronic elements based on magnetic cores with a rather complex configuration, to be used in different technical facilities, including elements of multi-stage rockets (for manned space flights). To this day I do not know whether they were used in real space programmes or not.

So, that was my first involvement in Systems Engineering. The term was not used in our environment because (1) it did not yet exist and (2) we were all too far from real (and *technological*) Systems Engineering processes (in terms of ISO/IEC 15288:2008 / IEEE Std 15288™-2008 [ISO 15288]). However, it was invaluable experience that improved my understanding of engineering and has served me throughout my subsequent 'adult' life.

I am deeply grateful to all my teachers and colleagues from ICS for the years spent at the Institute (I was a PhD student at ICS later on, but specialised in Software Engineering).

During my long employment at the Institute for Automation and Control Processes of the Russian Academy of Sciences in Vladivostok (the Russian Far East) [IACP], I was mainly involved in developing software for different 'computer-based' systems, and I did research in the Software Engineering knowledge area, creating new methods, techniques and tools for increasing software productivity. However, I took part in Systems Engineering projects as well.

In the 1980s-90s, as Head of Testing, I was involved in a large project to develop a prototype of a distributed military system (a legacy of the 'Cold War'). The 'core' of the system was 'artificial intelligence' (which may be interpreted in different ways in this context) developed at the Expert Systems Department of IACP. The knowledge base, containing validated knowledge of the best specialists in the application area, together with an extremely efficient inference engine, allowed the corresponding activity to be monitored over a very large geographical area, practically in real time. The bottlenecks were the sensors (I remind the reader that it was only a prototype); aircraft and ships played the role of sensors. Nowadays, spacecraft are very common 'sensors' for those kinds of applications, and there is no need to move them from one orbit to another, but we did not have access to spacecraft at that time. My testing team included specialists who had developed, tested and maintained software for the 'Buran' system, the Russian analogue of the US Space Shuttle, and who had taken part in the launch and landing of that extraordinary automatic craft. It was a great experience for me.

In the mid-1990s, after a long and interesting discussion during my work on another project in Scotland, Professor Richard Thayer [R. Thayer] invited me to join his team to finish the development of *IEEE Std 1362 IEEE Guide for Information Technology—System Definition — Concept of Operations (ConOps) Document*. This standard was published in 1998 as IEEE Std 1362™-1998 and was reaffirmed in 2008 for the next 10 years without a letter being changed. '*This guide prescribes the format and contents of the concept of operations (ConOps) document. A ConOps is a user oriented document that describes system characteristics of the to-be-delivered system from the user's viewpoint. The ConOps document is used to communicate overall quantitative and qualitative system characteristics to the user, buyer, developer, and other organisational elements (e.g., training, facilities, staffing, and maintenance). It describes the user organisation(s), mission(s), and organisational objectives from an integrated systems point of view'* [IEEE 1362]. As a matter of fact, this kind of user-oriented document should be developed for any system that is planned to be built.

Thus, in retrospect, those three decades of my professional life gave me great practical experience in the development of systems, without any deep knowledge of a 'theory' of that development (of Systems Engineering). Indeed, at that time there were no 'validated' 'theories' (process standards) available for the projects I was involved in (I am not saying that there was no theory in the country at all). However, as a matter of fact, I gained my understanding, if not of Systems Engineering processes then at least of what Systems Engineering is, from practice.

The next period of my life has been lecturing at London Metropolitan University [LMU]. In my Software Engineering modules for MSc students I needed (and still need) to clarify the place of Software Engineering in the context of Systems Engineering. For that I had to study at least some 'theory' of systems development and become familiar in detail with many Systems Engineering standards related to the processes of Systems Engineering projects. So I would call this period of my life 'familiarisation with Systems Engineering theory'. The two 'stages' of my professional career, 'practice' and 'theory', are the first reason for the title of the book.

The second reason (and a really important one) is a bit 'philosophical'. People develop systems using existing knowledge ('theory'). During a system's development, and then its operation and maintenance, they come to understand better the processes they have applied and used, their merits and demerits, their own mistakes and the 'holes' in the theories applied. They need new theories, or at least improved old ones. Reasonable theories are always based on practice. That is why (at least in engineering) theory follows practice, not vice versa. Now we know the reasons for the title of the book.

#### **3. Part I: Systems engineering practice**

The first article of the book, '*Methodology for an Integrated Definition of a System and its Subsystems: The case-study of an Airplane and its Subsystems*' by *Sergio Chiesa, Marco Fioriti & Nicole Viola*, demonstrates the application of the current 'theory' of Systems Engineering to the example of an aircraft, one of the most complex computer-intensive modern systems. The reliability and safety requirements for any aircraft are extremely high, which demands a very sophisticated development process for all of the aircraft's sub-systems. From another point of view, the development is very expensive and sometimes calls for compromises (trade-offs). These conditions require specific methodologies and well-grounded requirements for the product. The article presents such a methodology and, in addition to its research value, provides wonderful material for teaching Systems Engineering and Aviation students.

Aircraft and missiles are extremely 'complex' objects to develop. However, space systems are usually even more 'complex' because, in addition to the craft themselves, they include many specific ground services to launch and operate the craft and to receive, collect and process the information sent by the spacecraft. All this demands additional methods or even methodologies to develop a system that works and meets the other requirements. The second article of this Part, '*Complex-systems Design Methodology for Systems-Engineering Collaborative environment*' by *Guido Ridolfi, Erwin Mooij & Sabrina Corpino*, presents a methodology designed for implementation in collaborative environments, to support the engineering team and the decision-makers in the activity of exploring the design space of complex-system, typically long-running, models. The term 'complexity' is used in everyday life without much thought about what 'complex' means. However, for the developers of a system, the complexity has to be measured (or estimated) in particular measurement units to understand which methods and solutions best suit the requirements (trade-offs are needed, as usual). The authors show that the contribution of the human factor is fundamental for obtaining a final product with a high cost-effectiveness value. This means that any human activity in Systems Engineering processes needs specific methodological and tool support as much as possible. As a case study, an Earth-observation satellite mission is introduced at the beginning of the article and is used throughout the chapter to show step-by-step implementation of the suggested methods. Like the first article, this one is a good source of teaching material.


Considering system requirements: first of all, any system performs functions. As a system is viewed as being 'composed' of a few lower layers, *Functional Analysis* is carried out on each layer for each sub-system. A particular case of how it can be done is a valuable source of material for systems engineers and for students. The third article of Part I, '*Functional Analysis in Systems Engineering: Methodology and Applications*' by *Nicole Viola, Sabrina Corpino, Marco Fioriti & Fabrizio Stesina*, gives the opportunity to see practical applications of Functional Analysis. Functional Analysis applies in every phase of the design process; it turns out to be particularly useful during conceptual design, when there is still a wide range of potentially feasible solutions for the future product. The valuable role of Functional Analysis consists in identifying as many available options as possible, without missing any ideas that may offer significant advantages. The article gives very vivid examples of the application of Functional Analysis in the development of various systems. Three of them deserve special mention: (1) Functional Analysis at sub-system level, to define the avionic sub-system of an aircraft; (2) Functional Analysis at system level, to define a satellite in Low Earth Orbit; (3) Functional Analysis at system-of-systems level, to define a permanent human Moon base. The paper is wonderful illustrative material for a range of engineering university courses.

As has already been mentioned, *safety* is nowadays one of the most important properties of any complex system. As shown in the next article of the book, '*A Safety Engineering Perspective*' by *Derek Fowler & Ronald Pierce*, *safety* is actually a set of attributes that have to be considered and measured separately. The authors show that the concept of '*reliability*' should not be confused with, or even considered together with, the concept of 'safety'. Reliable system elements may still contribute to the unreliability of a system simply because they do not suit the requirements of that particular system. In other words, the functionality of the system, of its components and of their elements has to be carefully analysed and expressed at all layers of the system hierarchy: requirements for a higher-layer architecture component have to be carefully allocated to the 'sub-systems' of the next layer, including the reliability and safety requirements. The article introduces the principles of safety assurance and safety cases and shows how they should drive all the processes of a safety assessment throughout the project life cycle.

The development of complex systems is extremely expensive. If the system is of a completely new kind, there are no historical data on which to base a new project. Managers more or less understand how to cost hardware and, to a lesser extent, software. However, this is not the case for the integration and interfaces of complex systems, which need new methods and tools for estimation. When the cost of developing larger and more complex systems, systems of systems, and enterprises is estimated, managers' ability to make accurate (or at least adequate) estimates becomes less relevant and reliable. The following, fifth, article of the book, '*Life Cycle Cost Considerations for Complex Systems*' by *John V. Farr*, presents some of the methods, processes, tools and other considerations for analysing, estimating and managing the life cycle costs of complex systems. It considers some estimation models and tools for hardware, software, integration at the system level, and project management. It briefly describes *Cost Management* as a separate task of a Systems Engineering project. The article emphasises that systems engineers are usually not trained to produce accurate system development cost estimates, and that proper methods, processes and tools could significantly help them in this task.

According to the definition of a system, a 'system' may take different forms; in particular, it may be a service ('*Service: Useful work performed that does not produce a tangible product or result, such as performing any of the business functions supporting production or distribution*' [PMBOK]). Any service first has to be created; then it operates, and it needs maintenance, repair, upgrade, take-back and consultation. Any product somehow influences the environment during its life. When a service is developed, the environmental problems have to be carefully analysed: what is the influence? The sixth article, '*Integrated Product Service Engineering - Factors Influencing Environmental Performance*' by *Sofia Lingegård, Tomohiko Sakao & Mattias Lindahl*, analyses widely used strategies of *Service Engineering* and suggests improvements to those strategies. Some products are used for 20-40 years, and knowledge about them certainly increases during that time. The characteristics of the product may turn out to deviate from those assumed during development; environmental requirements may change over the decades; 'better' products with the same mission may be developed, and so on… Unfortunately, in real practice, rather often little care is taken in product development (and in its specification) over future services, maintenance and end-of-life treatment. Traditionally, the initial focus is on developing the 'physical' product; once that is done, a possible service (intangible product) is developed, but this is hindered by the limitations set up in, and resulting from, the physical product. When the *Integrated Product Service Offering* proposed by the authors is used, the development is accomplished in an integrated and parallel way.

During the use of a complex system, an extremely large amount of data is usually collected. What should be done with these data, and how, to extract 'useful' information for planning new projects and developing new systems? It has been a rather 'normal' situation that people did not know what to do with the information, so that collecting it and keeping detailed records was just a waste of money. The obstacles to using the data properly are the absence of available methods for analysing that amount of data, the lack of computational resources, the impossibility of interpreting the results of the analysis, etc. New approaches are needed to cope with these problems. The seventh article of the book's Part I, '*Leveraging Neural Engineering in the Post-Factum Analysis of Complex Systems*' by *Jason Sherwin & Dimitri Mavris*, presents such an approach. The authors suggest drawing on the breadth of results and techniques emerging from neural engineering to bolster systems analysis for engineering purposes. In particular, instead of relying on an inconsistent mapping made by human experts to design analysis, why not understand some of the cognitive elements of expertise and, in turn, apply that comprehension to both systems analysis and manipulation? As the case study, methods of neural engineering were applied to the post-factum analysis of Iraq's stability during 2003-2008. Such an analysis was never performed in a real context; however, the authors frame the problem within the context of its utility to a decision-maker whose actions influence the outcome of such a system.

Usually, when Systems (or any other kind of) Engineering is discussed in the literature, a set of processes consisting of activities and tasks is presented. But that is only one 'dimension' of project management; there are two others: (1) the work products to use and generate, and (2) the people and tools involved. Any Systems Engineering organisation has potential capabilities for creating systems (or other products). *Capability Portfolio Management* allows an organisation to coordinate the capabilities needed to correspond to potential projects (investments). Most Capability Portfolio Management processes are too complex to be used by inexperienced managers. The eighth article of the book, '*Abstracted Effective Capabilities Portfolio Management Methodology Using Enterprise or System of Systems Level Architecture*' by *Joongyoon Lee*, suggests a simpler and more practical methodology for developers and enterprises. The process consists of only 16 sequential tasks and corresponds to *ISO/IEC 24744 Software Engineering - Metamodel for Development Methodologies*, ISO, 2007.

Systems are developed by engineers who are taught and trained for that. There are potentially different approaches to such teaching. One is that sub-system specialists are taught how to develop sub-systems, while certain others know how to integrate the sub-systems into a system. Another approach is to make all system project participants familiar with the development of systems, not just with system components and their integration. The second approach allows better understanding and communication. The ninth paper of the Part, '*System Engineering Method for System Design*' by *Guillaume Auriol, Claude Baron, Vikas Shukla & Jean-Yves Fourniols*, presents some educational materials, the process and the outcomes of teaching such an engineering approach. The case used to illustrate the approach is rather practical; it includes commonly used sensors, a wireless network and computational facilities. Various issues can be raised while teaching on wireless sensor networks: electronic design, risks to humans, energy management, telecommunication technologies, etc. The case demonstrates all the implementation processes and some management processes (in terms of ISO/IEC 15288) for a linear project life cycle model. The paper may be very useful as reading (or even as a set of educational ideas) for students of various engineering courses.

Each development project needs engineers with particular knowledge and skills. When a project team is formed, the project management have to be sure that the project participants correspond to the project requirements. How can the competencies of project teams be tested? There are some traditional approaches, and the last, tenth, article of this part, '*Assessing the Capacity for Engineering Systems Thinking (CEST) and other Competencies of Systems Engineers*' by *Moti Frank & Joseph Kasser*, suggests a new tool for this. As there is no known way of directly 'measuring' the thinking skills of individuals, an indirect way is needed; for example, IQ tests are indirect pen-and-paper tests for 'measuring' the intelligence of individuals. The tool combines questionnaires for three main concepts: (1) success in a systems engineering position, (2) interest in systems engineering positions and (3) the capacity for engineering systems thinking (CEST); they are all interconnected and interrelated. The will and interest to be a systems engineer basically mean the desire and interest to be involved in job positions that require CEST. In other words, the authors hypothesise that there is a high positive correlation between the extent of an individual's engineering systems thinking and his/her interest in what is required of successful systems engineers.

#### **4. Part II: New systems engineering theories**

According to the *U.N. telecommunications agency*, there were 5 billion mobile communication devices across the globe by the end of 2010 [BBC], and the quantity of mobile phones produced and their rate of diffusion are still increasing. The devices are used by all people regardless of race, age or nationality, but their requirements for the devices differ. In other words, the quality of the devices (as correspondence to requirements) should be treated differently. From another point of view, the level of quality has to be 'high enough' for all categories of devices, and the levels need to be compared. For effective communication between parties a common 'quality language' is needed, and unitless process capability indices are widely used for this purpose. However, according to the author of the first article of Part II, '*Usage of Process Capability Indices during Development Cycle of Mobile Radio Product*' by *Marko E. Leinonen*, the existing process capability indices do not fully suit modern practice. The article analyses the current approaches to the definition and calculation of the indices and proposes new equations for one-dimensional process capability indices with statistical process models, based on calculations and simulations. In addition, process capability indices are defined for multidimensional parameters, analogously to the one-dimensional process capability indices. One of the main differences between one- and two-dimensional process capability index analysis is that the correlation of the two-dimensional data should be included in the analysis.
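For orientation only (the article derives its own, more general equations), the most common one-dimensional indices in their textbook form relate the width of the specification band to the spread of the process, with USL and LSL the upper and lower specification limits and µ and σ the process mean and standard deviation:

```latex
C_p = \frac{\mathrm{USL} - \mathrm{LSL}}{6\sigma},
\qquad
C_{pk} = \min\!\left(\frac{\mathrm{USL} - \mu}{3\sigma},\;
                     \frac{\mu - \mathrm{LSL}}{3\sigma}\right)
```

Being unitless, such indices allow the quality levels of quite different parameters to be compared in a common 'quality language'; the article extends this idea to two-dimensional (correlated) parameters.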

Systems engineers communicate with each other during a system's development, and users communicate with the system during its operation/use. The effectiveness of these communications has a significant effect on the result of the system's development and on the success of the system's use. *Human Engineering* may be considered (within the context under consideration) as a sub-area of the Systems Engineering knowledge area. The second article of Part II, '*Augmented Human Engineering: A Theoretical and Experimental Approach to Human Systems Integration*' by *Didier Fass*, focuses on one of the main issues for augmented human engineering: integrating the *biological user's needs* into its methodology for designing human-artefact systems integration requirements and specifications. To take into account biological, anatomical and physiological requirements, the author validates a theoretical framework. He explains how to ground augmented human engineering in the *Chauvet* mathematical theory of integrative physiology as a fundamental framework for human system integration and augmented human design. The author proposes to validate and assess augmented human domain engineering models and prototypes by experimental neurophysiology. He presents a synthesis of his fundamental and applied research on augmented human engineering, human system integration and human *in-the-loop* system design and engineering for enhancing human performance – especially for technical gestures in safety-critical systems operations such as surgery, astronauts' extra-vehicular activities and aeronautics.

Nowadays e-Infrastructures are becoming more and more widespread in the world, mainly for research and development. *'The term e-Infrastructure refers to this new research environment in which all researchers - whether working in the context of their home institutions or in national or multinational scientific initiatives - have shared access to unique or distributed scientific facilities (including data, instruments, computing and communications), regardless of their type and location in the world'* [e-IRG]. It is obvious that, being in some sense a 'super-system', an e-Infrastructure cannot take into account all the technologies used in the 'sub-parts' of the structure, the peculiarities of different groups of researchers, different cultures and so on. A harmonised approach (a meta-model) is needed for the creation of suitable e-Infrastructures. The third article of Part II, '*A System Engineering Approach to e-Infrastructure*' by *Marcel J. Simonette & Edison Spina*, presents such an approach. It aims to deal with the interactions between e-Infrastructure technologies, humans and social institutions, ensuring that the emergent properties of the system may be synthesised, engaging the right system parts in the right way to create a unified whole that is greater than the sum of its parts.

Generally, no big/complex system can be developed by an organisation on its own; tens and even hundreds of other Systems Engineering and other engineering organisations may be involved in the project. A rather complicated management task of dealing with numerous sub-contractors then emerges. The two *Agreement Processes* of *ISO/IEC 15288:2008*, the *Acquisition and Supply processes*, cover this task: '*These processes define the activities necessary to establish an agreement between two organizations. If the Acquisition Process is invoked, it provides the means for conducting business with a supplier: of products that are supplied for use as an operational system, of services in support of operational activities, or of elements of a system being developed by a project. If the Supply Process is invoked, it provides the means for conducting a project in which the result is a product or service that is delivered to the acquirer.*' The fourth article of Part II, '*Systems Engineering and Subcontract Management Issues*' by *Alper Pahsa*, presents a possible interpretation of the activities and tasks of the ISO/IEC 15288 Agreement Processes in terms of the INCOSE materials.

Tactical wireless radio frequency communication systems are a kind of communication system that allows the interoperability and integration of Command, Control, Computers, Communications, and Information and Intelligence, Surveillance and Reconnaissance Systems in the field of information management control in modern armed forces. According to current practice, these systems are rather vulnerable. So, both when they are under development and when they are in use, additional methods of analysis are needed to decrease their vulnerability. The fifth article of Part II, '*System Engineering Approach in Tactical Wireless RF Network Analysis*' by *Philip Chan, Hong Man, David Nowicki & Mo Mansouri*, presents an approach that uses mathematical Bayesian networks to model, calculate and analyse all potential vulnerability paths in wireless radio frequency networks.
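As a rough illustration of the kind of calculation such a network supports (a toy model with invented numbers, not the authors' actual network or parameters), consider three binary events along a hypothetical vulnerability path: a node is exposed, a link is compromised given that exposure, and data are intercepted given the compromise. Marginalising over the hidden states gives the probability that the path is exploited:

```python
from itertools import product

# Hypothetical conditional probability tables (illustrative numbers only).
p_exposed = {True: 0.2, False: 0.8}          # P(node exposed)
p_compromised = {True: 0.6, False: 0.05}     # P(link compromised | node exposed = key)
p_intercepted = {True: 0.7, False: 0.01}     # P(data intercepted | link compromised = key)

# P(data intercepted) = sum over all hidden states of
# P(exposed) * P(compromised | exposed) * P(intercepted | compromised).
p_path = 0.0
for exposed, compromised in product([True, False], repeat=2):
    p_c = p_compromised[exposed] if compromised else 1.0 - p_compromised[exposed]
    p_path += p_exposed[exposed] * p_c * p_intercepted[compromised]

print(f"P(data intercepted) = {p_path:.4f}")  # 0.1204 with the numbers above
```

Chaining such conditional tables over all the nodes and links of a real network, and ranking the paths by the resulting probabilities, gives an idea of the kind of vulnerability-path analysis the article describes.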

The engineering of systems usually involves many disciplines and knowledge areas. The disciplines have their own terminology and standards, and often the same terms have different semantics in different disciplines. The same is true of standards; for example, process standards may present similar processes in somewhat different ways and in different terms. Harmonising the standards is a slow and difficult process, and ISO and IEEE Working Groups have been carrying out these activities for decades. This does not prevent independent researchers from trying to create their own synergetic models. The last article of the book, '*Creating Synergies for Systems Engineering: Bridging Cross-disciplinary Standards*' by *Oroitz Elgezabal & Holger Schumann*, is an attempt to merge standards related to Systems Engineering even though they officially refer to different knowledge areas.

#### **5. References**


CBC: CBC Money Watch, http://www.cbsnews.com/stories/2010/02/15/business/main6209772.shtml

e-IRG: e-Infrastructure Reflection Group, http://www.e-irg.eu/

IACP: Institute for Automation and Control Processes, http://www.iacp.dvo.ru/english/institute/institute.html

IEEE 1362: IEEE Std 1362-1998, *IEEE Guide for Information Technology - System Definition - Concept of Operations (ConOps) Document*, IEEE, 1998.

LMU: London Metropolitan University, www.londonmet.ac.uk

MIPT: Moscow Institute of Physics and Technology (State University), http://phystech.edu/about/

PMBOK: ANSI/PMI 99-001-2004, *A Guide to the Project Management Body of Knowledge, Third Edition*, 2004.

R. Thayer: Richard H. Thayer, http://www.richardthayer.com/bio.htm

**Part 1**

**Systems Engineering Practice**

### **Methodology for an Integrated Definition of a System and Its Subsystems: The Case-Study of an Airplane and Its Subsystems**

Sergio Chiesa, Marco Fioriti and Nicole Viola

*Politecnico di Torino, Italy*

#### **1. Introduction**

A modern airplane is, without any doubt, one of the clearest and most convincing examples of a "complex system". A modern airplane in fact consists of various types of elements based on different technologies (structures, mechanics, electrics, electronics, fluids, etc.). Each element has specific tasks to perform, and all the elements are harmonically integrated to constitute the whole system. Moreover, the airplane is a particularly critical system because of quite obvious safety reasons, because of the relevance of its mission, because of its high costs and, finally, because of its long Life Cycle. Figure 1 shows an example of a modern transport aircraft.

Fig. 1. Alenia C-27J

Let us consider the case of such an airplane, whose mission statement sounds like: "To transport in flight a certain payload from point A to point B". At first glance the airplane can be seen as a single entity able to perform a well-defined function but, getting more into the details, the airplane appears to consist of various parts, all harmonically integrated and working concurrently to accomplish the same mission. For instance, taking into account Figure 1, different items, such as the wing, the fuselage, the horizontal and vertical tails, the engine nacelles with propellers and the wheels of the landing gear (when the aircraft is on the ground), can easily be identified. By looking at the whole aircraft in more detail, other items can be identified or at least imagined, such as the structural elements, the engines and the many mechanical, electronic and fluidic installations, attributable to the numerous and varied technologies present onboard the aircraft.

#### **1.1 Terminology**

Before proceeding any further, it is worth clarifying the terminology related to the so-called "system view" and used in the remainder of the chapter.

Taking into account the functional decomposition of the aircraft, it is quite obvious, the aircraft being a complex system, that at the first level of the physical tree there are not single items but groups of items, harmonically integrated to perform certain functions. In a rigorous approach from the terminology point of view, these groups of items should be identified as "subsystems". However, practically speaking, all the first-level building blocks of the aircraft physical tree (indicated in Figure 2 as subsystems) are usually referred to as "systems" (like, for instance, the avionic system, the fuel system, the landing gear system, etc.), as they gather together many different pieces of equipment. This ambiguity confirms a typical characteristic of the system view of complex systems: the concept of system can be applied at different levels. The aircraft system is therefore formed by "n" "subsystems", which in their turn may be thought of as "systems", consisting of the integration of different pieces of equipment. A further level of subdivision may also be introduced, in order to split each subsystem into sub-subsystems, made up of various pieces of equipment, as Figure 3 shows.

Fig. 2. System view terminology for the aircraft physical tree

Figure 3 illustrates the physical tree of the avionic system (more correctly a "subsystem" from the terminology standpoint) of a modern transport aircraft. Because of its high complexity and the great number of functions it performs, the avionic system is in its turn decomposed into several systems (more correctly "sub-subsystems"), which have to accomplish different functions. In particular, in the example presented in Figure 3 there are four systems to accomplish the navigation ("Navigation System"), flight control ("Flight Control and Auto-Pilot System"), communications ("Communications System") and detection ("Radar System") functions. For the sake of brevity, only the subdivision of the radar system into pieces of equipment is shown in Figure 3. There are two different types of radar: the weather radar and the altimeter radar. They both interface with the same integrated radar display and its associated processor. Finally, it is worth noting that the pieces of equipment themselves, at least the complex ones, are not single entities at all but may be further decomposed into modules, which quite often are Line Replaceable Unit (LRU) modules, i.e. items that can be replaced quickly at an operating location, in order to minimise the aircraft down time for maintenance.

Fig. 3. Transport airplane avionic system physical tree

Please note that in the remainder of the chapter the subsystems are referred to as systems for the above mentioned reasons.

#### **1.2 State of the art and trends of the aeronautical systems**

Before presenting the methodology, it is worth describing the state of the art and the general trends of the aeronautic systems. Only the main systems onboard medium/large airplanes are here considered.

Figure 4 illustrates the main systems of a transport aircraft and all their interactions, in particular in terms of power exchange. Please note that the building blocks with dotted lines are not dealt with specifically in the present text. By looking at Figure 4 it is possible to note that:


- the electrical system (as onboard the aircraft there are users that can be supplied only with electric power, like lights, electronics, etc.). This solution is represented by the current successful trend of the so-called "All Electric Aircraft", which shows quite a few advantages over the traditional aircraft in terms of simplicity and rationality. Intermediate solutions also exist, as the so-called "More Electric Aircraft" testifies, where the engine power is initially transformed only into electric power and then partially into hydraulic and/or pneumatic power.

Fig. 4. Transport airplane system and its (sub-)systems

Table 1 summarizes functions, types and features of the main onboard systems.

| SYS | FUNCTIONS PERFORMED | SYSTEMS CONFIGURATIONS | NOTES |
|---|---|---|---|
| AVIONIC SYSTEM | To acquire information (from the airplane, from the external environment, from other entities, from telemetry), to process it and to provide it to the crew, to the airplane and to other entities by means of telemetry. | — | The avionics is considered at today's state of the art, taking into account the usual kinds of equipment; the new trend towards "Integrated Modular Avionics" is not considered in a very preliminary approach. Data exchange between equipment is based on a data bus. |
| ENVIRONMENT CONTROL SYSTEM | To provide all people onboard the aircraft with correct values of total air pressure, partial O2 pressure and temperature. | — | Two kinds of CAU can be envisaged: "vapour cycle" and "air cycle". If the CAU output air temperature is below 0°C, it must be mixed with re-circulated cabin air before being introduced into the cabin. |
| ANTI-ICE SYSTEM | To avoid problems due to ice formation on the airplane external surfaces. | — | The ice problem increases CD0, decreases CLMAX, causes jamming of mobile devices, perturbs the air intake flow and causes propeller dynamic unbalance. A new kind of anti-ice system for the wing leading edge, characterised by a very low electric power requirement, is the "Impulse System". Apart from the anti-ice or de-ice actions, the electric ice protection of hinges, compensation horns, small sensors, windshields and propellers should also be considered. |
| FUEL SYSTEM | To perform pressure refuelling, allowing tank venting. To store onboard all the fuel necessary to the engines and APU and to feed them when requested. | — | This system greatly affects the aircraft configuration because of the extension and great volume of its tanks. The booster pumps are usually electrically driven. |

Table 1. Functions, types and features of the main onboard systems

#### **2. Airplane system design**

As already said, an airplane is a complex system, consisting of many different elements, all harmoniously integrated to form a unique entity designed to perform a well-defined mission.

Let us now examine the complex process which, starting from the customer needs and moving on to the definition of requirements, proceeds with the development and then the manufacturing of the new airplane. Figure 5 schematically illustrates this process. Considering a reference frame with time on the x-axis and the level of detail on the y-axis, it can be noted that, starting from the customer needs, the new product is first defined at system level, then at subsystem level and eventually, getting deeper into the design flow, at equipment level. Every successive step, which corresponds to a new design phase, is an iterative process (see Figure 5), and the results of each phase are strongly affected by those of the previous phase. If we look at Figure 5, we can therefore understand that, starting from the customer needs and then the requirements definition, the process goes through all design phases (from the conceptual to the preliminary and eventually to the detailed design) following a typical top-down approach, with an increasing level of detail from the system to the equipment. Then, once the items of equipment have been defined and thus bought and/or manufactured, they are tested and integrated to form first the subsystems and eventually the whole system through the final assembly, according to a typical bottom-up approach. Once the final assembly has been completed, new activities at system level can be performed. After successfully accomplishing these activities, i.e. the system functional testing, the operative life of the new product can begin.

Fig. 5. The system design process

#### **2.1 Airplane conceptual design**

Taking into account the whole design process presented in Figure 5, it is quite clear that the main criticality of the conceptual design phase lies in the capability of generating (Antona et al., 2009) a first idea of the new product. A first idea of the future product implies:

a. architectural choices, i.e. definition of the global product architecture in terms of shape and type of the main elements and mutual location of the elements themselves. It is worth noting that the various alternatives can generate quite a few combinations, which are all potentially feasible and which shall then be traded off to select the best ones. For the sake of clarity, let us consider a modern medium passenger transport airplane. The possible alternatives for its architectural layout may be expressed in terms of:

Once the concept of the future product has been generated, the level of detail is still so poor that the manufacturing process could never begin. In order to enter production, the design of the future product has to proceed from the conceptual to the preliminary and eventually to the detailed design phase, but this evolution requires a great deal of resources in terms of time, people and obviously money, and cannot be pursued unless the first idea of the future product has been declared feasible and competitive at the end of the conceptual design phase. It is worth remembering here that the conceptual design phase may sometimes also be called "feasibility study" or "feasibility phase".

At the end of the conceptual design phase we thus have a first idea of the future product that cannot yet be manufactured but can without any doubt be evaluated and compared with other similar, potentially competing products, which may already exist or still be under development.

The conceptual design phase is therefore extremely relevant because:


At the same time the conceptual design phase is extremely critical because:


Taking all these considerations into account, it seems absolutely valuable and interesting, both for pure methodological and more applicative aspects, to improve the conceptual design activities, specifically the aerospace systems conceptual design activities, in terms of accuracy and thoroughness of the results achieved.

Fig. 6. Conceptual design relevance

#### **2.2 Airplane systems conceptual design**

Unlike the past, when the system view of the airplane was not at all evident and the results of the conceptual design were almost exclusively devoted to the preliminary definition of the airplane main characteristics, today the systems engineering approach is widely accepted and appreciated, and the results of the conceptual design also include initial basic choices for the onboard systems. Please note that these initial choices for the onboard systems lay the groundwork for the subsequent activities of systems development during the successive design phases, as shown in Figure 5. It is quite obvious that the capability of preliminarily defining the onboard systems already during the conceptual design phase implies more accurate and detailed results of the conceptual design itself. The initial definition of the onboard systems in fact allows a more precise and reliable estimation of the whole system characteristics (like, for instance, the system empty weight, given by the sum of the onboard systems weights) and makes the start of the successive preliminary design activities easier. However, it is clear that more accurate and detailed results require a more complex conceptual design phase, which can be successfully faced today thanks to computer program automation and to new powerful software tools.

Figure 7 schematically illustrates the main steps of conceptual design according to the traditional approach (airplane conceptual design) and to the proposed innovative approach (airplane & systems conceptual design).

As Figure 7 shows, the traditional approach to conceptual design envisages both architectural and quantitative choices, mutually interrelated, to generate the first idea of the future product (see also sub-section 2.1). According to this approach, conceptual design includes just the identification of the onboard systems of the future product and the estimation of their weights. Unlike the traditional approach, the innovative approach, besides the architectural and quantitative choices, also envisages the preliminary definition of the onboard systems, once the systems themselves have been identified. For every onboard system, the preliminary definition implies:



- choice of the system architecture through block diagrams at main equipment level;
- initial sizing of such blocks, in terms of weight, volume and power required, on the basis of their performance requirements, in order to be able to start selecting them;
- preliminary studies of equipment and system installation onboard the airplane, on the basis of main equipment weight and volume considerations. These preliminary studies on system installation allow a more accurate estimation of the airplane mass properties;
- evaluation of mass and power budgets on the basis of the weight and power required by each system equipment.


Fig. 7. Main steps of conceptual design according to the traditional and the innovative approach

Drawing some conclusions, we can say that, as the proposed new approach guarantees more accurate and detailed results, and as the problem has not been extensively addressed so far (unlike what has happened in other disciplines, like, for instance, structures, aerodynamics and flight mechanics, whose mathematical algorithms have been integrated in a Multi-Disciplinary Optimization, MDO, context in conceptual design), the development of a conceptual design methodology that pursues the new approach (see the right-hand side of Figure 7) appears extremely useful and valuable. ASTRID (Aircraft on board Systems Sizing And TRade-Off Analysis in Initial Design phase) is the acronym of the innovative conceptual design methodology proposed by the authors. ASTRID is based on a dedicated software tool to easily perform iterations and successive refinements and to make the evaluation and comparison of various potentially feasible alternatives possible. ASTRID will be the main topic of the next section.

#### **3. Airplane system innovative conceptual design methodology**

The main features of the proposed new conceptual design methodology for the airplane system are the early investigation of avionics and onboard general systems and their integration with the traditional activities of conceptual design, i.e. the definition of the system architecture and the accomplishment of system sizing, in terms of weight, volume, performances and system cost estimation. However, unlike in the traditional approach to preliminary system sizing, avionics and onboard general systems cannot be easily assessed through a few simple relationships. It is worth remembering here that, according to the traditional approach, the study of avionics and onboard general systems starts only after at least a preliminary concept of the aircraft has been defined. The conventional sequence of design activities, characterized by aircraft conceptual design followed by the preliminary assessment of avionics and onboard general systems, is still the current state of the art, as a considerable number of valuable references, such as Daniel P. Raymer (Raymer, 1992) and Jan Roskam (Roskam, 1990), testify. The same approach is pursued by two important software tools for aircraft design, RDS – "Integrated aircraft design and analysis" (by Conceptual Research Corporation, a company founded and led by Daniel Raymer) and AAA – "Advanced Aircraft Analysis" (by DAR Corporation, founded by Jan Roskam), which have been developed on the basis of the works of, respectively, Daniel P. Raymer and Jan Roskam and have recently become widespread also at industrial level. The relevance of avionics and onboard general systems in aircraft conceptual design is witnessed by John Fielding from the Cranfield College of Aeronautics (Fielding, 1999), who dedicates a great effort to the description of avionics and onboard general systems; but, as his work provides the reader with just an introduction to aircraft design issues, no specific methodology is reported in the text. On the basis of this preliminary assessment, the development of ASTRID seems to be highly desirable, in order to support the design process of new aircraft.

#### **3.1 General context, goals and overview of ASTRID methodology**

Before proceeding any further, let us briefly review the most common and widely used methodologies of aircraft conceptual design, focusing in particular on the way in which avionics and onboard general systems are taken into account. There are two main types of approaches:


- methodologies in which the aircraft Maximum Take-off Gross Weight (MTGW) is defined in such a way as to match the requirements (generally expressed in terms of performances) and is broken down into payload, fuel and empty weight, the empty weight often being defined as a percentage of the MTGW itself;
- methodologies in which the aircraft MTGW is estimated on the basis of the requirements (for example the fuel weight depends on the range requirement) and the components of the empty weight are estimated on the basis of Weight Estimation Relationships (WERs).

It can be noticed that in the first case every consideration about avionics and onboard general systems is postponed to a later stage of the design process, where, apart from all other requirements, avionics and onboard general systems shall comply with the previously defined constraint of global weight. Unlike the first case, in the second type of methodologies avionics and onboard general systems are taken into account from the very beginning of the design process, at least from the point of view of weight, as their weight is established as part of the empty weight, either as a percentage (in simplified methodologies) or as a result of WERs for the various systems (Staton, 1972) (Chiesa et al., 2000). It is interesting to observe that, on the basis of the WERs for single systems, a corresponding set of CERs (Cost Estimation Relationships) has been derived by several authors (Beltramo et al., 1979). Only in the second type of methodologies of aircraft conceptual design can some influence of avionics and onboard general systems on the overall aircraft design therefore be expected from the very beginning of the design process, as the WERs of the various systems allow defining some crucial parameters of the systems themselves, like the number of fuel tanks of the fuel system, the number of actuators of the flight control system, the electric power that has to be supplied by the energy sources, etc. Nevertheless, other considerations on the following issues are still missing:



After reviewing the various existing methodologies of aircraft conceptual design, the main characteristics of the new methodology can be highlighted. Referring in particular to the influence of avionics and onboard general systems on the conceptual design of the whole aircraft, the new methodology envisages taking the design of avionics and onboard general systems into account from the very beginning of the design process, through successive refinements and iterations that also affect the main aircraft characteristics. The new tool shall therefore not be structured at the level of the single systems (for example according to the ATA subdivision) but, for each system, at the level of its main equipment (i.e., for instance, in the avionic system: weather radar, AHRS, ADC, VOR, UHF radio, etc.; in the electrical system: generators, TRUs, inverters, batteries, etc.). Thanks to this approach, four main advantages can be envisaged that may lead to a better definition of the whole aircraft very early in the design process:


Focusing the attention on main equipment and estimating their weights and costs might lead to neglecting the contribution to the overall weight and cost of the remaining parts of the systems, i.e. small components such as lines, pipes, wires, installation devices, etc. However, the problem can be solved by means of a further estimate of these small components and/or by matching the weight and cost estimations at main equipment/component level with the results obtained by WERs and CERs at system level.
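As a purely illustrative sketch of this reconciliation idea (the WER coefficients and equipment weights below are invented, not taken from the chapter), the difference between a system-level WER result and the sum of the main-equipment weights can be attributed to the small components:

```python
# Hypothetical reconciliation of equipment-level weight estimates with a
# system-level Weight Estimation Relationship (WER). All numbers are invented.

def system_wer(mtgw_kg: float, coeff: float = 0.012, exponent: float = 1.05) -> float:
    """Notional WER: system weight as a power law of the aircraft MTGW."""
    return coeff * mtgw_kg ** exponent

def reconcile(main_equipment_kg: dict, mtgw_kg: float) -> dict:
    equipment_total = sum(main_equipment_kg.values())
    wer_total = system_wer(mtgw_kg)
    # Whatever the WER predicts beyond the main equipment is allocated to
    # small components (pipes, wires, brackets, installation devices, ...).
    small_components = max(wer_total - equipment_total, 0.0)
    return {"main equipment": equipment_total,
            "small components": small_components,
            "system total": equipment_total + small_components}

if __name__ == "__main__":
    avionics = {"weather radar": 9.0, "AHRS": 4.5, "ADC": 2.0, "VHF radio": 3.5}
    print(reconcile(avionics, mtgw_kg=60000.0))
```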

Before getting into the details of the logical steps that have to be taken to apply ASTRID methodology, a synthetic overview of the complete methodology is reported hereafter.

After a preliminary estimation of the aircraft global parameters, the main equipment of each system can be identified through, for example, functional analysis, keeping in mind the various possible alternative architectures and taking into account new emerging technologies. After the identification of the main equipment of each system and its interfaces, the inputs and outputs of each building block (i.e. main equipment item) can be identified. Specifically, for each building block it has to be established how many inputs/outputs there are and which parameters have to be considered as inputs/outputs. It is quite obvious that the aircraft data, the design constraints (including the constraints of regulations) and the performance requirements that characterize the equipment are among the considered parameters, which on a statistical basis then allow estimating the weight, volume, cost and any other possible feature of the equipment itself. Starting from the inputs/outputs of every main equipment item, the relationships that allow calculating the value of the outputs on the basis of the inputs can then be defined through statistical relationships.
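To make the idea of a building block with declared inputs and statistically estimated outputs more concrete, here is a minimal sketch; the power-law form and its coefficients are placeholders, not relationships given in the chapter:

```python
from dataclasses import dataclass

# Minimal sketch of a 'building block' (main equipment item): a driving
# performance requirement goes in, statistically estimated characteristics
# come out. The regression coefficients below are invented placeholders.

@dataclass
class Equipment:
    name: str
    performance: float     # driving performance requirement (e.g. range, power, data rate)
    k_weight: float        # coefficients of a notional power-law statistical fit
    e_weight: float

    def weight_kg(self) -> float:
        return self.k_weight * self.performance ** self.e_weight

    def volume_dm3(self, density_kg_per_dm3: float = 1.6) -> float:
        # crude assumption: volume follows weight through an average packaging density
        return self.weight_kg() / density_kg_per_dm3

if __name__ == "__main__":
    radar = Equipment("weather radar", performance=120.0, k_weight=0.35, e_weight=0.7)
    print(f"{radar.name}: {radar.weight_kg():.1f} kg, {radar.volume_dm3():.1f} dm3")
```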

Notwithstanding the integration between the various systems, each system has to be considered, at least initially, separately for the identification of its main equipment. It therefore appears obvious that, in this phase, a logical sequence with which the tool addresses the various systems has to be established. In order to avoid or minimize iterations, for instance, the systems requiring power supply have to be considered first and, later on, those generating power.
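One simple way to obtain such a sequence, sketched below under the assumption that each system only declares whether it consumes or generates power, is to sort the modules so that power users are processed before power generators; the system names are illustrative only:

```python
# Hypothetical ordering of the system modules: power users are sized first,
# so that the power-generating systems can be sized on the resulting budget.

systems = [
    {"name": "electrical system", "generates_power": True},
    {"name": "avionic system", "generates_power": False},
    {"name": "flight controls & landing gear", "generates_power": False},
    {"name": "hydraulic system", "generates_power": True},
    {"name": "environment control system", "generates_power": False},
]

design_sequence = sorted(systems, key=lambda s: s["generates_power"])
print([s["name"] for s in design_sequence])
```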

Once the complete set of relationships between inputs and outputs of each main equipment and their sequence, which constitute a mathematical model, has been established, the design process proceeds with the application of the iterative loops for the refinement of the aircraft sizing and performance estimation.

The output of the convergence of this iterative loop is an optimized aircraft with optimized avionics and on-board general systems architecture.
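A schematic view of such an iterative loop is sketched below; the update rules (a fixed systems-to-MTGW fraction and a simple mass-budget closure) are invented for illustration and are not the relationships used by ASTRID:

```python
# Toy fixed-point iteration: the onboard-systems weight depends on the aircraft
# MTGW, and the MTGW in turn depends on the systems weight. Numbers are invented.

def systems_weight(mtgw: float) -> float:
    return 0.18 * mtgw            # notional: systems are ~18% of MTGW

def mtgw_from_budgets(payload: float, fuel: float, structure_frac: float,
                      sys_weight: float) -> float:
    # MTGW = payload + fuel + structure + systems, with structure a fraction of MTGW
    return (payload + fuel + sys_weight) / (1.0 - structure_frac)

mtgw = 50000.0                    # initial guess [kg]
for iteration in range(100):
    w_sys = systems_weight(mtgw)
    new_mtgw = mtgw_from_budgets(payload=12000.0, fuel=15000.0,
                                 structure_frac=0.30, sys_weight=w_sys)
    if abs(new_mtgw - mtgw) < 1.0:   # convergence on the mass budget [kg]
        break
    mtgw = new_mtgw

print(f"converged MTGW of about {mtgw:.0f} kg after {iteration + 1} iterations")
```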

#### **3.2 ASTRID methodology**

The purpose of this section is to describe in an easy and straightforward way the various steps that have to be taken to apply the ASTRID methodology and the logical path that has to be followed to move from one step to the next.

Figure 8 shows the flowchart of the complete methodology.

The main goal of the methodology is to identify the best global configuration, in terms of architecture and system sizing, of avionics and onboard general systems for a defined airplane concept, which may be either already frozen or still under development. It is worth noting that the former case implies more constraints on the avionics and onboard systems design than the latter. Moreover, in the latter case the global aircraft design can still benefit from the data coming from the avionics and onboard systems design, in order to achieve a more accurate global conceptual design.

ASTRID is therefore a separate module that can however be integrated with the global aircraft concept definition thanks to specific building blocks dedicated to data exchange.

The methodology is characterized by the possibility of carrying out several designs of avionics and onboard general systems for the same aircraft concept, in order to then trade off the various designs and select the best ones. The methodology also allows addressing only some systems, in case the others have already been designed.
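A minimal weighted-sum trade-off between alternative system designs could be sketched as below; the criteria, weights and scores are purely illustrative and not values prescribed by the methodology:

```python
# Illustrative weighted-sum trade-off between alternative onboard-system designs.
# Criteria weights and scores (0-10, higher is better) are invented.

criteria_weights = {"mass": 0.4, "power": 0.3, "cost": 0.3}

alternatives = {
    "all-electric": {"mass": 8, "power": 6, "cost": 5},
    "more-electric": {"mass": 7, "power": 7, "cost": 7},
    "conventional": {"mass": 5, "power": 8, "cost": 8},
}

def score(design: dict) -> float:
    return sum(criteria_weights[c] * design[c] for c in criteria_weights)

best = max(alternatives, key=lambda name: score(alternatives[name]))
for name, design in alternatives.items():
    print(f"{name}: {score(design):.2f}")
print("selected:", best)
```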

The main expected result of every system module is the definition of the system architecture and the accomplishment of the system sizing at equipment level, with obvious advantages in terms of estimation of the aircraft mass and power budgets. For each system it is also possible, if requested, to study the system installation onboard the aircraft at equipment level, with clear advantages in terms of global aircraft mass properties and evaluation of the feasibility of the aircraft configuration itself.


Fig. 8. ASTRID methodology flow-chart

Taking now into account avionics and onboard general systems, Table 2 sums up the activities to perform and the tools/algorithms to apply, in order to accomplish the design of each system.


Table 2. ASTRID methodology: systems design


Considering each system separately, the following considerations need to be highlighted:

- Avionic system. The main activities of the conceptual design of the avionic system consist in identifying which and how many types of equipment will form the whole system. The design pursues the typical functional approach. The functional tree, which is one of the main outputs of the functional analysis, allows defining the basic functions that the avionic system shall be able to perform. Figure 9 illustrates an example of a functional tree (for the sake of simplicity this example refers to the block diagram of the avionic system shown in Table 1), where the top-level function "avionic system" has been decomposed into first-level functions, which identify the various subsystems of the avionic system. For the sake of simplicity only one of the first-level functions, "to control flight behaviours", has been further subdivided into lower-level functions. The so-called basic functions, i.e. those functions that cannot be split any further, are in this case functions that can be performed by equipment. Through the functions/equipment matrix (see the example in Figure 10) the basic functions are associated with equipment. Once the functions/equipment matrix is completed, all the equipment of the avionic system is known. Figure 10 illustrates the functions/equipment matrix related to the first-level function "to control flight behaviours" of Figure 9. On the basis of the performance requirements, either already available equipment can be identified or new (not yet existing) equipment can be preliminarily sized by statistically estimating its characteristics, like weight, volume and requested power per flight phase. Once the basic equipment has been identified, the links between the equipment items can be established through the connection matrix (see the example in Figure 11). Eventually, the avionic system architecture is presented in the functional/physical block diagram (see the example in Figure 12). A small illustrative sketch of these matrices is given after Figure 9.

Fig. 9. Avionic system design: functional tree
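The following sketch shows how a functions/equipment matrix can be turned into an equipment list and a connection matrix; the functions, equipment names and links are illustrative, not the actual content of Figures 10 and 11:

```python
# Illustrative functions/equipment matrix for (part of) an avionic system:
# each basic function is mapped to the equipment able to perform it.

functions_equipment = {
    "to measure air data":        ["ADC"],
    "to measure attitude":        ["AHRS"],
    "to compute control laws":    ["flight control computer"],
    "to display flight data":     ["ADI", "HSI"],
}

# The equipment list of the system is the union of all mapped equipment.
equipment = sorted({e for items in functions_equipment.values() for e in items})

# Connection matrix: which equipment exchanges data with which (invented links).
links = {("ADC", "flight control computer"),
         ("AHRS", "flight control computer"),
         ("flight control computer", "ADI")}
connection = [[1 if (a, b) in links or (b, a) in links else 0 for b in equipment]
              for a in equipment]

print(equipment)
for row in connection:
    print(row)
```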

- Flight controls & landing gear system. Even though the flight controls and the landing gear are separate systems, here we address them together, as the main issue of their conceptual design is in both cases the definition and sizing of the various actuators (i.e. of their main (basic) equipment), thus leading to the system mass and power budgets. The main activities of the conceptual design of the flight controls & landing gear system consist in defining the architecture of the flight control surface and landing gear actuators and then sizing the actuators themselves. Figure 13 schematically illustrates the algorithm applied for optimal actuator sizing (a small sizing sketch is also given after Figure 13). In particular, Figure 13 focuses first on hydraulic cylinders (linear actuators) and hydraulic motors (rotary actuators). Considering the linear hydraulic actuator, the graph that depicts force vs. velocity allows us to achieve optimal sizing. The same graph can easily be translated into torque vs. angular speed, which is valid both for the hydraulic rotary actuator and for the electric rotary actuator (under the hypothesis of the presence of a current limiter), as Figure 13 shows. After completing the actuator sizing activity, it is fundamental to understand when the various actuators will work, i.e. during which flight phases (it is worth remembering, for instance, that primary flight control actuators generally work continuously throughout all flight phases, while secondary flight control and landing gear actuators work only during certain flight phases). Eventually, considering the power consumption of each actuator and the flight phases during which each actuator is supposed to work, the electric load diagrams and thus the electric power budget can be generated.


Fig. 10. Avionic system design: functions/equipment matrix



- Furnishing system. This system is made up of various items of equipment that may strongly affect the whole aircraft, in terms of mass and power required, especially in the case of a civil transport aircraft. The main activities of the conceptual design of the furnishing system consist in identifying which and how many types of equipment will form the whole system, establishing their location and estimating their mass and power consumption. The estimates of mass and power consumption, based on the state-of-the-art technology available for the envisaged equipment, may have, respectively, a serious impact on the global aircraft concept and on the sizing of the onboard power system.
- Environment control system. The main activities of the conceptual design of the environment control system consist in preliminarily estimating the thermal load, q, between the cabin and the external environment, and then the required air mass flow, mcond, to keep the temperature of the cabin within an acceptable range of values (usually between 18°C and 25°C). After estimating the thermal load, the required air mass flow can be computed, depending on the desired temperature inside the cabin, Tcab, and on different operative scenarios, which range between the so-called cold case (case A in Table 2), when maximum heating is required, and the so-called hot case (case P in Table 2), when maximum cooling is required. The air mass flow, which can be provided at different temperatures (for instance, at high temperature, TI HOT, in cold cases or at low temperature, TI COLD, in hot cases), can in fact be computed by matching the two equations which express the thermal load (qA or qP in Table 2 and in Figure 14) and the heat load provided by the system (q in Figure 14) to maintain the desired temperature inside the cabin (a small illustrative sketch is given after Figure 14).

Fig. 11. Avionic system design: connection matrix

Fig. 12. Avionic system design: functional/physical block diagram

Fig. 13. Flight controls & Landing Gear system design: actuator sizing
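Referring to the actuator-sizing discussion above, the sketch below assumes, purely for illustration, a linear force-speed characteristic F(V) = F0(1 − V/VV) and finds the smallest stall force that covers a set of required operating points; the characteristic shape, margin and numbers are assumptions, not the chapter's algorithm:

```python
# Illustrative actuator sizing: find the smallest stall force F0 such that a
# linear force-speed characteristic F(V) = F0 * (1 - V / VV) covers all the
# required operating points.

def required_stall_force(operating_points, no_load_rate, margin=1.1):
    """operating_points: iterable of (force [N], velocity [m/s]) demands."""
    f0 = 0.0
    for force, velocity in operating_points:
        if velocity >= no_load_rate:
            raise ValueError("required velocity exceeds the no-load rate VV")
        f0 = max(f0, force / (1.0 - velocity / no_load_rate))
    return margin * f0

if __name__ == "__main__":
    # invented aileron-actuator demands: (hinge force, surface rate)
    points = [(9000.0, 0.05), (4000.0, 0.12), (1500.0, 0.18)]
    print(f"F0 >= {required_stall_force(points, no_load_rate=0.20):.0f} N")
```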

Fig. 14. Environmental Control System design: mcond estimation
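Relating to the mcond estimation described for the environment control system, a minimal sketch of the heat balance m_cond · cp · (Ti − Tcab) = q is given below; the sign convention, the cp value and the numerical cases are assumptions, not data from the chapter:

```python
# Illustrative estimate of the ECS air mass flow m_cond: the heat delivered by
# the conditioning air, m_cond * cp * (Ti - Tcab), must balance the cabin
# thermal load q.

CP_AIR = 1005.0  # J/(kg K), specific heat of air at constant pressure

def m_cond(thermal_load_w: float, t_supply_c: float, t_cabin_c: float) -> float:
    """Air mass flow [kg/s] needed to offset a cabin thermal load [W].
    thermal_load_w > 0 means heat must be added (cold case),
    thermal_load_w < 0 means heat must be removed (hot case)."""
    delta_t = t_supply_c - t_cabin_c
    if delta_t == 0:
        raise ValueError("supply and cabin temperatures must differ")
    return thermal_load_w / (CP_AIR * delta_t)

if __name__ == "__main__":
    # cold case: 25 kW of heating with hot supply air at 70 C, cabin at 21 C
    print(f"cold case: {m_cond(25_000.0, 70.0, 21.0):.2f} kg/s")
    # hot case: 30 kW to be removed with cold supply air at 5 C, cabin at 24 C
    print(f"hot case:  {m_cond(-30_000.0, 5.0, 24.0):.2f} kg/s")
```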



- Anti-ice system. The main results of the conceptual design of the anti-ice system consist in evaluating the surface of the aircraft that has to be protected to avoid ice formation (see SP, protected surface, in Table 2) and in estimating the power required to protect that surface. It is worth noting that the power required may be either pneumatic power (i.e. air mass flow, mAI in Table 2) or electric power (PAI in Table 2). Apart from wide aircraft surfaces, small zones also have to be taken into account in terms of the electric power required for anti-icing (see PSZ, power required for small zones, in Table 2).
- Fuel system. Once the aircraft architecture has been defined, the items of equipment of the fuel system that have the largest impact on the whole aircraft, i.e. the fuel tanks, have usually already been determined in terms of number, capacity and location onboard the aircraft. However, the fuel feed and pressure refuelling pipes and the fuel booster pumps, used to boost the fuel flow from the aircraft fuel system to the engine, still have to be identified as the main results of the conceptual design of the fuel system (see Table 2). As the fuel booster pumps are usually electrically driven, it is clear that their power consumption represents another important input to the whole aircraft power budget. The definition of the fuel booster pumps is based on their characteristic graphs (p = p(Q)) that depict pressure, p, as a function of fuel flow, Q (Figure 15), taking into account the requirements of maximum and minimum engine interface pressure (respectively pmax\_engine and pmin\_engine in Figure 15), as well as the pressure losses along the fuel pipes, and the requirements of maximum and minimum fuel flow (respectively Qmax and Qmin in Figure 15). A small illustrative check of a pump against these requirements is sketched after Figure 15.

Fig. 15. Fuel system design: fuel booster pump definition
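The sketch below checks a candidate booster pump against the engine interface requirements over the flow range; the pump characteristic, line-loss law and all numbers are invented for illustration:

```python
# Illustrative check of a fuel booster pump: over the whole flow range the
# delivered pressure, i.e. the pump characteristic p(Q) minus the line losses,
# must stay between p_min_engine and p_max_engine.

def pump_pressure(q: float) -> float:          # kPa, notional quadratic characteristic
    return 180.0 - 40.0 * q**2

def line_losses(q: float) -> float:            # kPa, notional quadratic pipe losses
    return 15.0 * q**2

def pump_is_acceptable(q_min: float, q_max: float,
                       p_min_engine: float, p_max_engine: float,
                       samples: int = 50) -> bool:
    for i in range(samples + 1):
        q = q_min + (q_max - q_min) * i / samples
        p_delivered = pump_pressure(q) - line_losses(q)
        if not (p_min_engine <= p_delivered <= p_max_engine):
            return False
    return True

if __name__ == "__main__":
    print(pump_is_acceptable(q_min=0.2, q_max=1.5,
                             p_min_engine=50.0, p_max_engine=200.0))
```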


Fig. 16. Hydraulic system design: hydraulic pumps load diagram

- Electric system. It is worth underlining the importance of developing different solutions in terms of users and, therefore, of generators, in order to be able to compare these alternatives and select the best one. It has to be remembered that different forms of electric power imply different weights and costs. Among the various available solutions, the most common trend is to select the option that generates the type of power that feeds the greatest number of users, in order to minimize the need for power conversion. As reported in Table 2, various solutions are available. The most common ones are listed hereafter:
i. 115 VAC 400 Hz generation and conversion of part of the electric power to 28 VDC;
ii. 115 VAC wide frequency generation and conversion of part of the electric power to 115 VAC 400 Hz, 270 VDC and 28 VDC;
iii. 270 VDC generation and conversion of part of the electric power to 115 VAC 400 Hz and 28 VDC;
iv. 230 VAC wide frequency generation and conversion of part of the electric power to 115 VAC 400 Hz, 270 VDC and 28 VDC.


While the first case is today the most common solution, the other three options represent future trends, characterized by new forms of electric power to satisfy the ever-increasing demand for electric power onboard the aircraft. Figure 17, Figure 18 and Figure 19 show possible solutions for the electrical system architecture and the relative load diagrams of these new trends. It is worth noting that in the load diagrams the different forms of electric power, which have been converted from the main generation, are considered as users that contribute to the global electric power required. The generator is sized on the basis of the global electric power required during the various mission phases, while all power conversion units are sized on the basis of the amount of electric power they are requested to supply. As reported in Table 2, especially in the case of the "All Electric" philosophy, the capability of every engine-driven generator to perform engine starting (thanks to the reversibility of electric machines) shall be verified. When accomplishing the task of engine starting, the generator is powered by another generator driven by the APU, which, being in its turn a gas turbine engine of relatively small dimensions, can easily be started by a traditional 28 VDC electric starter fed by the battery. Eventually, the most appropriate battery has to be selected (see Figure 20), in order to be able to perform APU starting and to feed the vital users (according to regulations), even though for a limited time.
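A minimal sketch of such an electric load diagram is given below; the mission phases, load figures and conversion efficiencies are invented for illustration:

```python
# Illustrative electric load diagram: per mission phase, the converted power
# forms (28 VDC, 270 VDC, ...) are counted as users of the main generation, and
# the generator is sized on the peak total demand. All figures are invented.

phases = ["taxi", "take-off", "climb", "cruise", "landing"]

loads_kw = {                      # demand per power form and phase [kW]
    "115 VAC users": [10, 25, 22, 20, 24],
    "28 VDC users":  [ 6,  9,  8,  8,  9],
    "270 VDC users": [ 0, 30, 25, 12, 28],
}

conversion_efficiency = {"115 VAC users": 1.00,   # fed directly by the generator
                         "28 VDC users":  0.85,   # through TRUs (assumed efficiency)
                         "270 VDC users": 0.90}   # through rectifier units (assumed)

total_per_phase = [
    sum(loads_kw[form][i] / conversion_efficiency[form] for form in loads_kw)
    for i in range(len(phases))
]

for phase, p in zip(phases, total_per_phase):
    print(f"{phase:9s}: {p:5.1f} kW at the generator")
print(f"generator sizing load: {max(total_per_phase):.1f} kW")
```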


Fig. 17. Electrical System: 230 VAC wide frequency generation and example of load diagram

Fig. 18. Electric System: 270 VDC generation and example of load diagram

Fig. 19. Electric System: 115 VAC wide frequency generation and example of load diagram

Fig. 20. Battery selection

#### **4. Conclusions**

After an overview of the airplane system conceptual design, the chapter has focused on an innovative conceptual design methodology, ASTRID, which allows assessing and preliminarily sizing avionics and onboard general systems very early in the design process. The advantage is a better definition of the whole aircraft, in terms of weight, mass properties, power budget and consequently cost, already during the conceptual design phase. A better quality of the design of new aircraft is likely to greatly improve the development of future aircraft. The proposed innovative methodology can contribute to the achievement of this goal at limited cost.

#### **5. Acronyms**

AC = Aircraft
ADC = Air Data Computer
ADF = Automatic Direction Finder
ADI = Attitude Director Indicator
AHRS = Attitude Heading Reference System
APU = Auxiliary Power Unit
ASTRID = Aircraft on board Systems Sizing And Trade-Off Analysis in Initial Design phase
CAU = Cold Air Unit
CERs = Cost Estimation Relationships
DME = Distance Measuring Equipment
ECS = Environment Control System
GPS = Global Positioning System
HSI = Horizontal Situation Indicator
IFEC = In-Flight Entertainment and Connectivity
ILS = Instrument Landing System
LG = Landing Gear
LRU = Line Replaceable Unit
MDO = Multi Disciplinary Optimization
MLG = Main Landing Gear
MTGW = Maximum Takeoff Gross Weight
NLG = Nose Landing Gear
SYS = System
TOGW = Takeoff Gross Weight
UHF = Ultra High Frequency
VAC = Voltage Alternating Current
VDC = Voltage Direct Current
VHF = Very High Frequency
VOR = VHF Omnidirectional Range
WERs = Weight Estimation Relationships

#### **6. Nomenclature**


b = wingspan
CD0 = parasite drag coefficient
CLMAX = maximum lift coefficient
Cp = specific heat (constant pressure)
CRESER = reservoir capacity
F = force
F0 = maximum static force (stall force)
K = constant
l = fuselage length
lLAN = landing distance
lTO = takeoff distance
M = moment
M0 = maximum static moment (stall moment)
mAI = anti-ice system air mass flow rate
mBLEED A = air mass flow rate bled from the engine or APU compressor, or from a dedicated compressor, in case of maximum heating request
mBLEED P = air mass flow rate bled from the engine or APU compressor, or from a dedicated compressor, in case of maximum cooling request
mcondA = air mass flow rate supplied by the ECS in case of maximum heating request
mcondP = air mass flow rate supplied by the ECS in case of maximum cooling request
PAI = electrical power required by the anti-ice system
pcab = cabin air pressure
pext = external air pressure
ph = pressure of the pneumatic power generation output air
pi = pressure of the CAU output air
pmax\_engine = maximum engine interface pressure
pmin\_engine = minimum engine interface pressure
PSZ = electric power required for small zones to avoid ice formation
q = cabin thermal load
Q = fluid volumetric flow rate
qA = q in case of maximum heating request
QAVAILABLE = available hydraulic mass flow rate
qP = q in case of maximum cooling request
QREQUESTED = required hydraulic mass flow rate
Sp = protected surface (against ice formation)
Sw = wing surface
T = thrust
Tcab = cabin air temperature
Text = external air temperature
Th = air temperature of the pneumatic power generation
Ti = output air temperature of the CAU
TI COLD = temperature of the air supplied by the ECS in case of maximum cooling request
TI HOT = temperature of the air supplied by the ECS in case of maximum heating request
tmiss = mission time
V∞ = unperturbed air velocity
Vmax = airplane maximum speed
VV = no-load rate

#### **7. References**

Antona, E., Chiesa, S., Corpino, S. & Viola, N. (2009). L'avamprogetto dei velivoli, *Memorie della Accademia delle Scienze di Torino, Classe di Scienze Fisiche Matematiche e Naturali*, pp. 65-115

Beltramo, M.N., Morris, M.A. & Anderson, J.L. (1979). Application of parametric weight and cost estimating relationships to future transport aircraft, *Proceedings of the 38th Annual Conference of the SAWE*, New York, U.S.A., May 7-10, 1979

Chiesa, S. (2007). *Affidabilità, sicurezza e manutenzione nel progetto dei sistemi*, CLUT, ISBN 9788879922647, Turin, Italy

Chiesa, S., Maggiore, P., Corpino, S. & Pasquino, M. (2000). The Weight Estimation in Aerospace Engineering at the Polytechnic of Turin, *Proceedings of SAWE Europe Conference*, Bremen, Germany, November 8-9, 2000

Raymer, D.P. (1999). *Aircraft Design: A Conceptual Approach* (third edition), AIAA (American Institute of Aeronautics and Astronautics) Education Series, ISBN 1-56347-281-0, Reston, Virginia, USA

Roskam, J. (1990). *Airplane Design*, Vol. 1 to 7, DARcorporation, ISBN 1-884885-24-1, Lawrence, Kansas, USA

Staton, R.N. (1972). Weight estimation methods, *SAWE Journal*, Vol. 31, April-May 1972


### **Complex-Systems Design Methodology for Systems-Engineering Collaborative Environment**

Guido Ridolfi¹,², Erwin Mooij² and Sabrina Corpino¹
¹*Politecnico di Torino, Italy*
²*Delft University of Technology, The Netherlands*

#### **1. Introduction**


In the last decades man-made systems have gained in overall complexity and have become more articulated than before. From an engineering point of view, a complex system may be defined as one in which there are multiple interactions between many different elements of the system and many different disciplines concurring in its definition. However, the complexity seen from the system perspective is only partial. In more general terms, complexity does not only regard the system *per se*, but is also related to the whole life-cycle management of the system. This encompasses all the activities needed to support the program development, from the requirements definition to the verification, validation and operation of the system, in the presence of a large number of different stakeholders. These two interrelated views of complexity, *bottom-up* in the first case and *top-down* in the second, both converge on the system, defined as an entity formed by a set of interdependent elements that together perform one or more functions defined by requirements and specifications.

Systems Engineering processes have been increasingly adopted and implemented by enterprise environments to face this increased complexity. The purpose is to pursue time and cost reduction by a parallelization of processes and activities, while at the same time maintaining high-quality standards. From the life-cycle management point of view, the tendency has been to rely more and more on software tools to formally apply modeling techniques in support of all the activities involved in the system life-cycle, from the beginning to the end. The transition from document-centric to model-centric systems engineering allows for an efficient management of the information flow across space and time, by delivering the right information, in the right place, at the right time, to the right people working in geographically-distributed multi-disciplinary teams. This standardized implementation of model-centric systems engineering, using virtual systems modeling standards, is usually called Model-Based Systems Engineering (MBSE).

On the other hand, looking at the problem from the perspective of the system as a product, the management of complexity is also experiencing a radical modification. The formerly adopted approach of sequential design with separate discipline activities is now being replaced by a more integrated approach. In the Aerospace-Engineering domain, for instance, designing with highly integrated mathematical models has become the norm. Already from the preliminary design of a new system, all its elements and the disciplines involved over the entire life-cycle are taken into account, with the objective of reducing risks and costs, and possibly optimizing the performance.

When the *right people* all work as a team in a multi-disciplinary collaborative environment, MBSE and Concurrent Engineering finally converge on the definition of the system. The main concern of the engineering activities involved in system design is to predict the behavior of the physical phenomena typical of the system of interest. The development and utilization of mathematical models able to reproduce the future behavior of the system, based on inputs, boundary conditions and constraints, is of paramount importance for these design activities. The basic idea is that before those decisions that are hard to undo are made, the alternatives should be carefully assessed and discussed. Despite the favorable environment created by MBSE and Concurrent Engineering for the discipline experts to work, discuss and share knowledge, a certain lack of engineering-tool interoperability and standardized design methodologies has so far been a significant inhibitor (International Council on Systems Engineering [INCOSE], 2007). The system mathematical models usually implemented in collaborative environments provide exceptional engineering-data exchange between experts, but often fall short of providing structured and common design approaches involving all the disciplines at the same time. In most cases the various stakeholders have full authority on design issues belonging to their own domain only. The interfaces are usually determined by the experts and manually fed to the integrated models. We believe that the enormous effort made to conceive, implement, and operate MBSE and Concurrent Engineering could be consolidated and brought to a more fundamental level if the more common analytical design methods and tools could also be exploited concurrently. Design-space exploration and optimization, uncertainty and sensitivity analysis, and trade-off analysis are certainly design activities that are common to all the disciplines, consistently implemented for design purposes at the discipline-domain level. Bringing fundamental analysis techniques from the discipline-domain level to the system-domain level, to exploit interactions and synergies and to enable an efficient trade-off management, is the central topic discussed in this chapter. The methodologies presented here are designed for implementation in collaborative environments, to support the engineering team and the *decision-makers* in the activity of *exploring* the design space of complex-system models, which are typically long-running.

In Section 2 some basic definitions, terminology, and design settings of the class of problems of interest are discussed. In Section 3 a test case of an Earth-observation satellite mission is introduced. This satellite mission is used throughout the chapter to show the implementation of the methods step by step. Sampling the design space is the first design activity, discussed in Section 4. Then, in Section 5 and Section 6, a general approach to compute sensitivity and to support the engineering team and decision-makers with standard visualization tools is discussed. In Section 7 we provide an overview of the utilization of a unified sampling method for uncertainty and robustness analysis. Finally, we conclude the chapter by providing some recommendations and additional thoughts in Section 8.

#### **2. Basic definitions**

The discussion and the methodologies presented in this chapter are based on the assumption that the activity of designing a complex system is performed by a team of designers (the engineering team), using **mathematical models** to determine the physical and functional characteristics of the system itself. A mathematical model is a set of relationships, i.e.,
equations, providing figures-of-merit on the **performance(s)** of the system to the engineering team when certain **inputs** are provided. The inputs are represented by the **design variables**, i.e., factors that are responsible for influencing the performance(s) of the system. For this reason, the design variables will also be called **design factors**, or more generally inputs, or simply variables. The domain of existence of the design variables forms the **design space**, where they can assume certain **values** between a minimum and a maximum. The **design-variable range** determined by the minimum and the maximum can of course only be as large as the domain of existence of the variable. Minima and maxima for the design variables are usually set by the engineering team to limit the analysis to a specific region of the design space or to avoid infeasible conditions. For instance, the design range of the eccentricity, *e*, of a closed orbit about the Earth should not exceed the interval 0 ≤ *e* < 1. In the upper-left Cartesian diagram of Fig. 1 a hypothetical design space, formed by two variables, is shown. The limits of the variable ranges are represented by the dash-dotted lines. The subspace of the design space determined by all the design-variable ranges is addressed as the **design region** of interest, and it is represented by the rectangle formed by the dash-dotted lines and the vertical axis of the Cartesian diagram.

Fig. 1. Schematic representation of the design space and the objective space of the model.

Design variables can be **continuous** or **discrete**. A continuous variable can assume all the values between the minimum and the maximum. A discrete variable, instead, can assume only a few specific values in the design-variable range. In this case the values are called **levels**. Discrete variables can be further distinguished into two classes, namely **ordinal** or **categorical**. The *length* of a solar array on a satellite system, for instance, is a continuous variable. It can assume, in principle, any value between a minimum and a maximum set to limit the weight or to provide a minimum performance under certain circumstances. The *number of cells* used to build the array is an ordinal variable. It can only assume the levels represented by the natural numbers, and certain characteristics increase (decrease) when the number of cells increases (decreases), e.g., the total mass. The *type of solar cell*, instead, is a categorical variable. This means that it can only assume certain levels (e.g. *type#1*, *type#2*, and so on), but in this case the order is not important. It is not always the case that, for instance, the *efficiency* of the solar cells increases going from the first type to the second type and so on. It may simply depend on the order in which they appear in a database, which may be an arbitrary choice of the engineering team. The model of the system may also be subject to other sources of variability, representing the non-deterministically known parameters typical of the operating environment of the system. The residual atmospheric density on orbit, the solar radiation, and the orbit injection errors, just to mention a few, are factors that may not be directly controlled at the design stage and therefore must be taken into account in a statistical sense. These factors are called **uncontrollable**.
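One convenient way to capture the distinction between continuous, ordinal and categorical variables in a design tool is a small data structure. The sketch below is only an illustration, reusing the solar-array example above; the class and field names are ours and do not come from the chapter.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

@dataclass
class DesignVariable:
    name: str
    kind: str                          # "continuous", "ordinal" or "categorical"
    bounds: Optional[tuple] = None     # (min, max) for continuous variables
    levels: Optional[Sequence] = None  # admissible levels for discrete variables

array_length = DesignVariable("solar_array_length", "continuous", bounds=(1.0, 4.0))
n_cells = DesignVariable("number_of_cells", "ordinal", levels=range(100, 501))
cell_type = DesignVariable("solar_cell_type", "categorical", levels=["type#1", "type#2"])
```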

One of the main tasks of the engineering team during the **design process** of the system is to set the values and/or the levels of the design variables in such a way that the performance(s) of the system assume a certain optimal level under *certain circumstances* (**optimal design**), and/or such that the final system is insensitive to variations of the uncontrollable factors (**robust design**). The performance(s) of interest are called **objective(s)** of the analysis. The space in which the objectives can be represented, i.e., the domain of the images of the mathematical equations of the model, is called the **objective space**. Thus, the model is responsible for relating points in the design space with points in the objective space. The term *certain circumstances* is used to indicate the constraints and boundary conditions of the analysis. As already mentioned, the **boundary conditions** are represented by the design-variable ranges, the dash-dotted lines of Fig. 1. The **constraints**, instead, are determined by an infeasible condition in the objective space, e.g., the mass of the satellite exceeding the mass that the launcher is able to deliver into a given orbit. Further, the constraints can also be determined by infeasible conditions in the design space, when certain combinations of the values or levels of the design variables are not allowed. This may happen, for instance, with the eccentricity and the semi-major axis of an Earth-orbiting satellite. Their combined values must ensure that the perigee radius of the orbit is larger than the radius of the Earth. Constraints may be linear or non-linear, continuous or discrete. The dashed lines in Fig. 1 represent the constraints in the design space (non-linear in this case), and in the objective space (linear in this case). The thick dots in Fig. 1 represent the **design points**. In the design space they are a representation of the values of the design variables, while in the objective space they represent the corresponding set of output values. Considering a deterministic model, there is a one-to-one correspondence between one point in the design space and one point in the objective space. However, the engineering team must make sure to provide design points that do not violate constraints in the design space. For instance, an orbit with a semi-major axis of 7000 km and an eccentricity of 0.7 would lead to a negative value of the satellite altitude at perigee (i.e., a non-existing orbit), making it impossible to compute relevant parameters such as, for instance, the *time-in-view* at perigee passage over a specific region on Earth. Therefore, in Fig. 1 the design point C does not have a corresponding image in the objective space. In this case, the semi-major axis and the eccentricity are classified as correlated inputs.
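To make the constraint check above concrete, here is a minimal Python sketch (the helper names are ours, not the chapter's) that rejects semi-major-axis/eccentricity pairs whose perigee falls below the Earth surface, reproducing the 7000 km / 0.7 example:

```python
R_EARTH = 6378.0   # mean equatorial Earth radius [km]

def perigee_altitude(a_km, e):
    """Perigee altitude [km] for semi-major axis a_km and eccentricity e."""
    return a_km * (1.0 - e) - R_EARTH

def is_feasible(a_km, e):
    """Reject design points whose perigee falls below the Earth surface."""
    return 0.0 <= e < 1.0 and perigee_altitude(a_km, e) > 0.0

print(perigee_altitude(7000.0, 0.7))   # -4278 km: the perigee is below the Earth surface
print(is_feasible(7000.0, 0.7))        # False: such a point has no image in the objective space
```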

The problem of developing and implementing the mathematical model of a complex system is beyond the scope of this chapter. However, a brief discussion on the type of modelization approach is beneficial for a better understanding of the discussed design methodologies. The development of a mathematical model is tackled considering two main sub-problems, namely *problem decomposition*, (Sobieszczanski-Sobieski, 1989), and *problem*
*formulation*, (Cramer et al., 1993; Tedford & Martins, 2006). In the literature, authors propose several model-decomposition techniques. However, two main classes may be identified, namely *Hierarchical Decomposition* and *Non-Hierarchical Decomposition* methods, (Sobieszczanski-Sobieski & Haftka, 1995). Non-Hierarchical Decomposition methods (NHD) are advised when there is no clear separation between two or more elements/disciplines, i.e. when the coupling between them is not negligible *a priori*. The formulation of the complex-system design problem is related to the allocation of the resources to the various elements of the architecture. Single- and multiple-level formulations are discussed in the literature, (Cramer et al., 1993; Tedford & Martins, 2006; Yi et al., 2008). The former must be executed on a single machine, the latter, instead, allows for more flexibility in allocating the computational resources. The mathematical models of a collaborative environment are most likely developed using a NHD approach, because it is the most general one, and with a multi-level architecture because resources are usually geographically distributed. An example of the multi-level architecture of a complex-system design problem is presented in Fig. 2. It represents the architecture most likely adopted in a collaborative environment with team-members responsible for element analysis and others responsible for system analysis. The thick lines represent input/output interfaces.

Fig. 2. Schematic of the Collaborative Bi-Level (COBiL) formulation for complex systems models.
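As a purely illustrative sketch of such a bi-level arrangement, the snippet below mimics a system-level routine that aggregates the outputs of independent element-level models; all names and the simple relations are hypothetical, and in a collaborative environment each element function could run on a different machine.

```python
def power_element(design):
    """Element-level model: power-subsystem mass [kg] (illustrative relation)."""
    return design["power_w"] / 100.0

def payload_element(design):
    """Element-level model: payload mass [kg] (illustrative relation)."""
    return 50.0 * design["aperture_m"] ** 2

def system_level(design):
    """System-level analysis: collect the element outputs into a system figure of merit."""
    masses = [power_element(design), payload_element(design)]
    return {"dry_mass_kg": sum(masses)}

print(system_level({"power_w": 1500.0, "aperture_m": 0.7}))
```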

#### **3. Design case: Earth-observing satellite for natural disaster and land-use monitoring**

Earth-observation satellites can observe wide areas rather quickly. Their observation data, combined with information obtained by aircraft and helicopters, are expected to be useful for regular disaster-condition assessment. This would make rescue operations more effective, allow topographical information reflecting the latest land-usage changes to be extracted, and help identify disaster risks.

In this chapter, the preliminary design of an Earth-observation mission to support the world-wide disaster management process and land-usage monitoring is deployed and discussed to show a practical implementation of the proposed design approaches. The following mission statement is considered as driver for the design process:

*Design an Earth observation mission to provide world-wide disaster-management capabilities, over a period of 7 years*

The limited available space, and at the same time the willingness to effectively convey the message of this chapter, led us to make several assumptions to bound the analysis presented here. A satellite system with an optical payload (staring sensor) is considered. The main purpose is to achieve a compromise between the design variables in such a way as to obtain the best possible image resolution, at minimum cost. The satellite shall revisit the same area on the Earth surface within 24 hours, and shall be able to send the acquired data back, in real time, to any equipped ground station (the reference ground station is considered to have a 1 m aperture antenna diameter) with a link margin of at least 4 dB. The selected launcher is of the class of the *Delta II 6920/25*, with a maximum payload on polar orbit of 2950 kg. A highly inclined, circular orbit has been selected, with *i* = 98°. The main mission geometry parameters and a few of the equations implemented for computing the coverage and the resolution are presented in Fig. 3.

In Table 1 the design variables taken into account in the analysis, and their intervals or levels (in case of discrete variables) are summarized.

Fig. 3. Satellite mission geometry. Equations adapted from (Wertz & Larson, 1999).

| Design Variables | Code | Min | Max | Levels |
|------------------|------|-----|-----|--------|
| Number of days (rep. ground track) | A | 1 | 3 | 3 |
| Number of orbits (rep. ground track)<sup>*a*</sup> | B | 1 | 3 | 3 |
| Instrument aperture diameter [m] | C | 0.3 | 1 | − |
| Min. elevation angle [deg] | D | 5 | 50 | − |
| Max. slew angle [deg] | E | 0 | 50 | − |
| Min. maneuver time [s] | F | 60 | 180 | − |
| Number of slew maneuvers [-] | G | 10k | 30k | − |
| Transmitting output RF power [W] | H | 5 | 30 | − |
| Antenna diameter [m] | I | 0.1 | 1 | − |
| Type of solar array [-] | J | 1 | 2 | 2 |
| Type of thrusters [-] | K | 1 | 2 | 2 |
| Payload heritage [-] | L | 1 | 2 | 2 |

Table 1. Settings of the design variables. <sup>*a*</sup> When *A* = 1, *B* = 13, 14 or 15. When *A* = 2, *B* = 28, 29 or 30. When *A* = 3, *B* = 43, 44 or 45.

The mathematical model of the satellite system is composed of all its main subsystems (i.e., payload, Attitude Dynamics and Control System (ADCS), communication system,
power and avionics system, propulsion system, structure and thermal-control system) and a ground control station model. The cost is computed using the *Unmanned Spacecraft Cost Model* presented by Wertz & Larson (1999). Cost is mostly related to the mass and power consumption of the satellite, the type of technology used (e.g., type of payload or type of attitude control), and the technology heritage of its components (the higher the heritage, the cheaper). From the database, two types of solar arrays and two types of thrusters are taken into account. The two types of solar arrays have an efficiency, *η*, of 0.14 and 0.2, and a power density of 115 [W/kg] and 100 [W/kg], respectively. The two thrusters are the *STAR48A* and the *IUS-SRM2*, with a specific impulse of 250 [s] and 300 [s], (Wertz & Larson, 1999), and a percentage of inert mass with respect to the propellant of 0.13 and 0.21, respectively. The two levels of payload heritage foresee a design *adapted* from an existing one and a *new* design, respectively. The *new* design is more expensive, but allows for a better management of the acquired data on board, i.e., a reduced data rate. The results of the analysis are discussed in the following sections, for every design step and for every type of design methodology presented.
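As a minimal illustration of how such database entries feed the sizing, the sketch below compares the two solar-array types for a hypothetical power demand; the efficiencies and power densities are the ones quoted above, while the solar constant and the 1.5 kW requirement are assumptions made only for this example.

```python
SOLAR_FLUX = 1367.0   # solar constant [W/m^2], assumed for the example

# (efficiency [-], power density [W/kg]) as quoted in the text
ARRAY_TYPES = {"type#1": (0.14, 115.0), "type#2": (0.20, 100.0)}

def size_array(p_required_w, array_type):
    """Return (area [m^2], mass [kg]) of an array delivering p_required_w at normal incidence."""
    eta, specific_power = ARRAY_TYPES[array_type]
    area = p_required_w / (eta * SOLAR_FLUX)
    mass = p_required_w / specific_power
    return area, mass

for name in ARRAY_TYPES:
    area, mass = size_array(1500.0, name)   # hypothetical 1.5 kW demand
    print(f"{name}: {area:.1f} m^2, {mass:.1f} kg")
```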


#### **4. Sampling the design space**

Sampling the design space is the first step necessary when the mathematical model of a system needs to be studied. A sample is a set of points in the design region (a *k*-dimensional hyperspace) whose coordinates are the values of the design variables taken from their variability ranges, (*x*<sub>1</sub>, *x*<sub>2</sub>, ··· , *x<sub>k</sub>*), in their marginal (for independent inputs) or joint (for correlated/coupled inputs) distribution, see the black dots in Fig. 1.

The simplest, and possibly most straightforward, approach to sampling is to generate a sequence of random points in the design region, as shown in Figure 4(a). Latin Hypercube Sampling (LHS), developed by McKay et al. (1979), is an alternative method that can be seen as a subclass of the stratified-sampling class. The LHS provides full stratification of the design region, thus increased design-space coverage characteristics compared to generic stratified sampling and random sampling, see Figure 4(b). However, good space-filling characteristics are not always guaranteed, in the sense that points in the design space may still form separate and disordered bunches. Viana et al. (2010) propose an algorithm for near-optimal Latin hypercube designs (i.e., maximizing the distance between the samples) without using formal optimization, see Figure 4(c). This method provides results with a negligible computational effort if the number of design variables *k* is not too large. According to our experience with this algorithm, it requires the generation of matrices with at least 2*<sup>k</sup>* elements, irrespective of the number of samples actually required. The number of matrix entries to be stored to compute the near-optimal LHS can become cumbersome already for 20 variables. The Sobol *LP<sup>τ</sup>* sequence, Sobol (1979), is a quasi-random sampling technique that provides *low-discrepancy* sample points. Here discrepancy indicates a measure of *non-uniformity* and proximity between the samples. In Bratley & Fox (1988) and Press et al. (2007) there are useful indications on how a Sobol *LP<sup>τ</sup>* sequence, or its variant proposed by Antonov & Saleev (1979), can be computed. The (modified) Sobol *LP<sup>τ</sup>* sequence has the particular characteristic of providing a sequence of points for which successive points at any stage *know* how to fill in the gaps in the previously generated distribution, Press et al. (2007), see Figure 4(d). This aspect is particularly useful for the re-utilization of previous sample points when additional points shall be sampled to improve the quality of the results, as will be demonstrated later in the case of regression analysis. The modified Sobol *LP<sup>τ</sup>* sequence places the additional sample points, the circles in Fig. 4, in such a way as to fill the gaps following a pre-defined pattern, allowing for a more efficient re-utilization of the samples previously generated.
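The sampling strategies compared in Fig. 4 can be reproduced with standard tools. A minimal sketch using SciPy's `scipy.stats.qmc` module (available in SciPy 1.7 and later; the ranges used for scaling are taken from Table 1, the rest is illustrative) is shown below; note how a second call to the Sobol generator simply continues the sequence, which is the re-utilization property mentioned above.

```python
import numpy as np
from scipy.stats import qmc   # requires SciPy >= 1.7

n, k = 128, 2   # number of samples and number of design variables

rng = np.random.default_rng(1)
random_pts = rng.random((n, k))                       # (a) plain random sampling

lhs_pts = qmc.LatinHypercube(d=k, seed=1).random(n)   # (b) Latin hypercube sampling

sobol = qmc.Sobol(d=k, scramble=False)                # (d) Sobol LP_tau sequence
initial = sobol.random(n)                             # first batch of points
additional = sobol.random(n)                          # a later batch fills the gaps left by the first

# Scale the unit-hypercube sample to design-variable ranges,
# e.g. instrument aperture 0.3-1 m and RF power 5-30 W from Table 1
scaled = qmc.scale(initial, l_bounds=[0.3, 5.0], u_bounds=[1.0, 30.0])
```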

#### **4.1 Design of experiments**

An experiment is a test performed to evaluate the outcome of a process given certain settings of the factors believed to influence it. The *experiments* considered in this chapter are all *computer experiments*, performed on the mathematical model at the sample points. However, the Design of Experiments (DoE) practice has older origins than the computer era; it was first introduced by Fisher in 1935. The sampling methods belonging to the category of DoE can be distinguished into Factorial Designs (full or fractional), Orthogonal Arrays and other methods, amongst which, for instance, Central Composite Design (CCD). The common characteristic of these sampling methods is that they are all deterministic. The samples are placed in the design space according to a certain pre-defined geometry, so that also *ordinal* and *categorical* variables can be used in the analysis, rather than only *cardinal* (i.e., continuous) variables as in the previously described sampling techniques. In this case the values of the variables are more properly called *levels*.

Fig. 4. Scatterplots of sampling points in a 2-dimensional design space based on (a) random sampling, (b) Latin Hypercube sampling, (c) sub-optimized Latin hypercube sampling, (Viana et al., 2010), (d) modified Sobol *LP<sup>τ</sup>* sequence. • Initial sample, 100 points. ◦ Additional sample, 100 points.

Fig. 5. Full factorial design with (a) 2 variable-levels and (b) 3 variable-levels in a 3-dimensional design space.

#### **4.1.1 Factorial design**


Factorial design, or *full* factorial design, is a sampling method that foresees one experiment for each possible combination of the levels of the factors. If factor *A* has *a* levels, factor *B* has *b* levels and factor *C* has *c* levels, the total number of experiments is *N* = *a* · *b* · *c*. There are special cases of factorial design where for all the factors only 2 or 3 levels are considered. They are usually called 2*<sup>k</sup>* and 3*<sup>k</sup>* factorial designs respectively, where *k* indicates the number of factors. The experimental structure obtained for 2*<sup>k</sup>* and 3*<sup>k</sup>* factorial designs is shown in Fig. 5 where the dots indicate the sample points.
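Enumerating a full-factorial design is straightforward; the following sketch (a generic illustration, not code from the chapter) builds the *N* = *a* · *b* · *c* experiment matrix with `itertools.product`:

```python
from itertools import product

def full_factorial(levels_per_factor):
    """One experiment (row) for every combination of the factor levels."""
    return list(product(*levels_per_factor))

# Factor A with 2 levels, B with 3 levels, C with 2 levels -> N = 2 * 3 * 2 = 12 experiments
design = full_factorial([[1, 2], [1, 2, 3], [1, 2]])
print(len(design))     # 12
print(design[0])       # (1, 1, 1)

# A 2^k design for k = 3 factors: the 8 corner points
design_2k = full_factorial([[1, 2]] * 3)
print(len(design_2k))  # 8
```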

Full-factorial design requires a number of experiments that increases with the power of the number of factors. Thus, already in the case of 2*<sup>k</sup>* or 3*<sup>k</sup>* factorial designs, the experimentation (i.e., the simulation of the model) can become cumbersome very soon. Therefore, fractional factorial designs were introduced as an attempt to reduce the computational effort of the analysis. As the name suggests, fractional-factorial designs only foresee a fraction of the number of experiments required by a full-factorial design with the same number of factors and the same number of levels. For instance a *one-half* fractional factorial design, or 2<sup>*k*−1</sup> design, requires half of the experiments of the original 2*<sup>k</sup>* design.

| Experiment | A | B | C | D | E | F | G |
|------------|---|---|---|---|---|---|---|
| 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| 2 | 1 | 1 | 1 | 2 | 2 | 2 | 2 |
| 3 | 1 | 2 | 2 | 1 | 1 | 2 | 2 |
| 4 | 1 | 2 | 2 | 2 | 2 | 1 | 1 |
| 5 | 2 | 1 | 2 | 1 | 2 | 1 | 2 |
| 6 | 2 | 1 | 2 | 2 | 1 | 2 | 1 |
| 7 | 2 | 2 | 1 | 1 | 2 | 2 | 1 |
| 8 | 2 | 2 | 1 | 2 | 1 | 1 | 2 |

Table 2. *L*<sub>8</sub> orthogonal array (factor assignment).

All the designs belonging to the category of DoE are also called matrix designs. Indeed their visualization, and their construction, is better understood if represented in the form of a matrix with the factors in the columns and the experiments to perform in the rows; a graphical structure for more than 3 variables becomes hard to visualize, see Table 2. A 2<sup>*k*−1</sup> design is also called a *Resolution 5* design (for *k* > 4). It is also possible to generate fractional-factorial designs that require fewer experiments than Resolution 5. However, the smaller the number of experiments, the less the information that can be obtained, as will be discussed in Section 5.2. Box et al. (1979) provide a thorough discussion on DoE in general. Montgomery (2001), instead, presents a complete overview of factorial designs, methods for obtaining several kinds of designs and their implications. For more detailed analysis we advise referring to the original works.

#### **4.1.2 Orthogonal arrays**

Orthogonal Arrays, OAs, are special matrix designs originally developed by Taguchi (1987). OAs can be used as Resolution 3, Resolution 4, and Resolution 5 designs by properly arranging the columns of the design matrices, (Phadke, 1989). The term *orthogonal* is related to the balancing property, which means that for any pair of columns, all combinations of factor levels are present an equal number of times. In Table 2, the 1*s* indicate the *low* levels, while 2*s* indicate the *high* levels of the design factors.

The *L*<sub>8</sub> orthogonal array of Table 2 is only one amongst the many OAs discussed in (Taguchi, 1987). It is also possible to build three-, four-, and five-level OAs, as well as mixed-level OAs for factors having a heterogeneous number of levels, (Phadke, 1989). An efficient algorithm to generate three-level OAs is discussed by Mistree et al. (1994), while standard tables for other types of orthogonal arrays can be found in (Taguchi, 1987) and (Phadke, 1989).

#### **4.1.3 Other experimental designs**

The major distinction amongst the experimental designs is usually made between first- and second-order designs, as already hinted at before. In the first case the design variables can assume only two levels, while in the second case at least three levels per design variable are considered. The development of second-order designs is mainly related to the need of obtaining information on the curvature of the design space for fitting second-order response surfaces. Box et al. (1979) present a method to compute fractional 3*<sup>k</sup>* factorial designs, the Box-Behnken designs, obtained by combining two-level factorial designs with balanced incomplete block designs. The Central Composite Design, CCD, introduced by Box & Wilson (1951), is built instead using a 2*<sup>k</sup>* factorial design, plus a central point (in the geometric center of the design hyperspace), plus 2*k* points on the axes of every design variable at a distance *α* from the center. In a hyperspace normalized in the interval [−1, 1], a CCD with *α* ≠ 1 will present 5 levels for each variable, while with *α* = 1 it will only require the variables to assume 3 different levels. The interested reader may refer to Box et al. (1979) and Montgomery (2001) for a good overview and discussion of the many types of available experimental designs.

Fig. 6. Mixed-hypercube sampling with 3 discrete and 2 continuous variables.
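Following the CCD construction just described, such a design can be assembled directly from its three building blocks; the sketch below (our own illustration, normalized to the interval [−1, 1]) stacks the 2<sup>*k*</sup> factorial corners, the center point, and the 2*k* axial points at distance *α*.

```python
import numpy as np
from itertools import product

def central_composite(k, alpha=1.0):
    """2^k factorial corners + 1 center point + 2k axial points at distance alpha."""
    corners = np.array(list(product([-1.0, 1.0], repeat=k)))
    center = np.zeros((1, k))
    axial = np.vstack([alpha * np.eye(k), -alpha * np.eye(k)])
    return np.vstack([corners, center, axial])

ccd = central_composite(3, alpha=1.0)   # with alpha = 1 every variable takes only 3 levels
print(ccd.shape)                        # (15, 3): 8 corners + 1 center + 6 axial points
print(np.unique(ccd[:, 0]))             # [-1.  0.  1.]
```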

#### **4.2 The mixed-hypercube approach**


The purpose of the mixed-hypercube approach is to exploit both stratified sampling and DoE to efficiently sample the design space for obtaining information on the effect of both the continuous and the discrete design variables on the performance(s) of interest. The main idea is to separate the continuous variables and the discrete ones in two groups. A matrix design is then created for the discrete variables while for every row of the matrix design a Sobol sequence is generated for the remaining continuous variables. An example with three discrete and two continuous variables is presented in Fig. 6.
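A minimal sketch of the mixed-hypercube idea follows: a full-factorial matrix over the discrete variables and, for each of its rows, a Sobol sample over the continuous variables. The function below is our own illustration (it assumes SciPy's `qmc` module, as in the earlier sampling sketch, and uses illustrative variable ranges).

```python
import numpy as np
from itertools import product
from scipy.stats import qmc   # requires SciPy >= 1.7

def mixed_hypercube(discrete_levels, cont_bounds, n_cont=64, seed=0):
    """Full factorial over the discrete variables x Sobol sample over the continuous ones.

    discrete_levels : list of lists with the admissible levels of each discrete variable
    cont_bounds     : list of (min, max) tuples for each continuous variable
    """
    lows = [b[0] for b in cont_bounds]
    highs = [b[1] for b in cont_bounds]
    plan = []
    for row in product(*discrete_levels):                    # matrix design for the discrete factors
        sobol = qmc.Sobol(d=len(cont_bounds), scramble=True, seed=seed)
        cont = qmc.scale(sobol.random(n_cont), lows, highs)  # same space-filling sample reused for every row
        plan.append((row, cont))
    return plan

# Illustrative: solar-array and thruster type (2 levels each); aperture 0.3-1 m, RF power 5-30 W
plan = mixed_hypercube([[1, 2], [1, 2]], [(0.3, 1.0), (5.0, 30.0)])
print(len(plan), plan[0][1].shape)   # 4 discrete combinations, each with a (64, 2) continuous sample
```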

The advantage of using a matrix design instead of a space-filling technique for the discrete variables is that it allows the levels of the factors to be selected deterministically. When only a few factor levels can be selected (e.g., a database contains a certain number of batteries, or only a limited number of thrusters is considered in the analysis of a satellite system), the maximum number of simulations is determined by a full factorial design. Therefore, its related Resolution 5, 4, and 3 designs are the best way of obtaining samples without disrupting the balance characteristics of the sampling matrix. The modification of a random or pseudo-random technique to sample only at certain levels does not immediately provide such a balance, especially when the number of samples is kept low. On the other hand, in the case of continuous variables, matrix designs alone are less flexible in *filling* the design region and less suitable for the *re-sampling* process than the Sobol technique. The proposed mixed-hypercube sampling approach allows for covering the design region more uniformly than all the other techniques mentioned in this section, already with a low number of samples. The sensitivity-analysis technique described in Section 5 will directly benefit from these characteristics, since convergence of the variance is obtained with a reduced computational effort, for instance. Further, response surfaces for the continuous variables, and linear and interaction graphs for the discrete ones, can be directly computed from the outcome of the simulations, with no additional data manipulation, see Section 6. A more detailed description of the implications of using specific implementations of the mixed-hypercube sampling method in combination with the design approaches presented in this chapter is discussed in the following sections.

#### **5. Sensitivity analysis**

Sensitivity analysis can be defined as the study of the *effect* of a certain input *x* on a given output *Y*. This *effect* can be the result of a local measure, e.g., the measure of a derivative such as (*∂Y*/*∂x*)<sub>*x*=*x*∗</sub>, which requires an infinitesimal variation of the input *x* around a specific value *x*∗. However, the measure of sensitivity can also be obtained when the input ranges over a specified finite interval. In this case the sensitivity analysis is valid over the entire interval of variation spanned by the input factor rather than at a single point only. Therefore this type of sensitivity analysis is often called *global*. The setting of the problem of designing a complex system by selecting the most appropriate combination of input-factor levels is particularly suitable for the implementation of global sensitivity analysis. Indeed, in this context sensitivity analysis is aimed at finding the set of factors that are relevant in the determination of the output, providing information that is valid over the entire design region, even if it represents only a (small) subset of the design space. The main design questions that can be answered by using the global sensitivity analysis technique are the following:

*Amongst all the design factors of the system model, what are those actually influencing the performance of interest? To what extent do these factors influence the performance?*

The answer to these questions, already at an early stage of the design, could bring several advantages to the engineering team. First, it allows the *design drivers* to be identified, i.e., those factors or groups of factors that shall be carefully assessed, because they are mainly responsible for determining the performance of the system. The extent of the influence identified may be useful for checking the adequacy of the model being used for the analysis and for corroborating the underlying analysis assumptions.

#### **5.1 Sensitivity indices**

The relative importance of the factors can be determined on the basis of the reduction of the (unconditional) variance of the output *Y*, *V*(*Y*), due to fixing that factor to a certain (yet unknown) value. A global quantitative measure for the *importance* of the factors, based on their contribution to the variance of the response, was first introduced by Sobol (1993). In (Sobol, 1993) and in (Sobol, 2001), the author presents a formal demonstration of his approach and a method for computing the sensitivity indices (sometimes called Sobol indices). Consider *Y* = *f*(*X*) as the model of interest. *Y* is the response vector while *X* = (*x*<sub>1</sub>, *x*<sub>2</sub>, ··· , *x<sub>k</sub>*) is
the vector with the *k* independent input factors. The method of Sobol discussed here and the regression-based sensitivity analysis described later in this section are in general valid for independent input factors. The case with correlated inputs implies that the correlation structure must be taken into account during the sampling of the design space, leading to higher computational cost, (Saltelli et al., 2004). An effective technique for imposing the correlation between input variables has been proposed by Iman & Conover (1982). However, in the case of systems design using mathematical models, dependencies between factors are very often accounted for within the model itself, leaving only the independent factors as design variables. Sometimes instead, input variables can still be considered independent if the design ranges are carefully selected. In the case of the semi-major axis and the eccentricity discussed in Section 2 one could limit the value of the eccentricity to the maximum possible with the minimum semi-major axis, for instance.

To compute the sensitivity, a sample of *N* points is taken from the model *Y* (performing *N* evaluations of the model *Y*). The unconditional variance *V* (*Y*) can be decomposed as shown in Eq. (1), (Sobol, 1993). The expression in Eq. (1) is the ANOVA (Analysis Of Variance) representation of *V* (*Y*), (Sobol, 2001).

$$V\left(Y\right) = \sum\_{i} V\_{i} + \sum\_{i} \sum\_{j>i} V\_{ij} + \dots + V\_{12\dots k} \tag{1}$$

All the terms of Eq. (1) are conditional variances of the factors indicated by the subscript indices. For instance *V<sub>i</sub>* is the fraction of *V*(*Y*) due to factor *x<sub>i</sub>* only. *V<sub>ij</sub>*, instead, represents the contribution of the interaction of *x<sub>i</sub>* and *x<sub>j</sub>* to *V*(*Y*). The Sobol sensitivity indices are defined as in Eq. (2), (Sobol, 1993). *S<sub>i</sub>*, *S<sub>ij</sub>*, or *S<sub>ij···k</sub>* are sometimes called *first-order sensitivity indices*. They refer to the contribution to the variance of the single terms of Eq. (1). An additional measure of sensitivity is represented by the so-called total-effect sensitivity indices, *S<sub>Ti</sub>*. A total-effect sensitivity index takes into account the unconditional variance of a certain variable *x<sub>i</sub>* considering the first-order and all the higher-order effects in which it is involved. The total-effect sensitivity indices can be computed using Eq. (2), where *V*<sub>−*i*</sub> indicates the contribution to the variance due to all factors but *x<sub>i</sub>* and all the higher-order effects in which it is involved (Saltelli et al., 2004).

$$S\_i = \frac{V\_i}{V\left(Y\right)}\tag{2}$$

$$S\_{Ti} = 1 - \frac{V\_{-i}}{V\left(Y\right)}\tag{2}$$

Global sensitivity indices can be estimated using qualitative or quantitative methods, depending on the purpose of the analysis, on the complexity of the problem and on the available computational resources. A qualitative approach, like the method of Morris (Morris, 1991), allows the relative importance of the factors to be determined with a relatively limited computational effort. It does not provide a precise measure of the percent contribution of the factors to the unconditional variance, thus these methods are usually used as a preliminary analysis to detect and fix the unimportant factors. For this reason, qualitative methods are also called *screening methods*. Techniques like the method of Sobol (Sobol, 1993), or the FAST (Fourier Amplitude Sensitivity Test) (Cukier et al., 1978), require a large number of model evaluations to provide quantitative sensitivity indices of the design factors, especially for terms like *Vij* or *Vij*···*k*. The regression-based sensitivity analysis method described in the following section provides a quantitative measure of the global sensitivity indices with a limited computational effort. The sensitivity indices computed with this method are based on the decomposition of the variance computed by a regression model, providing information on the first-order as well as on the higher-order effects of the factors on the response.
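To make the estimation procedure concrete, the following minimal Python sketch (not part of the original methodology) estimates the first-order and total-effect indices of Eq. (2) with a common Monte-Carlo "pick-freeze" construction using two independent sample matrices, of the kind discussed by Saltelli et al. (2004); the three-factor test function, the uniform input ranges and the sample size are illustrative assumptions.

```python
import numpy as np

def model(x):
    # Illustrative test function with three independent factors in [0, 1]:
    # a linear term, a quadratic term and an interaction.
    return x[:, 0] + 2.0 * x[:, 1] ** 2 + x[:, 0] * x[:, 2]

def sobol_indices(model, k, N=10000, seed=0):
    """Estimate first-order (S_i) and total-effect (S_Ti) indices, Eq. (2)."""
    rng = np.random.default_rng(seed)
    A = rng.random((N, k))          # first independent sample matrix
    B = rng.random((N, k))          # second independent sample matrix
    yA, yB = model(A), model(B)
    V = np.var(np.concatenate([yA, yB]), ddof=1)   # unconditional variance V(Y)
    S, ST = np.empty(k), np.empty(k)
    for i in range(k):
        ABi = A.copy()
        ABi[:, i] = B[:, i]          # matrix A with column i taken from B
        yABi = model(ABi)
        # First-order effect: V_i = V(Y) - 0.5*E[(yB - yABi)^2]
        S[i] = (V - 0.5 * np.mean((yB - yABi) ** 2)) / V
        # Total effect: V(Y)*S_Ti = 0.5*E[(yA - yABi)^2]
        ST[i] = 0.5 * np.mean((yA - yABi) ** 2) / V
    return S, ST

S, ST = sobol_indices(model, k=3)
print("first-order:", S.round(3), "total-effect:", ST.round(3))
```

For a factor that does not interact with the others, the two estimates coincide; a gap between *STi* and *Si* signals that the factor acts mainly through interactions.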

#### **5.2 Regression based sensitivity analysis**

If the design region of interest is not stretched out too much, a polynomial regression model is often sufficient to accurately describe the behavior of the system. This is true for typical models of engineering systems, especially when the source of complexity is represented by the large number of elements and their interrelated behavior rather than by the mathematical models of every single component. However, even when the complexity is related to the highly nonlinear and non-smooth behavior of the mathematical equations linking the design variables, in a relatively small portion of the design space a polynomial regression model is still able to describe the system and explain most (if not all) of the variability of the data.

The Regression-Based Sensitivity Analysis (RBSA) method proposed here is general enough to be applicable to regression models of any order. However, the choice of the regression order depends on several aspects that will be discussed throughout this section. For ease of the discussion, the method will be explained using the second-order model presented in Eq. (3) as a reference.

$$\mathbf{Y} = \beta\_0 + \sum\_{i=1}^{k} \beta\_i \mathbf{x}\_i + \sum\_{i=1}^{k} \beta\_{ii} \mathbf{x}\_i^2 + \sum\_{i=1}^{k-1} \sum\_{j=i+1}^{k} \beta\_{ij} \mathbf{x}\_i \mathbf{x}\_j \tag{3}$$

Here, *βi*, *βii* and *βij* are the so-called regression coefficients that are calculated by fitting a response surface through the points sampled from the model, using the least-squares method. The estimate of the regression coefficients can be computed with the least-squares estimator, for instance:

$$\hat{\beta} = \left(\mathbf{X}'\mathbf{X}\right)^{-1}\mathbf{X}'\mathbf{Y} \tag{4}$$

The fitted model is therefore represented by the following equation:

$$
\hat{\mathbf{Y}} = \mathbf{X}\hat{\boldsymbol{\beta}}\tag{5}
$$

Given a set of observations of a mathematical model, the variance of the data can be computed with the well-known equation:

$$\mathcal{V} = \frac{\sum\_{i=1}^{N} \left(\mathbf{Y}\_i - E(\mathbf{Y})\right)^2}{N - 1} \tag{6}$$

where *E*(**Y**) is the expected value, or mean value, of the model output. The expression at the numerator of Eq. (6) is called sum of squares. Since in this case all the observations are taken into account we will refer to it as the total sum of squares, *SST*. The sum of squares of the regression only, instead, can be computed as follows:

$$SS\_R = \sum\_{i=1}^{N} \left(\hat{Y}\_i - E(\mathbf{Y})\right)^2\tag{7}$$


The *SSR* represents the portion of the total variability that can be explained by the regression model. If the regression model perfectly fits the data, then *SST* = *SSR*. When residuals are present, in the case of lack-of-fit, the portion of the total variability not explained by the regression model can be computed in the form of the error sum of squares, *SSE*:

$$SS\_E = \sum\_{i=1}^{N} \left( Y\_i - \hat{Y}\_i \right)^2 \tag{8}$$

The regression sum of squares, as already mentioned, indicates how much of the observed variability is explained by the fitted model. To obtain the sensitivity indices of all factors that contribute to the total variability of the regression model, the regression sum of squares should be divided into its components, as done in Eq. (1). The main idea is to associate a sensitivity index to the additional variability calculated when a factor is added to the regression model. In Eq. (9) an alternative form of Eq. (5), combined with Eq. (4), is presented.

$$\hat{\mathbf{Y}} = \mathbf{X}\left(\mathbf{X}'\mathbf{X}\right)^{-1}\mathbf{X}'\mathbf{Y} = \mathbf{H}\mathbf{Y} \tag{9}$$

The matrix **H** = **X**(**X**'**X**)<sup>−1</sup>**X**' is called the *hat* matrix. It transforms the vector of the observed responses **Y** into the vector of the fitted values **Ŷ**. Using the hat matrix, the total, regression, and error sums of squares can be expressed with the following relationships:

$$\mathbf{S} \mathbf{S}\_T = \mathbf{Y}' \left[ \mathbf{I} - \frac{1}{N} \mathbf{J} \right] \mathbf{Y} \qquad \qquad \mathbf{S} \mathbf{S}\_R = \mathbf{Y}' \left[ \mathbf{H} - \frac{1}{N} \mathbf{J} \right] \mathbf{Y} \qquad \qquad \mathbf{S} \mathbf{S}\_E = \mathbf{Y}' \left[ \mathbf{I} - \mathbf{H} \right] \mathbf{Y}$$

where **I** is an *N* × *N* identity matrix, and **J** is an *N* × *N* matrix of ones. Given these settings, the RBSA is easy to compute. Let us consider a model in the form of Eq. (3) with three variables only. The compact notation *Yfull* denotes the model computed taking into account all three factors, the 2-factor interactions and the quadratic terms, Eq. (10). Similarly, *Y*−*x*1*x*2 denotes the model computed excluding the factor *x*1*x*2, Eq. (11). The sensitivity index for the factor *x*1*x*2 can thus be computed as shown in Eq. (12).

$$\mathbf{Y}\_{full} = \boldsymbol{\beta}\_0 + \boldsymbol{\beta}\_1 \mathbf{x}\_1 + \boldsymbol{\beta}\_2 \mathbf{x}\_2 + \boldsymbol{\beta}\_3 \mathbf{x}\_3 + \boldsymbol{\beta}\_{11} \mathbf{x}\_1^2 + \boldsymbol{\beta}\_{22} \mathbf{x}\_2^2 + \boldsymbol{\beta}\_{33} \mathbf{x}\_3^2 + \boldsymbol{\beta}\_{12} \mathbf{x}\_1 \mathbf{x}\_2 + \boldsymbol{\beta}\_{13} \mathbf{x}\_1 \mathbf{x}\_3 + \boldsymbol{\beta}\_{23} \mathbf{x}\_2 \mathbf{x}\_3 \tag{10}$$

$$Y\_{-\mathbf{x}\_1\mathbf{x}\_2} = \beta\_0 + \beta\_1\mathbf{x}\_1 + \beta\_2\mathbf{x}\_2 + \beta\_3\mathbf{x}\_3 + \beta\_{11}\mathbf{x}\_1^2 + \beta\_{22}\mathbf{x}\_2^2 + \beta\_{33}\mathbf{x}\_3^2 + \beta\_{13}\mathbf{x}\_1\mathbf{x}\_3 + \beta\_{23}\mathbf{x}\_2\mathbf{x}\_3 \tag{11}$$

$$S\_{\mathbf{x}\_1 \mathbf{x}\_2} = \frac{V\left(Y\right) - V\_{-\mathbf{x}\_1 \mathbf{x}\_2}}{V\left(Y\right)} = \frac{SS\_T - SS\_R\left(Y\_{-\mathbf{x}\_1 \mathbf{x}\_2}\right)}{SS\_T} \tag{12}$$

The conditional variance term *SSR*(*Y*−*xi*) can also be computed and interpreted as the variance determined by excluding the *i*-th design variable from the model. It is equivalent to the notation *V*−*i* used before. In this case the sensitivity indices provide a measure of the total contribution of the variable *xi* to the variance of the performance, considering all the interactions and higher-order effects in which *xi* is involved, see for instance Eq. (13) and Eq. (14). The sensitivity indices *Si* are computed for all the terms of the model indicated in Eq. (3), while the total sensitivity indices *STi* are computed for every design variable.

$$Y\_{-\mathbf{x}\_1} = \beta\_0 + \beta\_2 \mathbf{x}\_2 + \beta\_3 \mathbf{x}\_3 + \beta\_{22} \mathbf{x}\_2^2 + \beta\_{33} \mathbf{x}\_3^2 + \beta\_{23} \mathbf{x}\_2 \mathbf{x}\_3 \tag{13}$$

$$S\_{Tx\_1} = \frac{V\left(Y\right) - V\_{-x\_1}}{V\left(Y\right)} = \frac{SS\_T - SS\_R\left(Y\_{-x\_1}\right)}{SS\_T} \tag{14}$$
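As an illustration of how Eqs. (3)–(14) fit together, the short Python sketch below fits the full second-order model by least squares, checks *SSR* against *SST*, and obtains the RBSA index of each term by dropping the corresponding column and refitting; the sampled test model, the design ranges and the sample size are invented for the example and are not taken from the chapter's case study.

```python
import numpy as np
from itertools import combinations

def second_order_terms(X):
    """Build the regressor matrix of Eq. (3): intercept, linear, quadratic
    and two-factor interaction columns, together with their labels."""
    N, k = X.shape
    cols, labels = [np.ones(N)], ["1"]
    for i in range(k):
        cols.append(X[:, i]);           labels.append(f"x{i+1}")
    for i in range(k):
        cols.append(X[:, i] ** 2);      labels.append(f"x{i+1}^2")
    for i, j in combinations(range(k), 2):
        cols.append(X[:, i] * X[:, j]); labels.append(f"x{i+1}x{j+1}")
    return np.column_stack(cols), labels

def ss_regression(Xr, Y):
    """Regression sum of squares SS_R of Eq. (7) for the regressor matrix Xr."""
    beta, *_ = np.linalg.lstsq(Xr, Y, rcond=None)   # least-squares fit, Eq. (4)
    Yhat = Xr @ beta                                # fitted values, Eq. (5)
    return np.sum((Yhat - Y.mean()) ** 2)

# Illustrative model with three continuous factors sampled uniformly in [0, 1].
rng = np.random.default_rng(1)
X = rng.random((40, 3))
Y = 2 * X[:, 0] + X[:, 1] ** 2 + 3 * X[:, 0] * X[:, 1] + 0.5 * X[:, 2]

Xfull, labels = second_order_terms(X)
SST = np.sum((Y - Y.mean()) ** 2)        # total sum of squares, numerator of Eq. (6)
SSR = ss_regression(Xfull, Y)
print("SS_R / SS_T =", SSR / SST)        # lack-of-fit check: is SS_R close to SS_T?

# Sensitivity index of each term, Eq. (12): drop the column and refit.
for idx in range(1, Xfull.shape[1]):
    Xred = np.delete(Xfull, idx, axis=1)
    S = (SST - ss_regression(Xred, Y)) / SST
    print(f"S[{labels[idx]}] = {S:.3f}")
```

The total index of a design variable, Eq. (14), is obtained in the same way by dropping all the columns in which that variable appears before refitting.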

The validity of the sensitivity indices computed with RBSA depends on the lack-of-fit of the regression model with respect to the sample data. Indeed, particular attention must be paid to the ratio between the regression and the total sum of squares. If *SSR* is close to *SST*, then the regression model is able to account for a large part of the output variance, and as a consequence the sensitivity indices are meaningful measures. If this is not the case, lack-of-fit is present meaning that important terms are missing from the initially assumed regression model. However, this information is still important to decide whether to proceed with sensitivity analysis anyway or to modify the initial assumption and increment the order of the regression model by adding extra terms, i.e., higher-order terms like cubic or higher order interactions. Regression models of higher order require a larger number of samples to estimate the effect of all the terms included in the model.

The minimum number of samples for building a regression model is equal to the number of terms present in the model plus one. However, we suggest collecting a set of additional samples, which may vary from 4 to 6 times the number of variables, to allow the values of *SST* and *SSR* to stabilize. At first sight, this iterative approach may seem inefficient, due to the re-sampling of the design region. However, if the design space is sampled using the mixed hypercube approach presented in the previous section, the samples taken in one iteration can be efficiently re-used in the subsequent one. For continuous variables this is demonstrated in Fig. 4. For discrete variables, the possibility of reusing the previous samples to compute new results is due to the deterministic structure of a factorial design. Going from a Resolution 3 design to a Resolution 4, a Resolution 5, or eventually a full factorial design guarantees that the additional samples are different from the previous ones, making it possible to maintain the balanced structure of the matrix design.

When working with factorial designs, the problem of *aliasing*, or *confounding*, is often experienced. The aliasing effect is the impossibility of discerning the effect of two different factors or interactions of factors. Observing Table 2, it is clear that the effect of factor C is equal to (is confounded with) the effect of the interaction AB. In fact, column C is obtained by an *xor* operation between columns A and B. In general, for a Resolution 3 design no main effects are confounded with any other main effect, but main effects are confounded with two-factor interactions (and higher-order ones) that may also be confounded with each other. The design in Table 2 is a Resolution 3 design, for instance. For a Resolution 4 design, no main effects are confounded with any other main effect or with any two-factor interaction, but two-factor interactions can be confounded with each other and with higher-order interactions. Resolution 5 designs allow for experimentation with no main effect or two-factor interaction confounded with any other main effect or two-factor interaction, although two-factor interactions can be confounded with higher-order interactions, (Box et al., 1979) and (Montgomery, 2001). For this reason, when selecting the type of matrix design for the discrete variables in the mixed-hypercube sampling approach, it is necessary to match the *resolution* of the matrix design with the number of samples required to compute the desired effects. For instance, a Resolution 3 design is sufficient to compute linear effects only; more sample points are needed to take into account also the interactions (Resolution 4 and 5 for 2-factor interactions, a full factorial for higher-order interactions) and, as mentioned already, more than two levels per variable are required to estimate quadratic effects.
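The aliasing mechanism described above can be reproduced with a few lines of code. The sketch below, based on the 0/1 and ±1 codings commonly used for two-level designs (an assumption, since Table 2 is not reproduced here), generates factor C as the xor of columns A and B and shows that its column coincides, up to sign, with the AB interaction column.

```python
import numpy as np
from itertools import product

# Two-level factors A and B coded 0/1; factor C is generated as C = A xor B,
# which is how the third column of a 2^(3-1) Resolution 3 design is obtained.
AB = np.array(list(product([0, 1], repeat=2)))
A, B = AB[:, 0], AB[:, 1]
C = np.bitwise_xor(A, B)
print(np.column_stack([A, B, C]))      # the four runs of the design

# In the usual -1/+1 effect coding the same design becomes:
a, b, c = 2 * A - 1, 2 * B - 1, 2 * C - 1
print("C column:        ", c)
print("A*B interaction: ", a * b)      # identical up to sign: C is aliased with AB
```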

#### **5.3 The Earth-observation mission, sensitivity analysis**


Fig. 7. Bar plots indicating the first-order sensitivity indices computed with the RBSA method.

In Fig. 7 the results from the sensitivity analysis on the model of the Earth-observation mission, computed using RBSA, are presented. The first-order sensitivity indices are visualized for the constraints (top three graphs) and the objectives (lower two graphs) discussed in Section 3. The results are obtained using a second-order model, see Eq. (3), re-sampled for additional cubic terms of the factors. Two full-factorial designs (3-level and 2-level) have been used for the discrete factors (A) and (B), and (J), (K), and (L), respectively (Table 1). For the continuous variables, instead, the Sobol sequence required 60 samples. The bars represent the percent (divided by 100) contribution of the factors indicated on the horizontal axis of the graphs, their interactions (when the product of two factors is indicated), and their quadratic effects (when the product of the factor by itself is indicated) to the variability of the constraints and the objectives. Cubic effects were limited. Their contribution and the contribution of all the other effects that are not explicitly shown in the bar plots, have been encapsulated in the bars named *Other*.

The first conclusion is that the factors (E), (F), (G), (J), and (K) have a limited effect on the objectives and constraints, probably less than one would expect since some of them are related to the propellant utilization on board, which is usually a mass driver, thus with an effect on the cost. They can thus be fixed to a certain level/value with a minor impact on the mission. The other design variables, instead, present contrasting behaviors. The instrument aperture diameter (factor C), for instance, affects the mass of the satellite and the satellite cost (the larger the diameter, the larger the mass and the cost, reasonably) but also the down-link margin. The minimum elevation angle for the observation (factor D) has an effect on the coverage (the smaller D is, the better) and on the resolution at the edge of the swath (the larger D is, the better). However, factor (D) also has some influence on the down-link margin constraint. The effect of factors (C) and (D) on the down-link margin constraint, rather than the more obvious impact of the antenna diameter (factor I) and the transmitter RF power output (factor H), can be explained as follows. After these results were obtained, a close investigation of the model led us to the relationship between the instrument aperture diameter and the *angular resolution*, which is related to the *pixel angular resolution*, thus to the *number of pixels* and finally to the *real-time data rate*, which causes the influence on the link margin. The elevation angle, instead, is related to the atmospheric attenuation, which increases as the path to the receiver increases (i.e., as the minimum elevation angle decreases). Many conservative assumptions were made for this application case. One of them is the fact that communication takes place with a ground station at the edge of the instrument swath width. The results of the sensitivity analysis will be used in the subsequent phase of the design methodology, as presented in the following section.

#### **6. Graphical support to the engineering team**

The information gathered during the sensitivity analysis is a roadmap for the engineering team to efficiently direct the design effort. The non-influential design factors can be fixed to a pre-determined level, because they will not affect the performance much, *de facto* reducing the dimensions of the design search-space. However, the influential design variables and the behavior of the system under the effects caused by their variation and their interactions shall be investigated in more detail. Indeed, the same samples used for sensitivity analysis can be used again to compute and present the response surfaces and the variable-trends linking the most influential design factors to the performance, in case of continuous variables. For discrete variables, linear and interaction graphs are computed and presented instead. The design questions that need an answer at this stage of the design process of a complex system are the following:

*What is the shape of the design-region? What are the best parameter settings to optimize the objectives and meet the constraints? What are the best system alternatives?*

#### **6.1 Response surfaces for continuous variables**

The subject of Response Surface Methods, RSM, includes the procedures for sampling the design space, performing regression analysis, testing for model adequacy and optimizing the response (Kuri & Cornell, 1996). The first three steps of the RSM are already in place, as previously discussed. The iterative approach of the RBSA, besides giving quantitative information on the sensitivity indices, also provides the regression coefficients, computed with Eq. (4), related to the best-found sample-fitting regression model. Thus, at this stage of the methodology, a surrogate model that links the design variables to the performance is available, see Eq. (5). Therefore, it is possible to visualize the trends of the objectives and the constraints as a function of the continuous design variables for each combination of discrete-variable levels. Response surfaces, and their bi-dimensional representation called contour plots, can effectively represent the shape of the subspace formed by two continuous variables. When only one continuous variable is of interest, single-variable trends are a valid alternative to contour plots.
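As a purely illustrative sketch of this step, the code below evaluates a placeholder second-order surrogate of the form of Eq. (3) over a grid of two continuous variables and draws the corresponding contour plot; the coefficient values, variable names and ranges are invented and would in practice come from the RBSA least-squares fit of Eq. (4).

```python
import numpy as np
import matplotlib.pyplot as plt

# Placeholder second-order surrogate in two continuous variables (Eq. (3) form);
# in practice the beta coefficients come from the least-squares fit, Eq. (4).
beta = {"0": 1.0, "x1": 0.8, "x2": -0.5, "x1^2": 0.3, "x2^2": 0.1, "x1x2": -0.6}

def surrogate(x1, x2):
    return (beta["0"] + beta["x1"] * x1 + beta["x2"] * x2
            + beta["x1^2"] * x1**2 + beta["x2^2"] * x2**2 + beta["x1x2"] * x1 * x2)

# Evaluate the surrogate on a grid of the two variables and draw the contour plot.
x1, x2 = np.meshgrid(np.linspace(0, 1, 100), np.linspace(0, 1, 100))
plt.contourf(x1, x2, surrogate(x1, x2), levels=20, cmap="viridis")
plt.colorbar(label="predicted performance")
plt.xlabel("design variable x1")
plt.ylabel("design variable x2")
plt.title("Contour plot of the fitted response surface")
plt.show()
```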

Contour plots and single-variable trends could in principle also be computed for discrete variables, since the regression coefficients are available from the RBSA. However, the regression of a continuous function for intermediate discrete-variables levels would not be significant. To visualize the average effect of the discrete variables on the objectives and the constraints, linear and interaction graphs can be computed instead with the method shown in the following subsection.

#### **6.2 Linear and interaction graphs for discrete variables**


Consider the analysis of a system with *M* discrete factors [*A*, *B*, ··· , *M*], each with a different number of levels [*a*, *b*, ··· , *m*], and *L* continuous ones. Thus, there are *M* + *L* = *K* design variables that form a *K*-dimensional design space. Referring to Figure 6, the matrix-design for the discrete variables would be an *a* × *b* × ··· × *m* hypercube (considering a full factorial), while, concerning the Sobol sequence for the continuous factors, let us assume that *l* sample points are required for each combination of discrete design-variable levels. Once the design space has been sampled and the simulations executed, the responses of the system's model can be analyzed.

Let *Y*··· represent the sum of all the responses obtained during the simulations, and *Yi*··· the sum of all the responses with factor *A* at level *i*:

$$Y\_{\cdots} = \sum y = \sum\_{i=1}^{a} \sum\_{j=1}^{b} \dots \sum\_{w=1}^{m} \sum\_{s=1}^{l} y\_{ij \dots ws} \qquad\qquad Y\_{i\cdots} = \sum\_{j=1}^{b} \dots \sum\_{w=1}^{m} \sum\_{s=1}^{l} y\_{ij \dots ws}$$

Normalizing *Yi*··· by the number of experiments for which the variable *A* is at level *i*, *n* = *b* × ··· × *m* × *l*, we compute the average value of the performance for *A* at level *i*:

$$\mathbb{C}\_{A\_i} = \frac{Y\_{i\dots}}{n} \tag{15}$$

The values of *CAi* plotted against the objective values provide the so-called linear graphs. Besides showing the trend of the objectives under the variation of a single discrete variable (with the effects of all the other variables averaged out), they also show the possible presence of higher-order effects, if more than two levels per factor are available from the sampling procedure. In the case of ordinal discrete variables, e.g., the number of batteries in a satellite, the higher-order effects may have a certain significance, indicating that the performance is not linear with increasing values of that factor. In the case of categorical variables instead, e.g., the type of batteries to be implemented in the power subsystem or the type of launcher to be used for the mission, the higher-order effects are not so significant *per se* since there is no *increasing* or *decreasing* direction. This aspect has an implication on the type of matrix design selected for sampling the sub-space formed by the discrete variables only. In principle, all the combinations of categorical design factors shall be experimented with. Each one of these combinations represents a different system architecture that needs to be explicitly assessed. For the ordinal design factors, instead, fractional-factorial designs may suffice to compute their effect on the output, due to the fact that these types of variables usually have monotonic trends. However, this does not always have to be the case, thus an accurate matrix-design selection has to be made by the engineering team depending on the type of problem at hand.

The interaction between two discrete variables can be computed using an approach similar to that used before. For the interaction between factor A and factor B, for instance, a matrix with dimensions equal to *a* × *b* is filled with the following coefficients:

$$\mathbb{C}\_{A\_{i}B\_{j}} = \frac{Y\_{ij\cdots}}{r} \tag{16}$$

In this case *Yij*··· indicates the sum of the *r* = *c* × ··· × *m* × *l* responses with the factor *A* at level *i* and factor *B* at level *j*. For each level of *A*, for instance, *b* average performance values can be plotted against the objective values, providing the so-called interaction graphs, see Fig. 8. When the lines of an interaction graph are not parallel, this indicates the presence of synergistic or anti-synergistic effects, i.e., interactions. A synergistic effect is present when the improvement of a performance given by the variation of a factor is enhanced by the variation of another one. An anti-synergistic effect is the exact opposite, (Phadke, 1989). In Fig. 8, the higher-order behavior of the objective under the variation of the variable levels is indicated by the fact that the lines are not perfectly straight over the three levels of variable A, for instance.

Fig. 8. Interaction graphs with 2 discrete variables at 3 levels: (a) synergistic interaction, (b) anti-synergistic interaction, (c) no interaction. Adapted from (Phadke, 1989).
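A compact sketch of how the coefficients of Eqs. (15) and (16) can be computed from a table of simulation responses is given below; the response values, the number of levels and the number of repetitions are illustrative assumptions (two discrete factors only, so that *n* = *b*·*l* and *r* = *l*).

```python
import numpy as np

# Illustrative set of simulation responses indexed by the levels of two discrete
# factors A (a = 3 levels) and B (b = 3 levels), with l continuous-sample repetitions.
a, b, l = 3, 3, 5
rng = np.random.default_rng(2)
y = rng.random((a, b, l))            # y[i, j, s]: response with A at level i, B at level j

# Eq. (15): average performance per level of A (ordinates of the linear graph).
C_A = y.sum(axis=(1, 2)) / (b * l)   # Y_i... / n   with n = b*l here
# Eq. (16): average performance per (A level, B level) pair (interaction graph).
C_AB = y.sum(axis=2) / l             # Y_ij... / r  with r = l here

print("C_Ai  :", C_A.round(3))
print("C_AiBj:\n", C_AB.round(3))
# Non-parallel rows of C_AB, plotted against the levels of A, indicate an interaction.
```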

The interactions between continuous and discrete variables, possibly detected by the sensitivity analysis, can be graphically presented using a mix of contour plots, or single-variable trends, and linear graphs, as will be shown in the following subsection.

The synergistic utilization of the results from the sensitivity analysis, the RSM, and the linear and interaction graphs allows the engineering team to focus only on the most relevant trends, identified with the sensitivity analysis, and to more effectively select the best combination of design-variable levels. The purpose of this methodology is to support the engineering team and the decision-makers in the design process and trade-off analysis, and we believe that with this combination of mathematical techniques and graphical results the initial goal is accomplished. However, at this stage of the methodology, the surrogate model could also be used with automatic optimization techniques to provide an optimum (in case of a single objective) or a Pareto front of optimal solutions (in case of multiple objectives). A discussion on single- or multiple-objective optimization techniques is beyond the scope of this chapter. A vast amount of literature dealing with this topic can be found by the interested readers. Coello Coello et al. (2007) and Back et al. (2000), for instance, provide a broad overview and many references.

#### **6.3 The Earth-observation mission, visualization of the design region**


The results obtained with the sensitivity analysis in the previous section suggested that some variables influence the objectives and the constraints more than others. This allowed us to reduce the number of important graphs and to focus the attention on only a few of them. Indeed, the graphs in Fig. 9 are an alternative, more detailed and more focused, way of looking at the same data used to compute the sensitivity analysis.

In the interaction graph of Fig. 9(a) the two discrete variables related to the orbit of the satellite are considered. For all the levels of (A) and (B), the average (as previously discussed in this section) value of the equatorial coverage is plotted. The number of days for a repeating ground-track and the total number of orbits in that time period have a synergistic effect on the coverage. In particular, as expected, with a higher orbit (e.g., 13 orbits in 1 day and *H* = 1258.6 km) the average equatorial coverage is larger compared to a case with a lower orbit (e.g., 29 orbits in 2 days and *H* = 725.2 km). The combinations of factor levels A1-B3 (i.e., 15 orbits in 1 day), A2-B3 (i.e., 30 orbits in 2 days), and A3-B3 (i.e., 45 orbits in 3 days) lead to the same configuration, since the altitude of the orbit is the same, *H* = 567.5 km. The comparison between the performances of an A1-B1 configuration and an A3-B2 configuration on the resolution at swath-edge, and on the equatorial coverage, as a function also of the minimum elevation angle and the instrument aperture diameter (factor C), is presented in Fig. 9(b). The light-gray area represents the revisit time constraint for the A3-B2 configuration, set as 100% of equatorial coverage in 24 h. The dark-gray area represents the same constraint for the A1-B1 configuration. A higher orbit (dashed lines in Fig. 9(b)) makes it possible to meet the re-visit constraint with a larger minimum elevation angle, thus also improving the resolution performance at the edge of the swath. For the A3-B2 configuration, with (*D*) = 30◦ and the instrument aperture diameter equal to 0.7 m, the resolution at the edge of the swath is 12.7 m/pixel, and 1.26 m/pixel at the subsatellite point. For the A1-B1 configuration, instead, the resolution at the subsatellite point is slightly worse, i.e., 2.2 m/pixel, but at the edge of the swath a resolution of 7 m/pixel can be obtained. Further, for an A1-B1 configuration, the fact that the minimum elevation angle can be up to 30◦ gives the satellite the possibility to actually observe over the entire geometrical swath width with the maximum possible slewing angle, i.e., (*E*) = 50◦, and at a higher resolution than an A3-B2 configuration. The aperture diameter of the instrument, paradoxically, plays a more relevant role in the determination of the data rate, thus of the down-link margin, than of the actual resolution, as demonstrated by the sensitivity analysis. Indeed, in Fig. 9(d) the down-link margin constraint is plotted as a function of the instrument aperture diameter and the minimum elevation angle, for the configuration A1-B1 and with (*H*) = 30 W and (*I*) = 1 m. An A3-B2 configuration would push the coverage constraint down, with the side result of allowing less flexibility in selecting the instrument aperture diameter. The effect on the cost is plotted in Fig. 9(c). The assumption is that a higher orbit would require fewer manoeuvres for pointing the instrument of the satellite in one particular direction, and the effect is a reduced cost (difference between the full and the dashed lines). The constraint on the launcher mass availability is mainly driven by the instrument aperture diameter. Indeed, the mass and power consumption of the payload are scaled with the diameter, and so are the mass of the satellite and its cost. The *Delta II* class of launchers allows for enough flexibility up to a payload aperture diameter of about 0.9 m.

Fig. 9. Analysis main results. Δ is a tentatively selected baseline. The light-gray area of (b) represents the revisit time constraint for the A3-B2 configuration, set as 100% of equatorial coverage in 24 h. The dark-gray area of (b) represents the same constraint for the A1-B1 configuration.

The triangles in Fig. 9 represent a tentative selection of the baseline. In particular, an A1-B1 architecture has been selected, with (*C*) = 0.7 m, (*D*) = 30◦, (*E*) = 50◦, (*F*) = 120 s, (*G*) = 10000, (*H*) = 30 W, (*I*) = 1 m, (*J*) = 2, (*K*) = 2, (*L*) = 1. With these settings of the design variables, a confirmation experiment was performed on the model. The simulation yielded a satellite cost of 188 *M*\$*FY*2010, a mass of 1330 kg and an overall power consumption of 1 kW. The resolution at the edge of the swath is 7.3 m/pixel and 2.2 m/pixel at the sub-satellite point. The equatorial coverage after 24 h is 100% and the down-link margin is 4.1 dB. The results from the verification experiment are very close to the values that can be read from the graphs in Fig. 9. This indicates that the sampling technique and the regression analysis provided reliable results. Sensitivity analysis and graphical support in the form of contour plots, variable trends and interaction graphs enabled a thorough reasoning on the phenomena involved. This allowed us to quickly select a system baseline that meets the constraints while balancing the objectives under analysis.

#### **7. Uncertainty analysis and robust design**

Uncertainty analysis and robust design are often considered complementary design activities implemented for determining the performances of the system under uncertain operating conditions. In particular, uncertainty analysis is the study of the uncertain distribution characteristics of the model output under the influence of the uncertainty distributions of the model inputs. With these settings, the purpose of uncertainty analysis is to *simply* propagate the uncertainty through the model. When the analysis presents both controllable and uncontrollable factors, the latter being intrinsically uncertain parameters (e.g., operating environmental conditions), the purpose of the uncertainty analysis is to obtain settings of the controllable design variables that optimize the performances while at the same time minimizing the impact of the uncertainties on the system. In this case we talk about robust design.

In general, uncertainty can be classified into two types, stochastic and epistemic. The stochastic or aleatory uncertainty describes the inherent variability associated with a certain phenomenon. It is usually modeled by stochastic processes when there is enough information to determine the probability distributions of the variables. The epistemic uncertainty is characterized instead by a lack of knowledge about a specific characteristic of the system. Seen in this perspective, the values of the controllable design variables and their related uncertainty can be classified as epistemic and, as discussed in the previous section, these variables are modeled as uniformly distributed between a minimum and a maximum value. However, epistemic uncertainty can also be related to uncontrollable factors for which there is too little information for determining a proper probability distribution. In this case the use of uniform distributions to characterize their uncertainty has been criticized, for the main reason that a phenomenon for which there is a lack of knowledge cannot be represented by any specific probability distribution (Helton et al., 2006).

For the design of a complex system, in case of both epistemic and stochastic uncertainty, probability theory alone is considered to be insufficient for a complete representation of the implications of the uncertainties on the performances. Therefore, in the following subsection we introduce a unified method for propagating the uncertainty through the model, in the presence of stochastic and epistemic uncertain factors. The main design questions we will try to answer in this section are the following:

*In case of uncertainties of any type, how do they propagate through the model of the system? What are the factors that are mostly responsible for performance dispersion? How robust is the design?*

#### **7.1 The unified sampling method**


In this subsection we introduce a modified implementation of the Sobol sampling technique. A Sobol sequence only allows sampling uniformly over the design space. Uniform distributions are the only distributions needed for the design variables when the purpose of the analysis is to select a certain baseline that optimizes the performances, as discussed in the previous sections. The unified sampling technique, instead, makes it possible to cope with any type of epistemic and stochastic distributions of the uncertain factors, which is typical when the focus of the analysis is that of propagating the uncertainty throughout the model.

The problem of determining the probability distribution of the output, given the probability distributions of the inputs of a model, is related to the computation of a multi-dimensional integral. A direct numerical integration or the analytical solution of the integral can become practically infeasible with already few uncertain variables. Therefore, the direct Monte-Carlo simulation is amongst the most widely adopted methods for uncertainty analysis, since it does not require any type of *manipulation* of the model. When it comes to long-running models, as is usually the case for complex space systems in a collaborative environment, the method of Monte Carlo, using random-sampling techniques, has the recognized disadvantage of being computationally expensive, since it generally requires a large number of simulations to compute the mean, the variance and a precise distribution of the response (Rubinstein, 1981). Helton & Davis (2003) compare LHS with a random sampling technique for the propagation of uncertainty into mathematical models. Their analysis corroborates the original results obtained by McKay et al. (1979), and demonstrates that stratified sampling provides more stable Cumulative Distribution Functions (CDFs) of the output than random sampling, with the result that fewer samples are required for a given accuracy in the determination of the CDFs.

Fig. 10. Representation of the cumulative distributions of (a) the epistemic uncertain variable, and (b) the stochastic (normal) uncertain variable. The dashed lines connect the BPAs to the relative uncertainty intervals. The arrows represent the projection of the sample points from the BPAs domain to the uncertainty-intervals domain.

As discussed previously, epistemic uncertainty must also be considered for the design of a complex system. Thus, for the development of the unified sampling technique presented in this section, we inherit some ideas and some nomenclature from the evidence theory derived from the initial work of Dempster (1967; 1968) and Shafer (1976). When lack of knowledge about a certain system behavior is present, and when the available historical and statistical sources become sparse, the engineering team is forced to evaluate and combine disparate data sources, not perfectly tailored to the purpose at hand, based on judgmental elements. Structured expert judgment is increasingly accepted as scientific input in quantitative models, and it is dealt with in a number of publications, see for instance (Cooke, 1991) and (O'Hagan & Oakley, 2004). The result of the combination of expert judgments on the uncertainty of a specific phenomenon leads to the creation, for every single uncertain factor, of so-called *Basic Probability Assignments*, BPAs. The BPAs represent the level of confidence that the engineering team has in the fact that the value of the factor of interest lies in a certain interval of possible values. The uncertainty interval is divided into *n* subsets, and for each of them a certain belief, or probability, that the actual value of the uncertain parameter will lie within that subset is assigned. The set of the *n* beliefs forms the BPA for the factor under analysis.

Consider for instance the epistemic uncertain factor (A) in Figure 10(a). The uncertainty interval of factor A is equal to [0, 1], divided into 3 subsets [0, 0.2] ∪ [0.2, 0.5] ∪ [0.5, 1]. Suppose that the judgment of the engineering team-members on the uncertainty structure of factor A leads to the conclusion that the actual value of A will lie in the subset [0, 0.2] with a probability equal to 0.4, in the subset [0.2, 0.5] with a probability equal to 0.3, and in the subset [0.5, 1] with a probability of 0.3. Thus the BPA of factor A is equal to [0.4, 0.3, 0.3] and its cumulative function is reported on the y axis of Figure 10(a).


The idea is to extend the concept of the BPA also to the stochastic variables, in such a way as to obtain a unique representation of the uncertainty structure of the inputs. For a stochastic variable the cumulative distribution function is continuous. If the uncertainty interval of the stochastic factor is discretized into *m* subsets, then the discretized CDF can be expressed in the form of Basic Probability Assignments, as in the case of the epistemic uncertain factors. Consider the normally distributed uncertain factor (B) of Figure 10(b). Its uncertainty interval is equal to [0, 10], divided into 7 subsets, for instance, as reported on the x axis of Figure 10(b). According to the CDF of the normal distribution, the BPAs associated with this discretization are the following: [0.0480, 0.1110, 0.2106, 0.2608, 0.2106, 0.1110, 0.0480]. The cumulative BPAs are reported on the y axis of Figure 10(b). In the case of stochastic uncertainty, there is the possibility of having infinite tails of the distributions, as in the case of the normal one. However, if the minimum and the maximum values of the uncertainty intervals represent a high percentile, e.g., 0.95 and 0.05, or 0.99 and 0.01 (as in the case of factor A), the error is acceptably small in most of the cases. In Figure 10(b) the gray areas represent the error that arises when considering a truncated normal distribution. The probabilities of the first and the last intervals are overestimated by a quantity equal to the smallest truncation percentile (0.01 in this case).

Fig. 11. Unified sampling method. Representation of (a) the uniform sampling in the BPAs domain, and (b) the corresponding sample points in the uncertainty-intervals domain.

The unified sampling method is executed in two steps. First, a uniform sampling on the space formed by the cumulative values of the BPAs is executed, Fig. 11(a). Then, each sample point within each interval in the BPA domain is mapped to the corresponding point in the uncertainty interval domain, Fig. 11(b). This passage from the BPAs domain to the uncertainty intervals domain is also represented by the arrows in Figure 10. With this procedure the final sample is collected according to the aleatory/epistemic probability distribution of the factors.
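The two-step procedure lends itself to a compact implementation. The following is a minimal sketch, assuming NumPy is available; the helper name `unified_sample` and the factor settings simply reproduce the illustrative BPAs of factors A and B discussed above, and the code is not the authors' original implementation.

```python
# Minimal sketch of the two-step unified sampling (illustrative, not the authors' code).
import numpy as np

def unified_sample(edges, bpa, u):
    """Map uniform samples u in [0, 1) from the cumulative-BPA domain
    to the uncertainty-interval domain.

    edges : interval boundaries of the factor, length n+1
    bpa   : basic probability assignment, length n (sums to 1)
    u     : uniform samples drawn in step 1
    """
    edges, bpa = np.asarray(edges, float), np.asarray(bpa, float)
    cum = np.concatenate(([0.0], np.cumsum(bpa)))              # cumulative BPA
    idx = np.searchsorted(cum, u, side="right") - 1            # BPA interval hit by u
    idx = np.clip(idx, 0, len(bpa) - 1)
    frac = (u - cum[idx]) / bpa[idx]                           # position inside that interval
    return edges[idx] + frac * (edges[idx + 1] - edges[idx])   # step 2: map to the factor interval

rng = np.random.default_rng(42)
u = rng.random((200, 2))                                       # step 1: uniform sample in the BPA domain

# Factor A (epistemic): intervals [0, 0.2], [0.2, 0.5], [0.5, 1] with BPA [0.4, 0.3, 0.3].
a = unified_sample([0.0, 0.2, 0.5, 1.0], [0.4, 0.3, 0.3], u[:, 0])
# Factor B (stochastic): [0, 10] in 7 subsets with the BPAs of the truncated normal CDF.
b = unified_sample(np.linspace(0.0, 10.0, 8),
                   [0.0480, 0.1110, 0.2106, 0.2608, 0.2106, 0.1110, 0.0480], u[:, 1])
print(a[:5], b[:5])
```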

Experience and common sense tell that the more BPA intervals there are, the better the approximation of the output probability distribution. However, in the case of epistemic-factor uncertainty the number of BPA intervals depends on the degree of knowledge of the engineering team on the behavior of the factors themselves. If the initial uniform sampling is performed according to a stratified technique, the resulting response CDF will be more stable than what could be obtained by using a random technique, as demonstrated by Helton & Davis (2003) and McKay et al. (1979). Further, if a Sobol sequence is implemented, all the advantages already discussed in the previous sections still hold. This is particularly true if seen in the perspective of computing the sensitivity analysis using the RBSA, which is directly applicable if the unified sampling method is used. The computation of sensitivity analysis under uncertainty settings makes it possible to identify the contribution of the inputs to the uncertainty in the analysis output, so as to drive the effort in better describing the uncertainty of only the most relevant factors.
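As a side note on the stratification argument, the step-1 uniform sample could be drawn from a Sobol sequence rather than from a pseudo-random generator. The snippet below is a hedged illustration assuming SciPy (version 1.7 or later, which provides `scipy.stats.qmc`) is available; the resulting matrix would simply replace `u` in the previous sketch.

```python
# Illustrative: a Sobol-sequence uniform sample for step 1 of the unified sampling.
from scipy.stats import qmc

sampler = qmc.Sobol(d=2, scramble=True, seed=1)   # one dimension per uncertain factor
u_sobol = sampler.random(256)                     # Sobol points are best drawn in powers of two
```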

#### **7.2 The Earth-observation mission, uncertainty analysis for cost, mass and power budgets**

In the traditional systems engineering process, design margins are used to account for technical budget uncertainties, e.g., typically for cost, mass and power. A certain percentage of the baseline's performance is added to account both for uncertainties in the model and for uncertainties about assumptions made in a preliminary phase that will likely be modified in advanced phases, due to an increased level of detail and knowledge. For instance, the results presented in the previous section were obtained with a 15% margin on the total satellite mass, total power consumption and propellant stored on board. The results without margins would be different: in particular, the satellite mass would be equal to 1048 kg, the power consumption equal to 830 W, with a cost saving of 15 *M*\$*FY*2010. The unified sampling method allows the engineering team to obtain more insight into the uncertainty structure of the solution by focusing on every single source of uncertainty. This enables a more informed decision-making process on the allocation of the budgets to each subsystem and each element.

In the case of the Earth-observation mission we considered the uncertain parameters and the uncertainty structure presented in Table 3. A mix of normal, log-normal and epistemic distributions has been considered. The normal and the log-normal uncertain variables are centered on the values needed to obtain the results presented before. The epistemic uncertainty intervals and BPAs are determined in such a way that the value of the factors needed to obtain the previous results is at the center of the first epistemic interval. Using the unified sampling method, with 200 samples, we obtained the results shown in Fig. 12. In Fig. 12(a,b,c) the probability density estimates of the performances are presented. The histograms are plotted with an adjusted scale, so that the total area of the bars is equal to 1. The black and gray arrows are positioned at the values of the performances previously computed with and without margins, respectively. It is clear that the margins approach does not provide the same insight as the PDFs and the histograms into the performances of the system under uncertain factors. In particular, the PDF and CDF trends shown in Fig. 12 allow the engineering team to better understand the behavior of the system under analysis, bringing two main advantages. First, the uncertainty can be allocated to single subsystems and single elements more effectively using the unified sampling method. Second, the final performance can be precisely assessed according to the desired confidence level. Further, having a precise distribution of the performances allows for more effective budget-allocation management for subsequent phases of the design process. In Fig. 12(d,e,f) the empirical cumulative distribution functions of the performances are presented. The CDF estimate, computed with 2000 samples using a random sampling method, is also represented. The fact that the empirical CDF and the CDF estimate are very close to each other corroborates the initial statement that the unified sampling method, being a stratified sampling method, is able to provide accurate results with a reduced computational effort.
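As an illustration of how a full distribution supports budget allocation at a chosen confidence level, the sketch below builds an empirical CDF from sampled performances and reads off a percentile. The array `mass_samples` is a purely illustrative stand-in for the output of the Earth-observation system model; it is not the data behind Fig. 12.

```python
# Reading a mass budget at a chosen confidence level from sampled performances (illustrative).
import numpy as np

rng = np.random.default_rng(0)
mass_samples = 1048.0 * rng.lognormal(mean=0.0, sigma=0.05, size=200)  # placeholder model output [kg]

def empirical_cdf(samples):
    x = np.sort(samples)
    p = np.arange(1, len(x) + 1) / len(x)
    return x, p

x, p = empirical_cdf(mass_samples)
mass_at_90 = np.interp(0.90, p, x)       # mass value not exceeded with 90% confidence
print(f"90th-percentile mass budget: {mass_at_90:.1f} kg")
```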



Table 3. Settings of the design variables. *a* Intervals [0, 0.04, 0.1, 0.17, 0.25], BPA [0.4, 0.3, 0.2, 0.1]. *b* Intervals [0.2, 0.25, 0.3, 0.4], BPA [0.4, 0.35, 0.25]. *c* Intervals [0, 0.25, 0.5, 0.75, 1], BPA [0.4, 0.3, 0.2, 0.1]. *d* μ = 0, σ = 1, Min and Max are the 0.01 and 0.99 percentiles respectively. *e* σ = 1, Max is the 0.99 percentile, Min corresponds to *X* = 0.

Fig. 12. Uncertainty-analysis results

#### **7.3 Robust design and the augmented mixed hypercube approach**

Robustness is a concept that can be seen from two different perspectives, at least according to the discussion so far. One can define robustness of the system with respect to the effect of uncontrollable factors (aleatory and/or epistemic) and, if interested in obtaining a robust design, one can select that combination of controllable design-factor values that minimizes the variance while optimizing the performance. This concept was already expressed in the previous section, and it is the most common way of thinking of robust design. However, robustness can also be defined as the insensitivity of a certain design baseline to modification of the design variables in subsequent phases of the design process, thus providing an intrinsic design-baseline robustness figure. The modification of the levels of the design variables is likely to happen, especially when the baseline is at an early stage of the design process (phase 0/A). In this sense, robustness can be linked to the programmatic risk encountered when modifying a set of design parameters at later stages of the design process. In the first case, instead, robustness is more related to the operational-life risk of the system (if the uncertainties derive from the operational environment, for instance).

In both cases the Mixed Hypercube approach and the unified sampling method, together with the design techniques proposed in this chapter, provide a valuable tool in the hands of the engineering team. The sampling approaches described in this chapter are summarized in Fig. 6 and Fig. 13. When the purpose of the analysis is to study the best settings of the controllable design variables to optimize the performances while meeting the constraints, the mixed hypercube approach (see Fig. 6, in conjunction with RBSA, response surfaces and linear and interaction graphs) provides a way to answer many of the most common design questions. When the purpose of the analysis is to obtain a robust design, thus studying the settings of the controllable design factors that optimize the performances while keeping the system insensitive to uncertain factors, then the Augmented Mixed Hypercube, AMH, approach shall be used, see Fig. 13(a). For every combination of the levels of the controllable design variables, an uncertainty analysis can be executed using the unified sampling method to obtain the performance of the system, and the related statistics, due to uncertain factors. When, instead, the effect of the modification of the controllable design variables in later stages of the design process is under investigation, the general case presented in Fig. 13(b) can be implemented. The variables used to determine the baseline can be studied in the perspective of their uncertain future variation. The continuous variables are more likely to be modified, since the discrete ones commonly represent different architectures of the system (whose change usually brings more radical modifications of the design, and thus most likely high costs). However, in general a figure of robustness can be computed for each combination of discrete-factor levels. The approach in Fig. 13(b), without architectural variables, was also used for the budget-margins analysis presented in the previous section.
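A minimal sketch of the robust-design use of the AMH in Fig. 13(a) is given below: the controllable levels are enumerated in an outer loop and, for each combination, the uncertain factors are propagated with the unified sampling to collect mean and variance of the performance. The performance model, the factor levels and the mean-variance ranking are placeholders for illustration only.

```python
# Sketch of the robust-design loop of Fig. 13(a): controllable levels outside,
# unified sampling of the uncertain factors inside. Everything numeric is a placeholder.
import itertools
import numpy as np

def map_to_intervals(edges, bpa, u):
    # Same cumulative-BPA-to-interval mapping used in the unified sampling sketch above.
    edges, bpa = np.asarray(edges, float), np.asarray(bpa, float)
    cum = np.concatenate(([0.0], np.cumsum(bpa)))
    idx = np.clip(np.searchsorted(cum, u, side="right") - 1, 0, len(bpa) - 1)
    return edges[idx] + (u - cum[idx]) / bpa[idx] * (edges[idx + 1] - edges[idx])

def performance_model(design, uncertain):
    # Stand-in for the long-running system model (e.g., mass or cost).
    altitude, n_thrusters = design
    a, b = uncertain
    return 800.0 + 0.3 * altitude + 25.0 * n_thrusters + 200.0 * a + 5.0 * b

altitudes = [600.0, 900.0, 1200.0]     # controllable, continuous levels (hypercube)
thrusters = [2, 4]                     # controllable, discrete (architecture)
rng = np.random.default_rng(7)

results = {}
for design in itertools.product(altitudes, thrusters):
    u = rng.random((200, 2))                                              # uniform sample (step 1)
    a = map_to_intervals([0.0, 0.2, 0.5, 1.0], [0.4, 0.3, 0.3], u[:, 0])  # epistemic factor
    b = map_to_intervals(np.linspace(0.0, 10.0, 8),
                         [0.0480, 0.1110, 0.2106, 0.2608, 0.2106, 0.1110, 0.0480],
                         u[:, 1])                                          # stochastic factor
    perf = performance_model(design, (a, b))
    results[design] = (perf.mean(), perf.var())           # robustness figures per design point

best = min(results, key=lambda d: results[d][0] + results[d][1])  # toy mean-variance trade-off
print(best, results[best])
```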

One last remark regards the possibility of using the Augmented Mixed Hypercube for a wider search. The analysis performed with the AMH, as presented in this chapter, is restricted to the portion of the design space delimited by the variability ranges of the design variables. Sometimes a single hypercube is sufficient to cover the design space entirely; sometimes, instead, a narrower hypercube might be needed to avoid major lack-of-fit conditions. In the latter case, more than one hypercube may be implemented to study different regions of the design space as different alternative baselines of the system.

Fig. 13. Augmented Mixed Hypercube sampling procedure for robust design.

In this case, the methodologies presented in this chapter will not only support the engineering team in selecting the best configuration for each single baseline, but will also allow the baselines to be compared and traded against each other on the basis of their performances, constraint-violation conditions and robustness.

#### **8. Conclusions**


Design-space exploration is the fundamental activity with which the model of a complex system is analyzed to understand the effect of the design choices on the performance(s) and to set the values of the variables in such a way that the final product will perform as required by the customer(s). This activity often involves many stakeholders, with many objectives to be balanced, many constraints and many design variables, which makes the problem extremely difficult to solve with a non-structured approach. The purpose of this chapter was to discuss subsequent analysis steps and synthesis methodologies that could serve as a guideline for exploring the design space of complex models in a standardized and possibly more efficient way. In particular, the goal was to bring fundamental analysis techniques from the discipline domain level to the system domain level. This is done to support the decision-making process and provide a unified design approach that could be implemented in a collaborative environment, in the presence of long-running models with limited time available for the design process. For this reason, none of the proposed techniques requires any manipulation of the original model, and all of them are developed to reduce the required computational effort as much as possible.

The Augmented Mixed Hypercube approach developed and presented step-by-step in this chapter has proven to be a flexible sampling method with which many fundamental design questions can be answered. The AMH is slightly more elaborate than other conventional sampling techniques, but it allows the engineering team to gain a great deal of insight into the problem at hand, with continuous and discrete, controllable and uncontrollable design variables treated with one unified method. The final baseline of the Earth-observing satellite, for instance, was selected according to a non-conventional mission architecture for an observation satellite, i.e., quite a high orbit altitude. This choice was mostly driven by the need to balance the coverage requirement and the resolution performance, while keeping the cost down. The *risk* of obtaining conventional design baselines is just around the corner when non-structured, expert-judgment driven approaches are implemented. However, very often, especially in preliminary design phases, expert judgment is a fundamental ingredient of a good system baseline. In fact, the AMH also allows expert judgment to be taken into account with a unified epistemic-stochastic sampling approach. The Regression Based Sensitivity Analysis presented in this chapter, coupled with the AMH, allows global variance-based sensitivity indices to be computed with a reduced computational effort compared to other global sensitivity analysis methods. The great advantage of sensitivity analysis performed already at an early stage of the design process, as demonstrated with the Earth-observation mission, is that it could speed up the process itself, providing the engineering team with an *X-ray machine* that allows them to efficiently understand the effect of their design choices on the system.

We would like to close this chapter with a few final thoughts on possible implementations of the methods proposed here. In the case of low-complexity systems, when few variables are under analysis and when previous experience on similar systems is present, these methods could be used as a confirmation of the expected trends, or as a proof of the assumptions underlying the analysis. For complex and new systems, the implementation of the methods could reduce the engineering-team effort in exploring different solutions and architectures. In the cases where very experienced specialists are present within the engineering team (who would probably already have a clear picture of the priorities of the factors for the problem at hand), the standardized graphical approach could be a valid tool for them to explain thoughts and solutions. However, understanding performance trends in the presence of constraints and multiple objectives beforehand could be a non-trivial task even for them. On the other hand, the less experienced team members could benefit from the tool even with easy problems and expected behaviors, thus improving the overall design process, quality and effectiveness.

The contribution of the human factor is fundamental for obtaining a final product with a high cost/effectiveness value. With the integrated design approach presented in this chapter we do not mean to substitute for the humans in the process of designing but, quite the contrary, to better support their activities.

#### **9. Acknowledgments**

The authors are grateful to Ron Noomen (Section Astrodynamics and Space Missions, Faculty of Aerospace Engineering, Delft University of Technology) for many useful discussions on the test case presented in this chapter.

#### **10. References**



Antonov, I. & Saleev, V. (1979). An economic method of computing LPτ sequences, *USSR Computational Mathematics and Mathematical Physics* 19(1): 252–256.

Back, T., Fogel, D. & Michalewicz, Z. (2000). *Evolutionary Computation*, Vol. 1-2, Institute of Physics Publishing, Bristol.

Box, G., Hunter, W. & Hunter, J. (1979). *Statistics for Experimenters. An Introduction to Design, Data Analysis and Model Building*, Wiley, New York.

Box, G. & Wilson, K. (1951). On the experimental attainment of optimum conditions, *Journal of the Royal Statistical Society* 13(B): 1–45.

Bratley, P. & Fox, B. (1988). Implementing Sobol's quasirandom sequence generator, *ACM Transactions on Mathematical Software* 14(1): 88–100.

Sobieszczanski-Sobieski, J. & Haftka, R. T. (1995). Multidisciplinary aerospace design optimization: Survey of recent developments, *34th Aerospace Sciences Meeting and Exhibit*, Reno, Nevada. AIAA 96-0711.

Sobol, I. (1979). On the systematic search in a hypercube, *SIAM Journal on Numerical Analysis* 16(5): 790–793.

Sobol, I. M. (1993). Sensitivity analysis for nonlinear mathematical models, *Mathematical Modeling and Computational Experiment* 1: 407–414.

Sobol, I. M. (2001). Global sensitivity indices for nonlinear mathematical models and their Monte Carlo estimates, *Mathematics and Computers in Simulation* 55: 271–280.

Taguchi, G. (1987). *System of Experimental Design. Engineering Method to Optimize Quality and Minimize Costs*, UNIPUB/Kraus International Publications, New York.

Tedford, N. P. & Martins, J. R. (2006). On the common structure of MDO problems: A comparison of architectures, *11th AIAA/ISSMO Multidisciplinary Analysis and Optimization Conference*, Portsmouth, VA. AIAA 2006-7080.

Viana, F., Venter, G. & Balabanov, V. (2010). An algorithm for fast optimal Latin hypercube design of experiments, *Int. J. Numer. Meth. Engng* (82): 135–156.

Wertz, J. & Larson, W. (1999). *Space Mission Analysis and Design*, Springer, New York.

Yi, S. I., Shin, J. K. & Park, G. J. (2008). Comparison of MDO methods with mathematical examples, *Structural and Multidisciplinary Optimization* 35: 391–402.



### **Functional Analysis in Systems Engineering: Methodology and Applications**

Nicole Viola, Sabrina Corpino, Marco Fioriti and Fabrizio Stesina
*Politecnico di Torino, Italy*

#### **1. Introduction**


Functional Analysis is a fundamental tool of the design process, used to explore new concepts and define their architectures. When systems engineers design new products, they perform Functional Analysis to refine the new product's functional requirements, to map its functions to physical components, to guarantee that all necessary components are listed and that no unnecessary components are requested, and to understand the relationships between the new product's components. The chapter begins with the definition of the role of Functional Analysis in conceptual design (section 2) and then proceeds with the description of a proposed methodology (section 3 and sub-sections 3.1, 3.2 and 3.3) and with the presentation of its applications (section 4 and sub-sections 4.1, 4.2 and 4.3) at subsystem, system and system of systems level. Finally, some conclusions are drawn.

The design process, in particular the design process of complex systems, can be split into three phases (Raymer, 1999):

- conceptual design;
- preliminary design;
- detail design.
Even though Functional Analysis applies to every phase of the design process, it turns out to be particularly useful during conceptual design, when there is still a wide range of potentially feasible solutions for the future product. The valuable role of Functional Analysis consists in identifying as many available options as possible, without forgetting any ideas that may offer significant advantages. In the remainder of the chapter we refer specifically to the application of Functional Analysis during the conceptual design phase to explore complex systems.

#### **2. Functional analysis in conceptual design**

The conceptual design process is schematically illustrated in Figure 1, where the role of Functional Analysis is highlighted, as well as its interactions with all the other building blocks of the conceptual design methodology. Starting from the mission statement, the mission objectives can be derived. Once the broad goals of the system, represented by the mission objectives, have been established, the system requirements can be defined. On the basis of the system requirements, the conceptual design process evolves through the system architecture and the mission definition. The system architecture definition consists of two main tasks: Functional Analysis and System Sizing.

Fig. 1. Conceptual design process flow-chart

(Fig. 1 blocks: mission statement; mission objectives; System Requirements (Mission, Interface, Operational, Functional, Performance, Physical and Product Assurance); System Architecture Definition, comprising Functional Analysis (definition of functional tree and product tree, i.e. system and subsystems' architecture) and System Sizing (definition of the system's mass, power and thermal budgets); Mission Definition, comprising Mission Analysis (definition of the mission's phases and scenarios) and the definition of the system modes of operation; requirements-verification decision (yes/no); system design synthesis.)

Primary results of Functional Analysis are the functional tree and the product tree: the former identifies the basic functions, which the system has to be able to perform, while the latter individuates all the system physical components, which are able to carry out the basic functions. In other words, these components may be the equipments or the subsystems which constitute the whole system. Once the components of the product tree have been identified, it is possible to investigate how they are connected to form the system. It is thus possible to develop both the functional block diagram (secondary or additional result of
Functional Analysis) and the physical block diagram of each subsystem and of the whole system. In order to complete the system architecture, the definition of the system budgets (mass, electric power, thermal power budgets, etc.) has to be carried out. However, this task can be fulfilled only after the system modes of operation have been established. The modes of operation are part of the mission definition and can in their turn be set up only after the subsystems and their equipments have been identified. Once both the mission and the system architecture have been preliminarily defined, before proceeding any further with the system design synthesis it is important to verify whether or not all system requirements have been satisfied. Since the design activity is typically a process of successive refinements, several iterations may be necessary before achieving the system design synthesis, thus freezing the system design.

Iterations may occur at every stage of the conceptual design process, thus resulting in a continuous trade or refinement of system requirements. In particular, as far as functional requirements (which are part of system requirements) are concerned, their refinement is mainly due to the feedback of Functional Analysis outputs and specifically of functional tree outputs. The basic functions, i.e. the bottom level functions of the tree, are in fact used to completely define or just refine the functional requirements. Unlike system requirements, which are detailed descriptions or quantitative expressions of the system itself, taking into account what we would like to achieve and what the budget allows us to achieve, mission objectives are the broad goals that the system shall achieve to be productive. Thus, whereas system requirements are traded throughout the design process, mission objectives may be slightly or not at all modified during conceptual design. For these reasons the top level function of the functional tree, i.e. the very first step of the Functional Analysis, can either be one mission objective or one top level functional requirement.

Functional Analysis as a fundamental tool of the design process is discussed by a number of references. Wertz and Larson (Wertz & Larson, 2005) present Functional Analysis as a means to decompose the functional requirements and focus only on one single task of Functional Analysis, i.e. the functional tree. NASA (NASA, 2007) and ESA (ESA, 2009) consider Functional Analysis as the systematic process of identifying, describing, and relating the functions a system has to be able to perform in order to be successful, but do not consider it as a design tool to address how functions will be performed, i.e. to map functions to physical components. Particular emphasis is given to the possibility of capturing the technical requirements by performing Functional Analysis (ESA, 2009). In contrast, we present Functional Analysis both to define the system functional architecture, through the development first of the product tree and then of the functional block diagram, and to define or refine the functional requirements, through the accomplishment of the functional tree. The following section describes in detail the tasks of the proposed Functional Analysis methodology.

#### **3. Functional Analysis: Methodology**

Starting from the mission objectives/top level system requirements, or directly from the mission statement, Functional Analysis allows identifying the physical components, the so-called building blocks, which constitute the future product, and how they are interrelated to build up the functional architecture of the future product. Moreover, through Functional Analysis the functional requirements can be defined or at least refined.

In conceptual design Functional Analysis can be applied at different levels: subsystem level (like the avionic subsystem of an aircraft, consisting of various pieces of equipment; see sub-section 3.1), system level (like a satellite consisting of various subsystems; see sub-section 3.2) and system of systems level (like a Moon base, consisting of various systems; see sub-section 3.3). According to the considered level, the physical components or building blocks which make up the future product are therefore equipments, subsystems or systems.

Fig. 2. The Functional Analysis

(Fig. 2 blocks: functional tree → basic functions → functional requirements; functions/devices matrix → basic components; product tree; connection matrix → links between components; functional block diagram → functional architecture. The functional tree, the functions/devices matrix and the product tree constitute the main core of the Functional Analysis.)

Figure 2 shows the flow-chart of the proposed Functional Analysis methodology, illustrating all its tasks, how the various tasks are related to one another and the inputs/outputs of each task.

The tasks, which have to be accomplished in order to carry out Functional Analysis, are listed hereafter:

- functional tree;
- functions/components (or functions/devices) matrix;
- product (or physical) tree;
- connection matrix;
- functional block diagram.
On the basis of the mission objectives/top level system requirements, the functional tree has to be developed as the first step of Functional Analysis. Once the basic functions have been identified and the functional tree has therefore been completed, the functions/components matrix can be built and the basic components of the product tree can be individuated. Once the basic components have been determined, both the product tree and the connection matrix can be completed. Eventually, knowing all components (thanks to the product tree) and their relationships (thanks to the connection matrix), the functional block diagram can be fulfilled.

As highlighted in Figure 2, the main core of Functional Analysis is made up of the functional tree, the functions/devices matrix and the product tree. In fact through the functional tree and particularly through the identification of the basic functions, the functional requirements of the future product can be defined or refined, and through the product tree the building blocks of the future product can be determined, thus laying the major groundwork for the definition of the functional architecture of the future product. The functional architecture can then be completed once the relationships between the various components are clearly identified, i.e. after developing the connection matrix and the functional block diagram.

Primary outputs or objectives of Functional Analysis are therefore (see red boxes in Figure 2):

- the functional tree and, through its basic functions, the functional requirements of the future product;
- the product tree, i.e. the building blocks of the future product.


Secondary output or objective of the Functional Analysis is (see green boxes in Figure 2):

- the functional block diagram.

In the next sub-sections all tasks of Functional Analysis are considered separately and described in detail. In particular, the most important rules that have to be known to fulfil each task are given, and the procedure to be followed to move from one task to the next is explained.

#### **3.1 Functional tree**

The functional tree gives the possibility of representing a product by means of the functional view, which is alternative to the more common physical view. The functional and physical views are complementary, not opposite, views. In fact, through the functional view we look at a new product asking ourselves "what does it do?", while through the physical view we look at a new product asking ourselves "what is it?", which is without any doubt the most immediate question that arises in our mind when looking at something that is unknown. Both views are valid, as they are fundamental approaches to analyze complex systems by subdividing them into parts, characterized by a low or high level of detail, depending on the need for thoroughness and/or on the level of the analysis itself.

The functional tree allows splitting the higher level functions, which stem from the mission objectives/top level system requirements, into lower level functions, and eventually it allows identifying the basic functions that have to be performed by the future product. Higher level functions are complex functions that have to be decomposed into simpler functions, i.e. lower level functions, in order to accomplish the analysis. Therefore, starting from the so-called top level function, the functional tree generates various branches, which move from the most complex function to the basic functions, i.e. those functions at the bottom of the tree that cannot be split any further. The main output of the functional tree is therefore the identification of the basic functions through the decomposition of the higher level functions. The basic functions help define or refine the functional requirements of the future product, as each basic function can be rewritten as a functional requirement. As an example, the basic function of Figure 3 "To detect infra-red (IR) threats" can be rewritten as "The system shall be able to detect infra-red (IR) threats". Figure 3 shows an example of functional tree. The top level function is "To perform defence", particularly the defence of a military aircraft. The blue box represents the top level function, while the red boxes represent the basic functions. Starting from the top level function and getting down to the basic functions, two successive levels of function decomposition can be noted: the first level functions (green boxes in Figure 3) and the second level functions (yellow boxes in Figure 3).

Fig. 3. Example of functional tree

In order to carry out the functional tree, the next rules have to be followed:

1. Each function shall be expressed by means of a verb and a noun.
2. The definition of each function shall be as general as possible. Pursuing maximum generality when describing functions fosters the search for alternative solutions, so that no valuable option is forgotten. This fundamental rule can be satisfactorily applied at the highest levels of the functional tree. However, the lower the level of the tree, the less general the functions' definitions become: the further down the branches you go, the simpler the functions become and the more you get into the details of the analysis, thus making choices between available solutions. For example, if we look at the functional tree depicted in Figure 3, we note that the definitions of the first level functions are still very general, as they represent the logical decomposition of the top level function into the following sequence: "to get information" ("to detect the threats" in Figure 3), "to process information" (this function is included into "to respond to the threats" in Figure 3), "to provide something with that information" ("to respond to the threats" in Figure 3) and/or "to provide somebody with that information" ("to display to the crew the information" in Figure 3). Then, dropping to lower levels of the tree, we see that the basic functions refer to specific threats or counter-measures.
3. Lower level functions shall be either part of higher level functions or additional functions.
4. Lower level functions shall derive from higher level functions by asking "how" that higher level function can be performed. Therefore we move from the top to the bottom of the tree, through its various branches, asking ourselves "how". Vice versa, we move from the bottom to the top of the tree by asking ourselves "why". Looking again at the example reported in Figure 3, we may decompose the top level function "to perform defence" by asking ourselves: "how can the defence (of a military aircraft) be performed?". The answer to this question, which will thus represent the first subdivision of the top level function, may be (as shown in Figure 3): "to detect the threats" (i.e. "to get information"), "to respond to the threats" (i.e. "to process information" and "to provide something with that information") and "to display to the crew the information" (i.e. "to provide somebody with that information").
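The rules above can be mirrored in a small data structure: a functional tree stored as nested mappings, whose leaves are the basic functions and can be rewritten as functional requirements. The tree below abbreviates Figure 3; the branches not explicitly named in the text are purely illustrative.

```python
# Toy functional tree echoing Figure 3 (abbreviated): each node is a function expressed as
# verb + noun; leaves are basic functions. Branches marked "illustrative" are not from the chapter.
functional_tree = {
    "to perform defence": {                                # top level function
        "to detect the threats": {                         # first level ("how?")
            "to detect infra-red (IR) threats": {},        # basic function (leaf)
            "to detect radar threats": {},                 # illustrative
        },
        "to respond to the threats": {
            "to deploy counter-measures": {},              # illustrative
        },
        "to display to the crew the information": {},
    }
}

def basic_functions(tree):
    """Collect the leaves of the functional tree (the basic functions)."""
    leaves = []
    for function, subtree in tree.items():
        leaves.extend(basic_functions(subtree) if subtree else [function])
    return leaves

def functional_requirements(tree):
    """Rewrite each basic function as a functional requirement."""
    return [f"The system shall be able {f}." for f in basic_functions(tree)]

for requirement in functional_requirements(functional_tree):
    print(requirement)   # e.g. "The system shall be able to detect infra-red (IR) threats."
```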


Finally, getting back to the comparison between the functional and physical views, the main advantages/criticalities of the functional view (i.e. the typical approach of the functional tree) are reported hereafter.

The most significant advantages can be summarised as follows:

- the development of the functional tree, starting from mission objectives/top level system requirements, implies a thorough analysis of the mission objectives/top level system requirements themselves. This guarantees that the product, defined on the basis of the functional tree, meets all the customer's needs, which is particularly important if we remember that the functional tree is a design tool, useful to develop a new product. It is worth remembering here that, when we carry out the functional tree, we know very little about the new product: we just know the mission objectives and typically we have a preliminary draft of the system requirements, but we ignore all the elements that will constitute the new product. Thanks to the functions/devices matrix and then to the product tree, we will be able to say what elements will constitute the new product;
- the abstract view, typical of the functional tree, fosters the search for alternative solutions, thus avoiding biased choices;
- the functional view is absolutely coherent with the systems engineering view, which looks at the system as the integration of various elements.

The most significant criticalities can be summarised as follows:

- starting from the same mission objective/top level system requirement, different functional trees can be developed, depending on the people working at them and on the envisaged solutions. It is clear therefore that carrying out a functional tree is a typical design activity, which requires the widespread knowledge of the systems engineering designer, whose mind is not confined to any specific discipline but can embrace the whole multi-disciplinary system as an integration of various parts;
- as typically the available options may be many, the main risk resides in the possibility of forgetting some concepts that may offer significant advantages for the future product.
#### **3.2 Functions/devices matrix and product tree**

Once the basic functions have been identified, it is possible to choose the components that will perform those functions by means of the functions/components (or functions/devices) matrix. The functions/components matrix is therefore used to map functions to physical components.

The functions/components matrix can be built simply by matching the bottom of the functional tree, consisting of all basic functions, with one column of components able to perform those functions. Starting from the column containing the first basic function under consideration, the component able to perform that function can be defined by simply answering the question: "which component is able to perform this function?". This component can then be written down in the first row of the column of devices. The same process applies to all basic functions. Starting from the analysis of the first basic function, new components progressively fill in the column of devices. Eventually all basic components are determined. Table 1 shows a possible solution for the functions/devices matrix related to the functional tree illustrated in Figure 3. Following the procedure reported above, we take into account the first basic function on the left hand side of the functions/devices matrix, "to detect infra-red (IR) threats". If we ask ourselves which component or, better, which equipment is able to perform this function, we may answer that both the missile warning receiver and the infra-red (IR) warning receiver are able to fulfil the task. Then we write down both equipments in two separate rows of the functions/devices matrix and tick the intersections between these rows and the column of the basic function under consideration. Applying the same procedure, we gradually complete the functions/devices matrix, thus identifying all basic equipments.
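A sketch of the functions/devices matrix as a simple mapping is given below: the first entry echoes the example just discussed, the remaining entries are illustrative. Collapsing the mapping yields the basic components that populate the product tree.

```python
# Functions/devices matrix as a mapping from basic functions to the components able to
# perform them. Only the first entry follows the chapter's example; the rest is illustrative.
functions_devices = {
    "to detect infra-red (IR) threats": ["missile warning receiver", "IR warning receiver"],
    "to detect radar threats": ["radar warning receiver"],                   # illustrative
    "to display to the crew the information": ["multi-function display"],    # illustrative
}

# The basic components (building blocks of the product tree) are the union of all ticked devices.
basic_components = sorted({device for devices in functions_devices.values() for device in devices})
print(basic_components)
```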

Looking at Table 1 and remembering the logical decomposition of the top level function reported in Figure 3 ("to get information": "to detect the threats", "to process information": "to respond to the threats", "to provide something with that information": "to respond to the threats" and "to provide somebody with that information": "to display to the crew the information"), we note that:

Table 1. Example of functions/devices matrix

Thanks to the functions/devices matrix we now know the basic components or building blocks which constitute the future product. By simply grouping together the basic components, the product or physical tree of the new product can be generated. Unlike the functional tree, which has a typical top-down approach, the development of the product tree follows a straightforward bottom-up process. As we know, according to the considered level, i.e. subsystem, system or system of systems level, the building blocks are respectively equipments, subsystems or systems. In case, for instance, the building blocks are equipments, they may be grouped into subsystems to form the whole system or, better, the product tree of the whole system, as illustrated in Figure 4.

Fig. 4. Product tree and costs/functions matrix

In particular Figure 4 also shows the so-called functions/costs matrix, which is exactly the same as the functions/devices matrix except for the fact that here there is a column of costs instead of a column of devices. Quite obviously the functions/costs matrix can be built only after the functions/devices matrix, i.e. once the basic components have been identified. Main difference of the functions/costs matrix with respect to the functions/devices matrix lies in the consideration of the assembly cost. In fact, apart from the cost of each single basic component, the cost due to the assembly has to be taken into account, in order to estimate the cost of each single function and consequently the cost of the whole product.
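The role of the assembly cost can be sketched as follows; the cost figures and the equal split of the assembly cost among the functions are illustrative assumptions, not the chapter's costing rule.

```python
# Sketch of a functions/costs matrix: each function's cost sums the costs of the devices that
# perform it plus a share of the assembly cost. All figures are placeholders.
device_cost = {
    "missile warning receiver": 120.0,       # illustrative unit costs (arbitrary units)
    "IR warning receiver": 90.0,
    "multi-function display": 60.0,
}
assembly_cost = 45.0                          # cost of integrating the devices into the product

functions_devices = {
    "to detect infra-red (IR) threats": ["missile warning receiver", "IR warning receiver"],
    "to display to the crew the information": ["multi-function display"],
}

share = assembly_cost / len(functions_devices)                 # naive equal split (assumption)
function_cost = {f: sum(device_cost[d] for d in devs) + share
                 for f, devs in functions_devices.items()}
total_product_cost = sum(device_cost.values()) + assembly_cost

print(function_cost)
print(total_product_cost)
```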

#### **3.3 Connection matrix and functional block diagram**


Once the basic components have been identified, the links between the various components within the system can be determined. This goal is achieved by means of the connection matrix, which, as the name implies, highlights the connections between all building blocks.

Fig. 5. Example of connection matrix

The connection matrix can either be a triangular (see Figure 5) or a square matrix, where both rows and columns contain the same basic components. Starting from the first row and then proceeding down the column of basic devices, all components have to be analyzed in order to understand whether or not there are connections between them. In case two components have a connection because, for instance, they are requested to exchange information, then the box where the two components intersect has to be ticked. As we can see, for example, all boxes where sensors (highlighted in red in Figure 5) and displays (highlighted in green in Figure 5) intersect have been ticked, to show that sensors and displays exchange information.
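A connection matrix can be sketched as a symmetric boolean table over the basic components, ticking, as in the example of Figure 5, the intersections between sensors and displays; the component list is partly illustrative.

```python
# Connection matrix as a symmetric boolean table over the basic components (illustrative).
import numpy as np

components = ["missile warning receiver", "IR warning receiver",   # sensors
              "processor", "multi-function display"]               # processor/display (illustrative)
n = len(components)
conn = np.zeros((n, n), dtype=bool)

def connect(a, b):
    i, j = components.index(a), components.index(b)
    conn[i, j] = conn[j, i] = True      # keep the matrix symmetric

for sensor in ("missile warning receiver", "IR warning receiver"):
    connect(sensor, "multi-function display")   # sensors and displays exchange information
    connect(sensor, "processor")                # illustrative additional link

print(conn.astype(int))
```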

Fig. 6. The functional block diagram

Fig. 7. The physical block diagram


It is worth underlining that nothing is said explicitly about the nature of the connections. For instance, in Figure 5, which shows a possible solution for the connection matrix related to the functional tree of Figure 3 and to the functions/devices matrix of Table 1, the type of connection between all equipments is a pure information signal between sensors, processors, passive or active counter-measures and displays.

A different representation of the same concept, expressed by the connection matrix, is obtained through the functional block diagram, where building blocks that need to be connected are linked through point-to-point connections. If these connections are arrows and not just simple lines, the functional block diagram provides the reader with additional information compared to the connection matrix, as it highlights not merely the connections but also their direction, i.e. whether they are half duplex or full duplex connections. Just by looking at a functional block diagram, it is therefore possible to understand that, for instance, sensors are transmitting information to displays and not vice versa. As in the connection matrix, in the functional block diagram nothing is said about the nature of the connections, which may be, for instance, power, signal or fluid lines. This information is instead provided by the physical block diagram, which may be considered as complementary to the functional block diagram.
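A hedged sketch of how direction information turns the connection matrix into a functional block diagram: the link list below is hypothetical, and the duplex check simply looks for arrows in one or both directions.

```python
# Sketch of a functional block diagram derived from the connection matrix by
# adding a direction to every connection; component names are illustrative.
from typing import List, Tuple

# (source, destination) pairs: an arrow from source to destination.
links: List[Tuple[str, str]] = [
    ("sensor", "processor"),
    ("processor", "display"),
    ("processor", "active counter-measure"),
    ("display", "processor"),   # reverse arrow also present -> full duplex
]

def duplex(a: str, b: str) -> str:
    """Full duplex if arrows exist in both directions, half duplex otherwise."""
    forward = (a, b) in links
    backward = (b, a) in links
    if forward and backward:
        return "full duplex"
    if forward or backward:
        return "half duplex"
    return "no connection"

print(duplex("sensor", "processor"))    # half duplex: sensor -> processor only
print(duplex("processor", "display"))   # full duplex in this example
```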

Figure 6 shows an example of a functional block diagram for a complex system, consisting of various subsystems. This system, named Permanent Habitable Module (PHM) (Viola et al., 2007), is the first module of a permanent future human settlement on the Moon, designed to sustain the presence of three astronauts on the lunar surface. All main subsystems are highlighted in different boxes and the connections between them are shown. For the sake of clarity, Figure 7 illustrates the physical block diagram of the same system presented in Figure 6. Four different types of connections between the building blocks have been envisaged: structures (green lines in Figure 7), power (red lines), signal (black lines) and fluid lines (blue lines).

Structures guarantee, specifically by means of secondary and tertiary structures, the anchorage of all subsystems, and particularly of all their equipments, to the primary structure. A power line supplies the various building blocks with the necessary power. As far as the signal lines are concerned, it is worth noting that, unlike the functional block diagram where there are point-to-point connections, in the physical block diagram there is a main bus to transmit commands and receive feedback to/from the various subsystems. Finally, the building blocks that need an active cooling interface to dissipate heat are connected to the Thermal Control Subsystem by a fluid line.

In the next section three different applications of the Functional Analysis methodology are presented and discussed.

#### **4. Functional Analysis: Applications**

As the Functional Analysis can be applied at different levels, three different examples of applications of the methodology are presented in the following sub-sections: the avionic system of a Very Light Business Jet aircraft (subsystem level), the cubesat e-st@r (system level) and the permanent human Moon base PHOEBE (system of systems level).


#### **4.1 Functional Analysis at subsystem level: The avionic system of a Very Light Business Jet aircraft**

This sub-section deals with the application of the proposed Functional Analysis methodology at subsystem level to define the avionic system of a Very Light Business Jet (VLBJ). The VLBJ segment consists of civil transport jet-powered aircraft with a maximum takeoff weight ranging from 2 to 4.2 tons, a cruise speed of about 600 – 700 km/h and a payload capability varying from 4 to 8 passengers. The VLBJ avionics has been chosen as a useful example because of its new functionalities and characteristics, which are not implemented in the avionic systems of other civil aircraft. In fact, VLBJs are designed to be certified for single-pilot operations. This is made possible by advanced avionics automation, functional integration and easy-to-use capability.

Considering the aircraft mission profile, the environment where the aircraft will have to operate (air traffic control, landing and takeoff aids, navigation aids) and passengers and pilot requirements, the following macro-functions can be identified:

- to allow navigation;
- to perform flight controls;
- to allow communications.

For the sake of simplicity, only the macro-function "to allow navigation" will be dealt with here, in terms of functional tree and functions/devices matrix.

The complete functional decomposition of the top level function "to allow navigation" is reported hereafter.

#### **1. To allow navigation**

#### **1.1 To acquire data**

	- **1.1.1** To identify weather situation
	- **1.1.2** To detect magnetic field
	- **1.1.3** To acquire surrounding terrain altitude
	- **1.1.4** To acquire airplane data
	- **1.1.5** To acquire airport data
	- **1.1.6** To acquire flight plan data
		- **1.1.6.1.1** To acquire data about navigation aids (VOR-DME) ground station
			- **1.1.6.1.1.1** To memorize waypoints (VOR-DME stations)
			- **1.1.6.1.1.2** To acquire radial and distance
			- **1.1.6.1.1.3** To calculate flight coordinates
		- **1.1.6.1.2** To acquire data for autonomous navigation
			- **1.1.6.1.2.1** To memorize waypoints coordinates
			- **1.1.6.1.2.2** To calculate flight coordinates
		- **1.1.6.2** To acquire climb, descent and approach trajectory
		- **1.1.6.3** To acquire landing path
		- **1.1.6.4** To acquire different approach trajectory
		- **1.1.6.5** To acquire missed approach procedure
		- **1.1.6.6** To acquire holding procedure

#### **1.2 Data processing**


#### **1.3 Data management**


#### **1.4 To display information**


On the basis of the basic functions listed above, the functions/devices matrix can be created, as shown in Table 2, which, for the sake of simplicity, illustrates only part of the complete functions/devices matrix related to the top level function "to allow navigation". It is worth remembering that, as in this case the Functional Analysis is applied at subsystem level, the basic components are the main subsystem equipments. The functions/devices matrix has thus been called the functions/equipments matrix.
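As a hedged illustration, the sketch below encodes a few rows of such a functions/equipments matrix. The equipment names are those recoverable from Table 2, but the individual allocations shown are assumptions, not the table's actual ticks.

```python
# Hedged sketch of a functions/equipments matrix for the "to allow navigation"
# branch; the allocations below are illustrative placeholders.
functions_equipments = {
    "to identify weather situation": ["Weather Radar System (WX)"],
    "to detect magnetic field": ["ADAHRS"],
    "to memorize waypoints (VOR-DME stations)": ["Flight Management System (FMS)"],
    "to acquire radial and distance": ["VHF Omni Range (VOR)",
                                       "Distance Measurement Equipment (DME)"],
    "to calculate flight coordinates": ["Flight Computer"],
}

# A basic check the matrix supports: every basic function must be covered by
# at least one piece of equipment, otherwise the design is incomplete.
uncovered = [f for f, eq in functions_equipments.items() if not eq]
assert not uncovered, f"functions with no equipment allocated: {uncovered}"

# The column of equipments (the future product-tree leaves) is the union of all cells.
equipments = sorted({e for eq in functions_equipments.values() for e in eq})
print(equipments)
```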

Finally, Figure 8 illustrates the functional block diagram of the complete avionic system, where both half duplex and full duplex connections between equipments are highlighted.


[Table 2 maps the basic functions of the "to allow navigation" branch to the basic equipment of the VLBJ avionic system, including the Flight Management System (FMS), the Flight Computer, the ADAHRS (ADS + AHRS + Magnetometer), the Synthetic Vision System, the Weather Radar System (WX), the Automatic Direction Finder (ADF), the VHF Omni Range (VOR) and the Distance Measurement Equipment (DME).]

Table 2. Part of the complete functions/equipments matrix

Fig. 8. VLBJ avionics functional block diagram

#### **4.2 Functional Analysis at system level: The cubesat e-st@r**

In this sub-section an example of the methodology is given by the application of the Functional Analysis to a Cubesat project. The e-st@r (Educational SaTellite @ politecnico di toRino) program is taken as a case study. The project is an educational initiative carried out by students and researchers of Politecnico di Torino within an ESA program aiming at the launch and orbit operations of nine cubesats, developed by as many European universities, to promote space activities among the young generations. E-st@r program guidelines are illustrated in Figure 9.

The mission statement reads as follows: "*Educate aerospace-engineering students on complex systems design and management, team work, and standards implementation. Achieve insight in the development of enabling technologies for low-cost access to space"*.

Fig. 9. e-st@r program guidelines

The following assumptions can be derived from the mission statement:

- The program shall be carried out by students. They shall design, manufacture, verify and test, and operate a space system.
- The program shall have educational relevance, which means that students must learn by practice.
- The program driver shall be the research for low-cost solutions in design, manufacture, operations and management of space systems.
- The program shall be carried out in compliance with current regulations and applicable standards.
- An experiment shall be included in the space system. The experiment shall be simple and cheap, but at the same time it must permit to achieve insight in a discipline and/or technology to be used in the future to allow low-cost space missions.

Notwithstanding the necessity of keeping costs down and taking into account the educational spirit of the e-st@r program, which implies the will to enhance the interests and competences of the students, e-st@r also has scientific objectives, which reflect real interests of the scientific and industrial communities. Taking into account all high-level requirements and constraints, as a result of a trade-off analysis it has been decided that the system would accomplish a mission aimed at testing an active Attitude Determination and Control System (ADCS).

In conclusion, the mission scenario can be summed up as follows: a cubesat shall be inserted into a LEO by the end of 2012. The cubesat shall be piggybacked by the Vega LV during its Maiden Flight. Mission duration for this kind of project shall be in the range of 3-12 months. The cubesat shall be operated from ground in a simple and cheap way. A high grade of operations autonomy is desirable. Students shall be designers, developers, manufacturers, operators and managers of the entire mission. The mission shall demonstrate some kind of non-space technologies and try to space-qualify them. The primary payload shall be a simple active ADCS. As secondary payload, the test of commercial items is considered. The mission data shall be available to the cubesat community and to the radio-amateur union. No commercial purposes shall be pursued.

Functional Analysis methodology has been used to derive the requirements for the system and to determine which subsystems are needed to carry out the mission. The second iteration of the Functional Analysis allows deriving next level requirements for equipments and components. A part of the complete functional tree for the e-st@r mission is shown in Figure 10. The mission segments are identified by the first level functions (i.e. "To connect ground and space segments", "To do on ground operations", "To reach the orbit", "To do in orbit operations" and "To comply with space debris mitigation regulations") and they reflect the mission architecture's elements.

The elements of the e-st@r mission architecture are reported hereafter:

- Subject: data measurement.
- Space segment: Cubesat = payload and bus.
- Launch segment: Vega LV and CSG (French Guyana).
- Ground segment: one main ground control station + one backup ground control station (mobile and transportable); radio amateur network; Cubesat laboratory at Polito.


Fig. 10. Part of the complete functional tree for e-st@r mission

As an example, the product tree of two elements of the e-st@r mission architecture, the space segment (i.e. the cubesat, made up of payload and bus) and the ground segment, is shown in Figure 11. It is worth noting that, while the space segment can be directly linked through a functions/devices matrix to the first level function "To do in orbit operations" (see Figure 10), the ground segment can be directly linked to the first level function "To do on ground operations" (see Figure 10). Finally, Figure 12 illustrates the physical block diagram of the cubesat. The block diagram shows all subsystems (apart from structures) and their connections. The design and sizing of the subsystems in phase A have been carried out using commonly available methods (Wertz & Larson, 2005), (Fortescue et al., 2003).

Fig. 11. Product tree of the e-st@r system: the ground and the space segment

Fig. 12. Physical block diagram of the e-st@r cubesat

#### **4.3 Functional Analysis at system of systems level: the permanent human Moon base PHOEBE**

The system of systems here considered is a permanent human Moon base. The Functional Analysis methodology has been applied in order to accomplish the primary objectives, i.e. to develop the functional tree and the product tree of the Moon base.


The Moon base has been given the name PHOEBE, which stands for Permanent Human mOon Exploration BasE (Viola et al., 2008).

The mission statement is reported hereafter: "*To establish a permanent lunar base for a nominal crew of 18 astronauts (maximum 24 during crew rotation) with a turnover time of 6 months, to support scientific research, In-Situ Resources Utilization (ISRU) development, surface exploration and commercial exploitation; its evolution will provide an outpost for further space exploration".*

After the definition of the mission statement, nine mission objectives have been determined. The main top level system requirements are schematically represented in Figure 13, where they can be traced back to their corresponding mission objectives.

Once the top level system requirements, which stem from the mission statement and mission objectives, have been defined, the design process has proceeded with the accomplishment of the Functional Analysis, in order to determine all building blocks, i.e. the systems or modules, of the Permanent Human Moon Base that satisfy the top level system requirements. Main results of the Functional Analysis are presented hereafter. In particular Figure 14 illustrates the so-called "first level" functional tree, where the top level function "To carry out a Permanent Human Moon Base" has been split into 10 first level functions. Each first level function has then been divided into lower level functions to identify the basic level functions, i.e. those functions that can immediately be connected to one building block of the Moon base.
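A functional tree lends itself naturally to a nested representation. The sketch below assumes the single branch reported further below in Figure 15 (the other first level functions are omitted) and shows how the basic (leaf) functions can be extracted for the subsequent functions/devices matrix.

```python
# Sketch of a functional tree as a nested mapping; only one branch is filled in,
# the grouping of its sub-functions is an assumption made for illustration.
functional_tree = {
    "To carry out a Permanent Human Moon Base": {
        "To reach progressively independence from Earth": {
            "To extract and convert water ice": {},
            "To provide the electrical power": {},
            "To provide plants growth facilities": {},
            "To extract resources from the waste": {},
        },
        # ... the other first level functions would be decomposed in the same way
    }
}

def basic_functions(tree: dict) -> list:
    """Return the leaves of the tree, i.e. the basic functions that can be
    connected directly to one building block of the Moon base."""
    leaves = []
    for function, children in tree.items():
        if children:
            leaves.extend(basic_functions(children))
        else:
            leaves.append(function)
    return leaves

print(basic_functions(functional_tree))
```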

Fig. 13. Mission objectives and top level system requirements

Fig. 14. PHOEBE first level functional tree [the first level functions include: to support astronauts' life, to support surface operations, to support surface exploration, to provide transportation systems, to reach progressively independence from Earth, to ensure continuous communications, to perform ISRU activities, to provide scientific facilities, to support economic exploitation of the Moon, and the maintenance of fixed and mobile elements]

For the sake of clarity, Figure 15 shows how the first level function "To reach progressively independence from Earth" has been decomposed into its basic functions:

- to extract and convert water ice;
- to retrieve TBD (To Be Defined) consumables;
- to store the retrieved consumables;
- to provide the electrical power;
- to provide plants growth facilities;
- to store the food produced;
- to extract resources from the waste.

Fig. 15. Functional tree of the first level function: "To reach progressively independence from Earth"


Once the basic functions have been identified, it is possible to choose the building blocks of the Moon base that will perform those functions. Considering for instance the basic functions presented in Figure 15, the corresponding building blocks can be obtained through the functions/building blocks (or functions/devices) matrix (see Table 3).


Table 3. PHOEBE: example of functions/building blocks matrix

As addressed in Table 3, six building blocks have been identified:


Applying the same methodology to all the other first level functions listed in Figure 14, the complete product tree of the Moon base, i.e. all PHOEBE systems, can be obtained, as Figure 16 illustrates, where the various building blocks have been grouped into four different categories or segments: the transportation, in-space, mobile and fixed segments.
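The bottom-up grouping into segments can be sketched as follows; the building-block names are hypothetical placeholders, and only the four segment names come from the text.

```python
# Illustrative sketch of the bottom-up product-tree step: building blocks are
# grouped into the four PHOEBE segments named in the text.
from collections import defaultdict

building_blocks = {
    "surface rover": "mobile segment",              # hypothetical block
    "pressurised habitat module": "fixed segment",  # hypothetical block
    "power plant": "fixed segment",                 # hypothetical block
    "cargo lander": "transportation segment",       # hypothetical block
    "communications relay": "in-space segment",     # hypothetical block
}

def product_tree(blocks: dict) -> dict:
    """Group every building block under its segment (bottom-up grouping)."""
    tree = defaultdict(list)
    for block, segment in blocks.items():
        tree[segment].append(block)
    return dict(tree)

for segment, blocks in product_tree(building_blocks).items():
    print(segment, "->", ", ".join(blocks))
```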

The identification of all systems of the Moon base and the understanding of their related functions are the final results of the presented Functional Analysis at system of systems level.

Fig. 16. PHOEBE product tree

#### **5. Conclusion**


The Functional Analysis is without any doubt one of the most fundamental tools of systems engineering design for developing a new product, as it guarantees a thorough analysis of the requirements, it fosters the search for alternative solutions, thus avoiding or at least limiting the risk of forgetting valuable options, and finally it allows the identification of the physical components of the future product and their relationships. It is therefore of paramount importance for every systems engineer to learn how to apply Functional Analysis to explore new concepts and then satisfactorily come out with innovative architectures.

After a brief introduction to underline the precious role of Functional Analysis within the conceptual design process, the chapter describes in detail all the steps that have to be taken and all the rules that have to be followed to accomplish Functional Analysis. Finally, the chapter presents three different applications of the methodology at subsystem, system and system of systems level.

#### **6. Acknowledgment**

The authors wish to thank all students that have worked at the e-st@r program and all students that have attended the SEEDS (SpacE Exploration and Development Systems) Master.

#### **7. References**

Viola, N., Messidoro, P. & Vallerani, E. (2007). Overview of the first year activities of the SEEDS Project Work, *Proceedings of 58th International Astronautical Congress*, Hyderabad, India, 24-28 September 2007

Viola, N., Vallerani, E., Messidoro, P. & Ferro, C. (2008). Main results of a permanent human Moon base Project Work activity 2006-2007, *Proceedings of 59th International Astronautical Congress*, Glasgow, Scotland, 29 September – 3 October, 2008

## **A Safety Engineering Perspective**

Derek Fowler and Ronald Pierce
*JDF Consultancy LLP, UK*

#### **1. Introduction**


Safety is a viewpoint. By this we mean that safety is not in itself an attribute of a system but is a property that depends on other attributes and on the context in which the system is used. The question that arises (and which we will attempt to answer in some detail) is which attributes of a system determine whether it is safe or not, in its context of use.

Throughout this chapter, the term *system* is used in the widest sense - ie it includes not just the technical elements (equipment) but also all other elements - eg the human-operator and operational procedures - that necessarily make up the complete, end-to-end system.

We will start by attempting to dispel what seems to be a not-infrequent misconception (also reflected in some safety standards) that safety is mainly dependent on reliability (and / or integrity, depending on one's definition of the terms - see section 3.1 below). This we feel is important for those readers who may have had some previous exposure to areas of safety engineering in which this view is held, and will lead to the inescapable conclusion that we need a broader view of system safety than is sometimes taken.

Next we will establish some basic safety concepts, firstly by defining key terms, and then by considering two distinct categories of safety-related system and seeing how the system properties determine safety in each case.

Finally, and for the most part of this chapter, we will explain how the broader approach to safety works and show that it is closely linked with (not 'special and different' from) systems engineering in general.

#### **2. "Safety is reliability" – Dispelling a myth**

[Leveson, 2001], in a review of major software-related accidents and the implications for software reliability, presents compelling evidence that software reliability had never been the cause of such disasters - on the contrary, in every case investigated, the software had performed in exactly the manner that it was designed to. The problem was that the software was designed to do the wrong thing for the circumstances under which it "failed" or, as in the case of Ariane V for example, was used for a purpose (ie in a context - see above) different from that for which it was originally designed. Professor Leveson quite rightly, therefore, poses the question as to why, in most software safety standards, so much emphasis is placed on processes to improve software reliability whilst not also ensuring that the resulting systems actually perform the intended function – ie allowing them to be what one might call "reliably unsafe". This same misconception is prevalent also at the system level in, for example, European Air Traffic Management (ATM) - see [Fowler, 2007].

We can illustrate the problem by considering the simple, everyday example of a car airbag for which, for the sake of this discussion, we wish to make a case that it would be safe.

If we were to simply follow a failure-based process - ie focus on how reliable the airbag needs to be in order to be 'safe' - we would start (at the wrong point, as we will see shortly) by identifying the hazards1 presented by the airbag. Such hazards are those caused by the airbag's two main failure modes: failure to operate when required, and operating when not required. We would then use a risk classification scheme to derive safety requirements that specify the maximum frequency with which those failures could be allowed to occur and from that we would deduce more detailed safety requirements which limit the frequency with which the causes of the hazards could be allowed to occur.

Even if the results were valid, they would lead us only to:

- an understanding of how reliable the airbag needs to be - so that it operates when required; this would, however, not give any assurance that, when it did operate, the airbag would actually protect the front-seat occupants from death or serious injury in the event of a collision, and
- the totally **irrational** conclusion that putting an airbag in a car would only increase the risk of death or serious injury to the front-seat occupants, because of the finite (albeit small) possibility that it would operate when not intended to!

Of course, what is missing is any evidence of a positive safety contribution from the airbag - only the possibility of actually being killed / seriously injured by it - without which we would have no case for fitting one.

If instead we were to take a more **rational** view, we would start from the position that in the event of, say, a head-on collision without an airbag there is a very high risk of death or serious injury to the driver (and other front-seat occupant(s)) of a car. This risk we can call *pre-existing* because, by definition, it is inherent in driving and has nothing whatsoever to do with the airbag – indeed it is to mitigate this risk that we are intending to fit the airbag in the first place. So, the more rational approach would be to:

- firstly assess how effective the airbag would be when it did work – ie by how much the *pre-existing* risk from driving would be reduced by the airbag - and what properties of the airbag determine the amount of this reduction; and
- then assess the *system-generated* risk, induced by airbag failure.

Thus, given the correct set of functional properties – eg shape, location, strength, compressibility, sensitivity to 'g' forces, speed of deployment etc – as well as adequate reliability and integrity, our safety case should show that the airbag would make a positive contribution to the reduction in the identified pre-existing risk that is very much greater than the system-generated risk due to airbag failure. This would be a much more balanced, and rational conclusion than what emerged above from considering only airbag failure.

<sup>1</sup> A state of a system that could lead to an accident - see section 3.1.

#### **3. System safety – Basic concepts**

#### **3.1 Definitions**


This section defines the safety terms that are used in the rest of this chapter. In most cases, as there is no single, universally accepted definition, the ones given below have been adapted from those in Part 4 of the international functional-safety standard IEC 61508 [IEC, 2010] and, if not actually used in, should at least be understood in any safety-related sector.

	- *Failure* – termination of the ability of a functional unit to provide a required function, or operation of a functional unit in any way other than as required

#### **3.2 Risk acceptability**

#### **3.2.1 The ALARP principle**

Risk may, in extremis, be either so great that it would be intolerable under any circumstances or so small as to be insignificant and therefore may be discounted altogether. In practice, however, risk will usually fall somewhere between these two extremes and the ALARP principle requires that any risk shall be reduced to a level that is as low as reasonably practicable, bearing in mind two factors: the benefits resulting from its acceptance, and the costs of any further reduction. ALARP is described in more detail in IEC 61508 [IEC, 2010], Part 5, Annex C; other standards and practices use different acronyms such as ALARA (USA) and SFAIRP (Aus). In the UK, the ALARP principle has a specific legal connotation and expert advice should be sought before applying it! [Ladkin, 2008].

A practical way of specifying what is *tolerable* risk, and in some cases applying the ALARP principle, either qualitatively or quantitatively, is the so-called Risk Classification Scheme (also known as a Hazard-Risk Index).

#### **3.2.2 Risk Classification Schemes**

Risk Classification Schemes (RCSs) are used in a number of industry sectors. Their form is as variable as their usage but a typical example, from the ATM sector [EUROCONTROL 2010], is shown in Figure 1.

<sup>2</sup> Whether or not a hazardous event results in harm depends on whether people, property or the environment are exposed to the consequence of the hazardous event and, in the case of harm to people, whether any such exposed people can escape the consequences of the event after it has occurred


An RCS is typically set out as a matrix, in which the severity of possible outcomes is mapped against the frequency with which the outcomes might occur.
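The way such a scheme is used can be sketched as a simple lookup. The grid below is an invented placeholder with the same shape as Figure 1 (four severity classes, six frequency bands, risk classes A to D); it does not reproduce the actual cell values of the EUROCONTROL scheme.

```python
# Hedged sketch of a Risk Classification Scheme lookup; the cell values are
# assumptions chosen only to illustrate the mechanism.
SEVERITY_CLASSES = (1, 2, 3, 4)            # 1 = accident ... 4 = lowest safety-significant incident
FREQUENCY_BANDS = ("frequent", "probable", "occasional",
                   "remote", "improbable", "extremely improbable")

# RISK_CLASS[frequency][severity] -> 'A' (intolerable) ... 'D' (broadly acceptable)
RISK_CLASS = {
    "frequent":             {1: "A", 2: "A", 3: "A", 4: "B"},
    "probable":             {1: "A", 2: "A", 3: "B", 4: "C"},
    "occasional":           {1: "A", 2: "A", 3: "C", 4: "D"},
    "remote":               {1: "A", 2: "B", 3: "D", 4: "D"},
    "improbable":           {1: "A", 2: "C", 3: "D", 4: "D"},
    "extremely improbable": {1: "B", 2: "D", 3: "D", 4: "D"},
}

def classify(frequency: str, severity: int) -> str:
    """Return the risk class for an outcome of given severity and frequency."""
    return RISK_CLASS[frequency][severity]

print(classify("remote", 1))   # 'A' in this illustrative grid: intolerable
print(classify("remote", 4))   # 'D': broadly acceptable
```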

Fig. 1. A Risk Classification Scheme [a matrix of outcome severity (Classes 1 to 4) against frequency of occurrence - Frequent (P>10-4 per flight hour), Probable, Occasional, Remote, Improbable and Extremely Improbable (P<10-8 per flight hour) - with each cell assigned a Risk Class from A to D]

In this ATM example, the **severity** of outcome ranges from Class 1 (an accident involving death and/or serious injury3) to Class 4 (the lowest level of safety-significant incident) and, in practice, would be explained by detailed descriptions and illustrative examples. The **frequency** of outcome is shown both qualitatively and quantitatively, as the probability per flight hour (or per operating hour). The grid is populated with the tolerability / acceptability of the risk and, in this example, includes the ALARP principle4 as follows:

- Risk Class A is defined as intolerable
- Risk Class B is tolerable only if risk reduction is impracticable or its cost is grossly disproportionate to the improvement
- Risk Class C is tolerable if the cost of risk reduction would exceed the benefit of the improvement
- Risk Class D is defined as broadly acceptable

An RCS should be tailored to the purpose and function of the system or service concerned and the risks and benefits involved. It would normally be published in the Safety Management System for the organisation responsible for the operation, and be approved by the relevant safety-regulatory authority. It can be used in one of two ways:

- for safety monitoring of an on-going operation (see section 6.5 below) - in this case the achieved risk can be compared with what is tolerable / acceptable according to the RCS, and / or
- for *a priori* safety assessment - in this case the severity of each potential outcome is assessed and the required maximum frequency of occurrence, in order to achieve an acceptable (or at least tolerable) level of risk, is obtained from the RCS.

<sup>3</sup> In European ATM, it is not usual practice for accidents to be 'graded' by the number of people killed and/or seriously injured.

<sup>4</sup> If the ALARP principle were not applied to the RCS, the red boxes (labelled 'A') might remain the same as in Figure 2 but the rest of the grid would show only what was tolerable.


One of the main attractions of RCSs is that they are relatively simple to use - and therein lies a potential problem, unless each user is careful to check the following:


The significance of some, if not all, of these 'health warnings' should become more apparent in the subsequent sections of this chapter.

#### **3.3 Safety-related systems and their properties**

#### **3.3.1 Safety-related systems – General**

Consider the two types of safety-related system (SRS) shown in Figure 2. Case (a) is a system - say, a complete nuclear power plant - which simply provides a service into its operational environment. Because the service in the case of a nuclear power plant is the provision of electrical power then, from a purely safety viewpoint, we do not care whether the service is provided or not. What we do care about are the hazards (eg radiation leakage), and the related level of risk, that a failure internal to the system might present to its operational environment (and to the people therein).

Case (b) is quite different. Here we have, first of all, a set of hazards that already exist in the operational environment. If, for example, the System was our car airbag (see section 2 above) then these hazards (and associated risks) would be those (*pre-existing*) hazards / risks inherent in driving a car, and the operational environment (from the airbag's perspective) would be the whole car.

Fig. 2. Two types of Safety related System

As we will see in more detail in section 3.3.3 below, what is very important about Figure 2 is that for case (b) the mitigation of *pre-existing* hazards / risks (ie what we want the system to do) and the inevitable introduction of *system-generated* hazards / risks (ie what we don't want the system to do) depend on entirely different properties of the system.

Before that, however, we will introduce IEC 61508 [IEC, 2010], probably the most widely accepted, international standard on the functional safety of systems.

#### **3.3.2 IEC 61508**

IEC 61508 has a particular model of how SRSs influence the real world that is based on the concept of the *Equipment Under Control* (EUC) which itself is regarded as the hazard-creating system5 and for which SRS are designed in order to mitigate those hazards [Pierce & Fowler, 2010]. Since IEC 61508 is a generic standard, to be adapted for application to a wide range of specific industry sectors, it has no particular view on the nature of the EUC, which could be a nuclear reactor, a chemical plant, an oil rig, a train, a car, or an aircraft etc6.

The standard then defines two types of SRS that are intended to mitigate the hazards and risks associated with the EUC:

- *Control Systems* (eg a railway signalling system) which provide *Safety Functions* that are designed to maintain continuously a tolerable level of risk for the EUC, and
- *Protection Systems* (eg an automatic train protection (ATP) system or car airbag) which provide *Safety Functions* that are designed to intervene when they detect a hazardous state developing within the EUC and/or its Control System(s), and put the EUC / its Control System(s) into a safe, or at least safer, state7.

As far as a Control System is concerned, the hazards and risks associated with its EUC are clearly *pre-existing*, since they are caused by the latter, not the former. Similarly, the hazards and risks associated with the combination of an EUC and its Control System(s) are pre-existing as far as a Protection System is concerned.

With this concept, an SRS (Control and/or Protection system) is put in place to reduce the pre-existing risks to an acceptable level. IEC 61508 refers to this as *Necessary Risk Reduction* but does not actually stipulate what is "acceptable", this being left to local or national considerations, including legal frameworks, for the applicable industry sector.

As we will see in section 3.3.3 below, safety integrity requirements on Control Systems are usually expressed as probability of failure per operating hour, whereas for Protection Systems they are usually expressed as probability of failure on demand. In either case, the target probability will of course depend on the level of the pre-existing risk. The objective of the SRS for Control and Protection systems is *risk control* and *risk reduction* respectively8.
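A minimal numerical sketch of this point, with assumed figures that are not taken from the standard or the chapter, shows how the two kinds of integrity target might be derived.

```python
# Illustrative numbers only (assumptions): they show why the required integrity
# of an SRS depends on the level of pre-existing risk.

# Protection System: integrity expressed as probability of failure on demand (PFD).
demand_rate = 1e-3               # pre-existing hazardous demands per operating hour
tolerable_accident_rate = 1e-6   # tolerable accidents per operating hour (assumed target)
required_pfd = tolerable_accident_rate / demand_rate
print(f"required probability of failure on demand <= {required_pfd:.0e}")

# Control System: integrity expressed directly as failures per operating hour,
# because its safety functions act continuously to keep the risk tolerable.
mitigation_probability = 0.99    # assumed chance that a control-system failure is caught elsewhere
required_failure_rate = tolerable_accident_rate / (1.0 - mitigation_probability)
print(f"required failure rate <= {required_failure_rate:.0e} per operating hour")
```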

<sup>5</sup> Equivalent to Figure 2 case (a).

<sup>6</sup> In these examples, the EUC is very tangible and for these it is probably a better term than the equivalent term "operational environment" used in Figure 3(b). However, in some cases - eg air traffic management and a railway level crossing (see section 4) - the EUC is much less tangible and "operational environment" might be better. Whatever term is used, the principles are exactly the same and the concept of pre-existing risk is paramount!

<sup>7</sup> Eg, an ATP system is designed to stop a train if it passes a signal at danger.


In both cases, IEC 61508 is quite clear that the safety functional requirements (specifying functionality and performance of the Safety Functions) must be completely and correctly identified before the SRS can be designed. This requires hazard and risk analysis of the EUC not (initially at least) hazard and risk analysis of the SRS(s) themselves. Once the safety functionality and performance requirements of the Safety Functions have been identified, the tolerable failure rates of the Safety Functions can then be identified, and the Safety Integrity Level (SIL) for each Safety Function established9.

#### **3.3.3 Safety properties**

We can build on our simple example of a car airbag to explain more generically, and develop, the above principles, since it can readily be seen that an airbag fits Figure 2 case (b), and the IEC concept of a Protection System, very well. Figure 3 shows the risk, in the Operational Environment (or EUC), without and with an SRS – ie RU and RA respectively. As we saw for the airbag in section 2 above, the safety case for the SRS depends on its making a (much) bigger positive contribution to safety when operating as intended (ie the *success* case, as represented by the green, right-to-left arrow) than any negative contribution caused by its failure or incorrect / spurious operation (ie the *failure* case, as represented by the solid red, left-to-right arrow).

Fig. 3. Risk Graph for a Safety-related System

<sup>8</sup> Version 2 of IEC 61508, issued in 2010, is much clearer about the application of risk reduction to protection systems and risk control to continuously-operating control systems, than was the earlier (2000) version.

<sup>9</sup> This is exactly what we said for the airbag example in section 2, but is not always followed in some industry sectors!

There are a number of very important points to note about this diagram:

- RU has nothing to do with the SRS – ie it is the *pre-existing* risk, as above
- RM is the theoretical minimum risk that would exist in the complete absence of failure / spurious operation of the SRS – it is not zero, because there usually are some accident scenarios for which an SRS cannot provide mitigation10
- since RM is defined as the risk in the absence of failure, it must be determined only by the *functionality & performance* of the SRS, as explained in section 2 above
- the risk increase RA-RM is caused entirely by failure / spurious operation of the SRS - thus it is the *system-generated* risk and is determined primarily11 by the *reliability & integrity* of the SRS
- the safety case for the SRS is based on showing that RA<<RU
- if we now introduce RT, the maximum tolerable level of risk, then an interesting conclusion emerges: given that RT is fixed (eg by a regulatory body), then the maximum tolerable failure rate of the SRS - ie a function of the length of the extended red (l-r) arrow (RT-RM) - depends on the length of the green (r-l) arrow (RU-RM); in other words, the tolerable failure rate depends on how successful the SRS is in reducing the pre-existing risk in the first place
- overall, RU-RT fits the IEC 61508 definition of Necessary Risk Reduction
- if, as we desire, RA-RM<<RU-RM, then the overall risk actually achieved (ie RA) is much more sensitive to changes in the length of the green (r-l) arrow (ie to changes in functionality and performance) than to proportionate changes in the length of the red (l-r) arrow (ie to changes in reliability & integrity).

We can also see from Figure 3 that, in the limit, as RM approaches RT, the integrity required of the SRS approaches infinity! This raises further important questions regarding the origins and use of traditional risk-classification schemes, which are often based entirely on RT and do not take any account of RM in setting tolerable failure rates for a system. As we saw in section 3.2.2 above, RCSs generally model only the system's negative effects on safety, not its positive contributions and, therefore, to get a more complete picture of where risks lie in a system we need to turn to more sophisticated forms of risk modelling.
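A small worked illustration, using assumed numbers rather than any real data, makes the dependence of the failure budget on RM explicit.

```python
# Worked illustration of the risk-graph argument with assumed figures
# (per operating hour); RU, RM and RT are the quantities named in the text.
RU = 1e-4   # pre-existing risk, with no SRS at all
RM = 1e-7   # theoretical minimum risk: SRS works perfectly but cannot mitigate everything
RT = 1e-6   # maximum tolerable risk, fixed eg by a regulator

risk_reduction_needed = RU - RT          # IEC 61508 "Necessary Risk Reduction"
tolerable_system_generated = RT - RM     # budget left for SRS failure / spurious operation
print(f"necessary risk reduction       : {risk_reduction_needed:.2e}")
print(f"tolerable system-generated risk: {tolerable_system_generated:.2e}")

# As RM approaches RT the failure budget shrinks towards zero, ie the
# integrity required of the SRS grows without bound.
for RM_ in (1e-7, 5e-7, 9e-7, 9.9e-7):
    print(f"RM = {RM_:.1e} -> failure budget = {RT - RM_:.1e}")
```

With these (assumed) numbers the failure budget RT-RM is far smaller than the necessary risk reduction RU-RT, so the functionality and performance of the SRS matter much more to the achieved risk than a marginal change in its reliability.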

#### **3.3.4 Risk modelling**

One of the systems engineering techniques that is commonly used for the static modelling of risk in a safety assessment is Fault Tree Analysis (FTA) [IEC, 2006b]. This is illustrated, for a very basic Protection System, in Figure 4.

An accident would occur if one, or both, of two conditions occurs, as follows:

- firstly, the (pre-existing) hazard occurs and the consequences of the hazard are not mitigated; in this case, if the hazard were never mitigated, then the accident rate would be the same as the hazard occurrence rate – ie the hazard occurrence rate would be the pre-existing risk (RU) defined in Figure 3. The situation that the hazard is not mitigated could arise because the Protection System either: operates but is not effective; or fails to operate;
- or secondly, the Protection System operates spuriously (eg when it is not supposed to, or in a way different from what was required) and the consequences of this (system-generated) hazard are not mitigated.

<sup>10</sup> Eg for an airbag these include fire, or being hit from behind by a vehicle with high relative velocity.

<sup>11</sup> The word "primarily" is used here because (as is more generally the case) it may be possible to provide additional functionality to mitigate some of the causes and / or consequences of system-generated hazards.


Fig. 4. Simple Accident Fault Tree - Protection System

It is the presence of the external input RU that distinguishes Figure 4 from Fault Trees in general, and it is this that enables the computation of both the positive and negative contributions to safety.
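The OR/AND structure just described can be evaluated directly; the sketch below uses assumed rates and probabilities purely to show the arithmetic of the two branches, not the actual figures of any real fault tree.

```python
# Sketch of the accident-rate computation implied by the simple fault tree of
# Figure 4; all numbers are assumptions chosen only to show the arithmetic.
hazard_rate = 1e-3              # pre-existing hazard occurrences per hour (drives RU)
p_not_mitigated = 0.05          # probability the Protection System is ineffective or fails
spurious_rate = 1e-5            # spurious operations of the Protection System per hour
p_spurious_not_mitigated = 0.1  # probability the spurious-operation hazard is not mitigated

# OR gate over the two branches described in the text (each branch is an AND
# of an initiating event and the failure of its mitigation).
accident_rate = (hazard_rate * p_not_mitigated
                 + spurious_rate * p_spurious_not_mitigated)

risk_without_srs = hazard_rate   # RU: every hazard would lead to the accident
risk_with_srs = accident_rate    # RA
print(f"RU = {risk_without_srs:.1e}, RA = {risk_with_srs:.1e}")
print("positive contribution dominates:", risk_with_srs < risk_without_srs)
```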

It should be noted that a simple failure (to operate) of the Protection System is shown on the *success* side of the model rather than on the *failure* side - corresponding to shortening the green arrow on Figure 3 rather than lengthening the red arrow. This would be valid for our airbag if driver behaviour were not affected by the knowledge that the car had an airbag because the risk of airbag failure would simply be the same as not having an airbag at all for the period of the failure. However, if drivers drove less carefully and / or faster in expectation that the airbag would always protect them in the event of a head-on collision then the consequences (and therefore the risk) from failure of the airbag to operate when required would be correspondingly greater - in this case the effect of the failure would better be shown on the failure side of the model. The latter case is an example of what is very common in safety-related environments, especially where humans are involved, and requires extra care to be taken when incorporating such loss-type failures in a risk model.

In practice, risk models are much more complex than the simple illustration in Figure 4. Their uses range from a discrete safety assessment of part of an overall system up to developing a risk model for an entire operation. An example of the latter use, from the ATM field, is EUROCONTROL's Integrated Risk Picture (IRP) [Perrin & Kirwan, 2009]. As a complete model of both positive and negative contributions to safety, the IRP has proved to be a much more powerful tool, in the management of functional safety, than a simple RCS (see section 3.2.2 above) and can be used in many different ways including *a priori* safety assessment, safety monitoring and safety-strategy formulation. Such models are used also in some parts of the nuclear and rail industries but, to the authors' knowledge, not in other industry sectors at the moment.

#### **4. Requirements engineering – The key to safety assessment**

Capturing, and then satisfying, a complete and correct set of safety requirements is as fundamental to any *a priori* safety assessment as requirements engineering is to systems engineering in general, as explained below.

#### **4.1 Requirements capture**

Some crucial issues regarding requirements capture can be expressed through the simple, but rigorous, requirements-engineering (RE) model shown in Figure 5. This model has been adapted from [Jackson, 1995], in the introduction to which Dr Jackson sums up the requirements-capture problem perfectly, as follows:

*"We are concerned both with the world, in which the machine serves a useful purpose, and with the machine itself. The competing demands and attractions of these two concerns must be appropriately balanced. Failure to balance them harms our work".* 

Fig. 5. Jackson Requirements-capture Model - General Form

In this context, what has been said in section 2 above about the lack of a success approach in safety assessments is an example of a pre-occupation with the machine itself at the expense of considering its useful purpose (ie to reduce pre-existing risk). Figure 5 helps clear our thinking as follows.


In the Jackson model, the *system* exists in the *real world*. The part (ie subset) of the real world that influences the system, and into which the system provides a *service* through an interface (*i/f*), is known as the *application domain*. *Requirements* are what we want to make happen in the application domain12 and are defined in that domain - not in the system.

A *specification* is what the system has to do across the interface in order that the *requirements* can be satisfied - ie a specification takes an external, or "black-box", view of the system. Another way of thinking about a specification is that it contains all the shared properties between the service provider and the service user - therefore it might include things that the service user has to do, not just what the system has to do.

*Design*, on the other hand, describes what the system itself is actually like and includes all those characteristics that are not directly required by the users but are implicitly necessary in order for the system to fulfil its specification and thereby satisfy the user requirements. Design is essentially an internal, or "white-box", view of the system.

The formal notation in the "bubbles" in Figure 5 defines two relationships that must be shown to be true in requirements capture:

1. the specification *S*, in conjunction with the application-domain properties *P*, satisfies the requirements *R*
2. the design *D* satisfies the specification *S*.
The distinction, and relationship, between requirements, specifications, application-domain properties, and design are not merely academic niceties; rather, they provide the essential foundations for developing systems that do, and can be shown to do, everything required.

What is described above in this section applies, of course, to systems engineering in general. However, if the assertion at the beginning of section 1 is correct then it should be possible to apply the same principles to the safety perspective. By comparing Figure 5 with Figure 2 (b) we can see that there is a direct equivalence for safety, as shown in Figure 6.

The main differences from Figure 5 are limited to:

- the requirements *R* are replaced by the *Safety Criteria C*, and
- the application domain becomes the *Operational Environment* (or EUC), whose properties *P* include the pre-existing hazards13.

Fig. 6. Safety Engineering form of the Jackson Requirements-capture Model

Otherwise, everything that is said in section 4.1 above applies to safety.

<sup>12</sup> Since the service users are in the Application Domain these requirements are sometimes called *User* Requirements

<sup>13</sup> Indeed they are the most important properties of the operational environment / EUC!

#### **4.2 Safety requirements satisfaction**

*Implementation* of the design, in the built and integrated system, involves a **third** relationship that must be shown to be true:

3. the implementation *I* satisfies the design *D*

The validity of this relationship requires two objectives to be satisfied in implementation of the design - ie showing that:

- the required properties (functionality, performance, reliability and integrity) of the built system satisfy the requirements established for the design, and
- no emergent properties (eg common-cause failures) and unwanted functionality have been introduced inadvertently such that they could adversely affect the ability of the built system to satisfy the (safety) requirements established for the design.
Because these two objectives are generic - ie apply to all properties of a system - there is no difference in principle between the satisfaction of safety requirements and the satisfaction of design requirements in general. That said, there is usually a difference in degree, in that safety requirements require a higher level of assurance that they have been captured, and then satisfied, completely and correctly.

#### **5. Safety assurance and safety cases**

#### **5.1 Safety assurance**

Safety assurance, like systems assurance in general, relies on planned, systematic activities to provide the necessary confidence that a service or functional system satisfies its requirements (which are themselves complete and correct), in its intended environment14. *Assurance activities* are systematic in that they specify how the *assurance objectives* (ie what has to be demonstrated) are to be achieved, as indicated in Figure 7.

<sup>14</sup> From a safety perspective, this would mean achieving an acceptable or tolerable level of safety - see definition of safety assurance in [European Commission, 2005]


Fig. 7. Safety Assurance – Basic Elements

Which assurance objectives have to be achieved, and the rigour with which they have to be achieved, are often determined by assurance levels (ALs), which are based upon the potential consequences of the anomalous behaviour of the system element concerned, as determined by the system safety assessment process.

The AL implies that the level of effort recommended for showing compliance with safety requirements increases with both the severity of the end effect of the element failure, and the probability / likelihood of occurrence of that end effect, given that the failure has occurred [Mana et al, 2007]15. The results (outputs) of the activities are then used to show that the assurance objectives have been achieved.
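A minimal sketch of how such an AL assignment might be encoded is given below; the severity and likelihood categories and the matrix values are illustrative assumptions only, not those of any particular standard.

```python
# Hypothetical sketch of assigning an assurance level (AL / SIL) from the
# severity of a failure's end effect and the likelihood of that end effect,
# given the failure. Categories and matrix values are assumed for illustration.

SEVERITIES = ["catastrophic", "hazardous", "major", "minor"]
LIKELIHOODS = ["probable", "occasional", "remote", "extremely_remote"]

# Rows: severity (worst first); columns: likelihood (most likely first).
AL_MATRIX = [
    [4, 4, 3, 3],   # catastrophic
    [4, 3, 3, 2],   # hazardous
    [3, 2, 2, 1],   # major
    [2, 1, 1, 1],   # minor
]

def assurance_level(severity: str, likelihood: str) -> int:
    """Return the assurance level implied by a severity / likelihood pair."""
    return AL_MATRIX[SEVERITIES.index(severity)][LIKELIHOODS.index(likelihood)]

print(assurance_level("hazardous", "remote"))   # -> 3
```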

For high-integrity systems in particular, there is a further issue that safety assurance is often used to address and concerns the safety-integrity of system elements - software functions and human tasks, in particular. Whereas it may be necessary to specify Safety Integrity Requirements for all elements of a system in order to show compliance with a numerical Safety Criterion, it is usually very difficult to show in a direct way - through, for example, test results - that such requirements are actually satisfied in implementation. In such situations, it becomes necessary to adopt a more indirect, assurance-based approach which uses the rigour of the development processes to give confidence that the requirements are likely to be / have been satisfied. This is reflected in, for example, the airborne software standard DO-178B [RTCA, 1992] and IEC 61508, both of which are assurance-based.

The problem with safety standards is that their use can become highly *proceduralized*, leaving open two important questions:


We can address this problem by putting safety assurance into an *argument* framework but in order to understand this we first need to take a brief look at Safety Cases.

<sup>15</sup> In some standards, the likelihood of occurrence of the failure is also taken into account - ie the assurance is based on the risk associated with a failure, not just the consequence thereof. This is the case with IEC 61508 and related standards, in which the term SIL (Safety Integrity Level) is used instead of AL.

#### **5.2 Safety cases**

Safety assessments are often done within the context of a Safety Case which, like a legal case, comprises two main elements:

- a set of *arguments* - ie statements which claim that something is true (or false), together with
- supporting *evidence* to show that the argument is actually true.
Safety arguments are normally set out hierarchically; this is shown in Figure 8 using an adapted form of goal-structuring notation (GSN). In safety work [Kelly & Weaver, 2004], GSN is simply a graphical representation of an argument / evidence structure and usually starts with the top-level claim (Arg 0) that something is (or will be) acceptably (or tolerably) safe; this is then decomposed such that it is true if, and only if, the next-level argument statements (in this case Arg 1 to 4) are all true. The *strategy* text should explain the rationale for that decomposition.

Fig. 8. A generic high-level Safety Argument

The *claim* is supported by vital contextual information, as follows:

- what is meant by *acceptably safe* is defined by means of *safety criteria*
- the *context* for the claim must include a description of the operational environment for which the claim is being made - we can deduce from section 4.1 above how critical this is to the validity of the claim
- *assumptions* are usually facts on which the claim depends and over which the organisation responsible for the safety case has no managerial control - eg driver behaviour, in the case of our airbag
- if the claim relates to a major change to a safety-related system, it is good practice to provide a *justification* for that change.

The arguments would then be further sub-divided until a level is reached at which a piece of documented evidence, of a manageable size, could be produced to show that the corresponding argument statement is valid. The question is how to ensure that a safety argument is complete and rigorous – for this, we use the three formal relationships derived in section 4, as follows:

- **Arg 1** - the system has been *specified* to satisfy the Safety Criteria in the given operational environment (ie *S*, in conjunction with *P*, satisfies *C*)
- **Arg 2** - the system (logical) *design* satisfies the specification
- **Arg 3** - the *implementation* satisfies the design.
Then, by adding two further arguments:

- **Arg 4** - the transfer of the system into operational service will be acceptably safe, and
- **Arg 5** - the system will continue to be acceptably safe in operation and maintenance,
we have a sufficient, high-level safety argument for developing a new or modified system, bringing it into service and maintaining it throughout its operational life [Fowler et al, 2009]. Since it is the safety argument that determines ultimately what we need to demonstrate, we can use it to drive the whole assurance process as shown in Figure 9.

Fig. 9. Safety Assurance within an Argument Framework

The key point about this diagram is that it is the needs of the argument (ie the generation of evidence) that drive the activities - not the other way around - and the lifecycle phases contain only those activities that are necessary to support the argument.
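The following Python sketch illustrates, with hypothetical identifiers, one way such an argument structure could be represented so that the outstanding evidence - and hence the assurance activities still required - can be read directly off the argument.

```python
# Minimal, hypothetical sketch of an argument-driven assurance structure
# (in the spirit of GSN and Figure 9): each argument node either decomposes
# into sub-arguments or is supported directly by evidence, and the assurance
# activities exist only to produce the evidence the argument needs.

from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Argument:
    ident: str                      # eg "Arg 1"
    statement: str                  # the claim made by this node
    children: List["Argument"] = field(default_factory=list)
    evidence: Optional[str] = None  # reference to documented evidence, if a leaf

    def unsupported_leaves(self) -> List[str]:
        """Return leaf arguments that still lack evidence - ie the work remaining."""
        if not self.children:
            return [] if self.evidence else [self.ident]
        gaps: List[str] = []
        for child in self.children:
            gaps.extend(child.unsupported_leaves())
        return gaps

# Illustrative fragment of the high-level argument of section 5.2
arg0 = Argument("Arg 0", "The system is acceptably safe", children=[
    Argument("Arg 1", "The system is specified to satisfy the Safety Criteria",
             evidence="Safety Objectives + risk-modelling results"),
    Argument("Arg 2", "The logical design satisfies the specification"),  # evidence still to produce
    Argument("Arg 3", "The implementation satisfies the design"),
])

print(arg0.unsupported_leaves())   # -> ['Arg 2', 'Arg 3']
```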

#### **6. Safety assessment in the project lifecycle**

The above assurance principles apply to the five phases of a typical project lifecycle, as shown in Figure 10.

In practice, the Safety Plan (produced at the start of a project) should set out a specific safety argument and assurance objectives – with a rationale as to how they were derived to suit the nature and scope of the safety assessment concerned - the lifecycle assurance activities to be carried out, and the tools, techniques etc to be employed. Since the Safety Case (developed during, but finalised at the end, of a project) uses the same argument, it needs only to present the evidence resulting from the activities and provide the rationale as to how that evidence satisfies the argument.

Fig. 10. Overall Safety Lifecycle Process

We will now examine in more detail what is done in the way of safety assurance objectives and activities in each phase of the lifecycle.

#### **6.1 Definition phase**

From section 5, we can see that in this phase we need to show that the system has been *specified* to meet the appropriate safety criteria in the given operational environment (or EUC). We will use a new, innovative16 railway level-crossing17 control (and possibly protection) system for a planned new two-way suburban road in order to illustrate some of the points in the steps described in sections 6.1.1 to 6.1.4 below - it should be noted that the analysis presented here is not intended to be exhaustive.

#### **6.1.1 Operational environment**

For our level-crossing, the properties of the operational environment would need to include:

- users of the crossing - eg passenger and freight trains, road vehicles (of which 80% are cars / light vans and the rest are trucks up to 40 tonnes weight) and occasional pedestrians; exceptionally (once every 1-2 months), large slow-moving vehicles carrying abnormally heavy loads will need to use the crossing
- two-way rail traffic levels - 150 passenger trains per day (mainly between 07:00 and 23:00 hours) and 10 freight trains per day (mainly at night)
- average length of train - 120 m.
<sup>16</sup> This is intended to be a hypothetical example; traditional railway standards - eg [RSSB, 2007] - for level crossings would probably not apply.

<sup>17</sup> Where a railway and road intersect at the same level.


#### **6.1.2 Pre-existing hazards**

The main pre-existing hazard is " **HAZPE#1** - any situation in which, on current intentions, a road user and a train would inadvertently occupy the crossing at the same time". The use of "on current intentions" is crucial since it is describing a hazard not an actual accident. We could use mathematical modelling here to estimate the frequency with which this hazard would occur for a completely uncontrolled crossing and hence estimate the pre-existing risk.
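The following sketch indicates the kind of back-of-envelope model that could be used; the rail-traffic and train-length figures come from section 6.1.1, whereas the road-traffic volume, line speed and vehicle crossing time are illustrative assumptions.

```python
# Hypothetical estimate of how often HAZPE#1 would arise at a completely
# uncontrolled crossing. Only the rail-traffic and train-length figures come
# from section 6.1.1; everything marked "assumed" is an illustrative assumption.

trains_per_day = 150 + 10            # passenger + freight (section 6.1.1)
train_length_m = 120.0               # section 6.1.1
train_speed_ms = 80 * 1000 / 3600    # assumed 80 km/h line speed
road_vehicles_per_day = 2000         # assumed road-traffic volume
vehicle_crossing_time_s = 4.0        # assumed time a road vehicle occupies the crossing

# Window, per train passage, during which a road vehicle entering the crossing
# would be there at the same time as the train ("on current intentions").
conflict_window_s = train_length_m / train_speed_ms + vehicle_crossing_time_s

# Fraction of the day for which a conflict window is open, and hence the
# expected number of hazard occurrences per day for random road arrivals.
open_fraction = trains_per_day * conflict_window_s / 86_400
hazards_per_day = road_vehicles_per_day * open_fraction

print(f"conflict window per train ~ {conflict_window_s:.1f} s")
print(f"expected HAZPE#1 occurrences ~ {hazards_per_day:.0f} per day (uncontrolled)")
```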

#### **6.1.3 Safety criteria**

A suitable quantitative criterion would be that the likelihood of an accident involving multiple fatalities shall not exceed one per 100 years, supported by a second, ALARP criterion. However, given a possible range of outcomes of the hazard in this case, it might be appropriate to make use of a suitable RCS (along the lines of Figure 1 in section 3.2.2 above) in order also to set criteria for outcomes of lesser severity19.

#### **6.1.4 The specification**

We recommend the use of the term Safety Objectives to describe the safety aspects of the specification. The reason is that it helps us remember that, in accordance with the Jackson model of section 4.1 above, what we are seeking to do here is describe, from the users' perspective, what the system must do, not to determine how the system will achieve that in its design.

First of all we need to consider the success case and assess how the pre-existing hazard is mitigated for all *normal* conditions in the operational environment - ie all those conditions that our SRS is likely to encounter on a day-to day basis - constructing various operational scenarios (eg single and multiple trains) as necessary. Two examples of a Safety Objective for this are as follows:


<sup>18</sup> This might need to be justified on ALARP grounds.

<sup>19</sup> However, for the purposes of this simple illustration, we will assume that if a train travelling at normal speed collides with a road vehicle there will be some fatalities.

<sup>20</sup> We would need an equivalent rule (ie Safety Objective) for multiple-train situations.

Note that these are genuine objectives (as above) and that the illustrative numbers would need to be verified by some form of dynamic risk modelling based on the environment described in section 6.1.1.

Next we need to assess how well the pre-existing hazard is mitigated for all *abnormal* conditions in the operational environment - ie all those adverse conditions that our SRS might exceptionally encounter - again, constructing various operational scenarios as necessary. An example of a Safety Objective for this is as follows:


This situation is expected to occur infrequently (see section 6.1.1 above) and therefore the system is allowed to operate in a different mode - in this case the train no longer has priority over the road vehicle - from the *normal* case. Again, some form of dynamic risk modelling could be used to determine a suitable exclusion distance for approaching trains.
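A very simple, assumed-value sketch of such a calculation is shown below; the crossing length, vehicle speed, train speed and margin are all hypothetical.

```python
# Hypothetical sketch of the kind of "dynamic risk modelling" calculation that
# could set an exclusion distance for approaching trains while an abnormal,
# slow-moving heavy vehicle traverses the crossing. All numbers are assumed.

crossing_length_m = 20.0             # assumed length of crossing to be traversed
slow_vehicle_speed_ms = 1.5          # assumed speed of the abnormal load (~5 km/h)
train_speed_ms = 80 * 1000 / 3600    # assumed train approach speed (80 km/h)
safety_margin_s = 60.0               # assumed margin on top of the traversal time

# Time for the slow vehicle to clear the crossing, plus margin, converted into
# the minimum distance an approaching train must be held away from the crossing.
traversal_time_s = crossing_length_m / slow_vehicle_speed_ms
exclusion_distance_m = train_speed_ms * (traversal_time_s + safety_margin_s)

print(f"traversal time ~ {traversal_time_s:.0f} s")
print(f"minimum exclusion distance ~ {exclusion_distance_m:.0f} m")
```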

Finally, we need to consider the potential failure modes of the system, at the service level. At this level, we are not concerned with the causes of failure21, only with the consequences of failure - for which we would often use Event-tree Analysis for assessing multiple possible outcomes of a particular failure. It is important that the identification of possible failure modes be as exhaustive as possible; a useful starting point is to take each of the success-case Safety Objectives and ask the question what happens if it is not satisfied. This will lead to the *system-generated* hazards, an example of which is:


Using the operational data from section 6.1.1 we can derive the following Safety Objective to limit the frequency of the hazard such that the appropriate portion of the tolerable-risk criterion (see section 6.1.3) is satisfied for this hazard:

- the frequency of occurrence of this hazard shall not exceed 5x10-5 per operating hour.
Note that this illustrative figure takes account of the total number of system-generated hazards (assumed to be four in this illustration), the frequency with which road and rail traffic uses the crossing, and the providential mitigation that even if a vehicle incorrectly enters the crossing there is a significant probability that it would not actually collide with a train. Note also that the hazard occurrence rate is expressed as a frequency even though the SRS is not continuously operating - this was done in accordance with IEC 61508 because the demand rate on the SRS is relatively high (ie up to 150 operations per day).
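The style of budget calculation involved is sketched below; the accident criterion and the number of hazards come from sections 6.1.3 and 6.1.4, while the collision probability is an assumed figure chosen purely for illustration.

```python
# Illustrative sketch (assumed figures) of the style of budget calculation
# behind a hazard-frequency Safety Objective: the tolerable accident frequency
# from section 6.1.3 is apportioned across the system-generated hazards and
# adjusted for the probability that the hazard actually leads to a collision.

HOURS_PER_YEAR = 8760

tolerable_accidents_per_year = 1 / 100      # section 6.1.3: one per 100 years
number_of_system_generated_hazards = 4      # assumption used in section 6.1.4
p_collision_given_hazard = 0.005            # assumed "providential" mitigation

# Equal apportionment of the accident budget across the hazards, then conversion
# into a tolerable hazard-occurrence frequency per operating hour.
accident_budget_per_hazard = (
    tolerable_accidents_per_year / number_of_system_generated_hazards / HOURS_PER_YEAR
)
tolerable_hazard_rate_per_hour = accident_budget_per_hazard / p_collision_given_hazard

# With these assumed inputs the result (~6e-5 per hour) is of the same order
# as the 5x10-5 per operating hour objective quoted in the text.
print(f"accident budget per hazard ~ {accident_budget_per_hazard:.1e} per hour")
print(f"tolerable hazard rate ~ {tolerable_hazard_rate_per_hour:.1e} per operating hour")
```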

Thus, at the end of the Definition Phase we should have a set of Safety Objectives which, if they are satisfied in the system design and implementation, would ensure that the pre-existing risk is mitigated, and the system-generated risk is limited, such that the level crossing would satisfy the specified quantitative Safety Criteria.

<sup>21</sup> This is done in the failure analysis of the high-level design - see section 6.2 below

<sup>22</sup> *Closed* here is defined by SO#2 (ie from 1 minute before an approaching train reaches the crossing, until the crossing is clear) - it does not necessarily imply a physical closure


#### **6.2 High-level design phase**

Having derived what Jackson [Jackson, 1995] refers to as a Specification for the system, the system-development task becomes less safety-specific, and has even more in common with general system-engineering principles, except for one key feature - the higher level of confidence (ie assurance) that is required in the results of safety assessment.

Design, as we have seen, is about the internal properties of the system but for this phase we restrict the analysis to a logical design, in which Safety Requirements describe the main human tasks and machine-based functions that constitute the system, and the interactions between them. An illustration, based on our 'futuristic' level-crossing control system (LCCS) of section 6.1, is given in Figure 11.

Fig. 11. Example of a Logical Model

A description of this example and the way that the model works is beyond the scope of this chapter - suffice it to say that the new system comprises a fully automated Level-crossing Controller and a Road Vehicle Monitor, which detects the presence of road vehicles within the crossing area. It is to be integrated into a regionally-based "Moving Block" signalling system using Communications Based Train Control - all the train control systems including ATP, but excluding the Onboard Computer, are subsumed into the TCS box23.

The main points to note are as follows:


<sup>23</sup> Although level crossing control would normally be integrated with the Control Centre, for the purposes of this illustration we assume a separate subsystem.

<sup>24</sup> As we will see, physical design is taken to be the first stage of Implementation.

Safety Requirements capture what each of those "actors" needs to provide in terms of functionality, performance, reliability and integrity in order to satisfy the specified Safety Objectives. Whereas Figure 11 shows the actors and the way in which they interact quite clearly, the functionality that they provide is contained in the textual Safety Requirements, and the links between those functions are not easily seen. For this reason, on functionally-rich systems we often use a Functional Model, showing an abstract view of the system functions and data, as a bridge between the Specification and the Logical Design, thus increasing confidence in the completeness and correctness of the latter [Fowler et al, 2009].
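A minimal sketch of the kind of coverage cross-check that supports this is shown below (with invented identifiers); as discussed next, such traceability is necessary but by no means sufficient.

```python
# Minimal, hypothetical sketch of the basic traceability check between Safety
# Requirements (logical design) and Safety Objectives (specification). All
# identifiers are invented for illustration.

safety_objectives = {"SO#1", "SO#2", "SO#3", "SO#4"}
requirement_to_objectives = {
    "SR-01": {"SO#1"},
    "SR-02": {"SO#1", "SO#2"},
    "SR-03": {"SO#4"},
    "SR-04": set(),          # orphan requirement - traces to nothing
}

covered = set().union(*requirement_to_objectives.values())
uncovered_objectives = safety_objectives - covered      # objectives with no supporting requirement
orphan_requirements = [r for r, objs in requirement_to_objectives.items() if not objs]

print("Safety Objectives with no supporting requirement:", sorted(uncovered_objectives))
print("Safety Requirements tracing to no objective:", orphan_requirements)
```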

It is very important to note that making an argument for a Logical Design is not simply a matter of showing traceability of the individual Safety Requirements, for the Logical Design, back to the Safety Objectives of the Specification. This would ignore three possibilities: that the design as a whole might be in some way internally incoherent; that new failure properties could emerge at the design level that were not apparent at the higher (service) level; or that the Safety Requirements are too demanding of technology and / or human performance. Thus it is necessary to provide assurance that the Logical Design:

- has all of the functionality and performance attributes that are necessary to satisfy the Safety Objectives of the (service-level) Specification
- will deliver this functionality and performance for all *normal* conditions of the operational environment that it is likely to encounter in day-to-day operations
- is robust against (ie work through), or at least resilient to (ie recover easily from), any *abnormal* conditions that it may exceptionally encounter
- has sufficient reliability and integrity to satisfy the Safety Objectives of the Specification
- is realistic in terms of the feasibility of a potential physical system to satisfy the Safety Requirements, and the ability of validation & verification methods to demonstrate, at the appropriate time and to the necessary level of confidence, that the Safety Requirements are eventually satisfied.
By now it will (we hope!) be no surprise that to analyse, verify and validate the design from a safety perspective we use classical systems-engineering techniques, including:

- requirements traceability [Hull et al, 2005]
- Use-case Analysis [ISO/IEC, 2005] and Fast-time / Real-time simulations - for the normal, abnormal and failure scenarios
- Fault-tree Analysis [IEC, 2006b] to assess the causes of failure, "top down", and Failure Modes Effects & Criticality Analysis [IEC, 2006a] to check the FTA, "bottom up".
Furthermore, since the human elements of the system have started to emerge, we can use human factors (HF) techniques such as Cognitive Task Analysis (CTA) and Human Reliability Assessment (HRA) to assess initially whether the task, performance and reliability & integrity demands which the Safety Requirements place on the human operators are at least realistic.

By the end of the High-level Design Phase we should have a set of Safety Requirements - covering the success and failure cases - that are sufficient to ensure that, if they are satisfied in the Implementation, the specified Safety Objectives would be met.

#### **6.3 Implementation phase**



We have defined the Implementation Phase such that it comprises development of a Physical Design and the realisation of the Physical Design in the built system. In making an argument for Implementation, we need to show that:


In the physical design, we take the Safety Requirements from the Logical Design and allocate them to the elements of the Physical System, as follows:


These in turn lead to further design, Safety Requirements derivation and implementation for each of these elements and then to integration of the complete system - for further reading see [ISO/IEC, 2008b and ISO/IEC, 2008a]. The steps would follow, for example, the classical "V-model" of system development in which the safety engineer must ensure that the physical system as a whole (and its constituent parts) have sufficient reliability and integrity, and complete and correct functionality and performance, to satisfy the higher-level Safety Requirements. These are discussed in turn, as follows.

#### **6.3.1 Building reliable systems – General**

The engineering of a safety related system must ensure, as a minimum, that the safety reliability and integrity requirements are met. It is useful to consider first how failures occur and what techniques can be used to reduce failure rates to meet the safety criteria.

*Random* failures can occur in hardware of any kind due to physical degradation and wearout mechanisms; the exact time when such a failure will occur is unknown but statistical distributions can be used to predict failure rates and quantification of overall system reliability can be modelled by techniques such as FTA mentioned earlier. Techniques for making hardware elements sufficiently reliable are considered in section 6.3.2 below.

*Systematic* failures by contrast are caused by design defects – they will occur whenever a system enters a state in which a latent defect is revealed. Software failures are always systematic26, but hardware designs (especially those such as computer processor chips) can also exhibit systematic failures. Methods of ensuring that software is sufficiently free of defects, such that it can meet its safety function and performance requirements with sufficient reliability and integrity, are discussed in section 6.3.4 below. The concept of a systematic failure may also be applicable to human factors - eg if a procedure is designed incorrectly.

<sup>25</sup> It is acknowledged that, in some industries / countries, verification and validation may have the opposite meanings to those used herein.

*Common-cause* failures are ones in which redundant subsystems fail at the same time due to the same external events (eg earthquake or tsunami), internal causes (eg power supply failure) or due to a systematic error affecting multiple systems (known as a common mode failure). This could be a software defect or a human maintenance intervention, for example. Common cause failures in practice often limit the achievable reliability of complex systems. The general approach is to attempt to identify and eliminate sources of common cause failure where possible and also to be conservative with reliability predictions to cater for the possibility of unknown causes.

#### **6.3.2 Hardware safety**

The main techniques for ensuring that random hardware failures are sufficiently unlikely are use of high reliability components and redundancy. High-reliability components are expensive so redundancy is used in practice except in special circumstances (such as space applications) where repair is difficult or impossible. Redundancy simply means having two or more subsystems each of which can perform the required safety functions; if one fails then a standby can take over provided that there is some mechanism (automated or manual) to detect the failure. Further gains in reliability can sometimes be achieved by using diversity, where the standby system(s) are not identical to each other. Ideally the diversity should be both conceptual (using different physical processes or measurements) and methodological (different design methods) since this helps to reduce the likelihood of common mode failures (discussed in section 6.3.1 above). Part 2 of IEC 61508 [IEC, 2010], in particular, provides requirements and guidance on hardware safety techniques and measures.
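The following sketch, using assumed failure rates and an assumed beta factor, illustrates why common-cause failures, rather than the independent failure of both channels, tend to limit what redundancy can achieve.

```python
# Illustrative beta-factor model of a duplicated channel, with assumed numbers,
# showing how the common-cause contribution limits the benefit of redundancy.

lambda_channel = 1e-5    # assumed dangerous failure rate of one channel, per hour
beta = 0.1               # assumed fraction of channel failures that are common-cause
exposure_time_h = 8760   # one year of continuous exposure, for comparison

# Small-probability approximation: P(failure over the exposure time) ~ rate * time.
p_single_channel = lambda_channel * exposure_time_h

# Independent portion: both channels must fail independently within the exposure time.
p_both_independent = ((1 - beta) * lambda_channel * exposure_time_h) ** 2

# Common-cause portion: a single event takes out both channels.
p_common_cause = beta * lambda_channel * exposure_time_h

p_redundant_pair = p_both_independent + p_common_cause

# With these assumptions the common-cause term (~0.009) exceeds the independent
# term (~0.006), so it limits the overall improvement over a single channel.
print(f"single channel  ~ {p_single_channel:.3f} probability of failure per year")
print(f"redundant pair  ~ {p_redundant_pair:.3f} probability of failure per year")
```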

Of course, we must not forget the fundamental principle that (with the possible exception of Information Systems) the functionality and performance properties of hardware are as important to safety as its reliability and integrity - see the airbag example in section 2.

#### **6.3.3 Human factors safety**

HF is a topic that in the past in many industries has had only scant coverage in functional safety [Sandom, 2002]. More recently, things have improved, not least in European ATM, for which EUROCONTROL has developed the "HF Case" [EUROCONTROL, 2007].

In the HF Case, Human Factors are considered on two different levels, the System Level and the Human Performance Level. The underlying philosophy is that the design of tasks, procedures and tools must correspond to the safety requirements on both the level of the overall system as well as on the level of the individual human operator.

<sup>26</sup> Although they may be revealed in a quasi-random way.


Fig. 12. The HF "Gearbox"

The HF aspects can be classified into six areas at each level, and for each specific task, procedure or tool some of the twelve areas may be more important than others; however, the underlying principle is that the HF evaluation should involve both levels. The resulting approach is known as the HF Gearbox, as illustrated in Figure 12.

From a safety perspective, the HF Gearbox addresses, at the human-performance and system levels, areas such as:

- *Procedures, Roles and Responsibilities*: allocation of task, involvement, workload, trust / confidence, skill degradation, procedure format and positioning, procedure structure, procedure content, procedure realism
- *Teams and Communication*: team structures / dynamics / relations, coordination, handover processes, communication workload, phraseology, language differences, communication methods, interference effects, information content
- *Training*: competency, on-the-job training, emergency / abnormal-situation training, testing of training effectiveness, effects on operational task performance
- *Recovery from Failures*: human error potential, error prevention / detection / recovery, detection of, and recovery from, equipment failures.


It is crucial from a safety perspective that HF is not considered to be a separate activity - rather, it must be fully integrated into the safety assessment and resulting safety case.

#### **6.3.4 Software safety**

#### **6.3.4.1 An IEC 61508 perspective on software safety**

Part 3 of IEC 61508 [IEC, 2010] starts at the point where the software requirements specification for a safety related system has been developed, in terms of safety functional behaviour and safety integrity. The system engineering approach described in this chapter should, of course, be used in deriving those software requirements in the first place.

IEC 61508 Part 3 is based on a conventional V lifecycle model for the design and implementation of software. A process-based approach is convenient for software developers who have to follow the standard, but Part 3 stresses that it is not the existence of the process itself but the evidence resulting from the *application* of the process which will demonstrate the achievement of safe software. The development lifecycle stages comprise:

- software architectural design,
- detailed design,
- module design, and
- coding.
Verification is required after each stage, to check that the output of the design stage is a correct refinement of the input and has other necessary properties. Verification as a minimum includes software code reviews but can include other forms of analysis ranging from code complexity analysis and rigorous inspection techniques up to formal proof that the software has certain properties. Testing of the software mirrors the design, as follows:

- module or unit testing, to demonstrate that each software module behaves in accordance with its specification
- integration testing at the software design level(s) (which includes hardware/software integration testing), to show that all modules function together as intended
- safety validation testing at the software requirements level, to provide confidence that the safety function and performance requirements are met.
The stress laid by IEC 61508 Part 3 on module testing is justified by experience that software which has been well tested at the module level will usually reveal few errors during later testing, although it is a step often skimped due to time pressures during development.
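As a simple illustration of module testing against a module specification, the fragment below uses a made-up helper function and rule - neither is taken from the level-crossing design of section 6.2 - together with unit tests that exercise its specified behaviour.

```python
# Hypothetical illustration of module (unit) testing: a made-up helper that
# decides whether a crossing may be opened to road traffic, plus tests that
# check the module against its (assumed) specification.

import unittest

def crossing_may_open(seconds_to_next_train: float, crossing_clear: bool) -> bool:
    """Assumed module spec: open only if no train is due within 60 s
    and the crossing is confirmed clear."""
    return crossing_clear and seconds_to_next_train > 60.0

class CrossingMayOpenTest(unittest.TestCase):
    def test_opens_when_no_train_due_and_clear(self):
        self.assertTrue(crossing_may_open(300.0, True))

    def test_stays_closed_when_train_imminent(self):
        self.assertFalse(crossing_may_open(30.0, True))

    def test_stays_closed_when_not_clear(self):
        self.assertFalse(crossing_may_open(300.0, False))

if __name__ == "__main__":
    unittest.main()
```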


Detailed techniques and measures are recommended in Annexes A and B of IEC 61508 Part 3 to support each lifecycle stage. Techniques are either Highly Recommended (HR), Recommended (R) or noted without any specific force (-). In a very small number of cases, techniques are Not Recommended. The nature and rigour of the HR techniques and the number to be applied increases with the SIL – this is common to a number of software safety standards and guidelines. However, it is not possible simply to apply all the HR techniques for a given SIL, since some may be contradictory; therefore judgement must be applied in selecting the techniques which will give the greatest benefit. If a relevant HR technique is not used, its omission must be agreed with the independent safety assessor (see below). The standard stresses that it is the combination of testing and analysis that provides the necessary confidence that the software will be safe.

Properties of the design artefact(s) are stated for each lifecycle stage as a guide to selecting and justifying the techniques and measures to be used. Properties include:


An "intrinsic" error is one which can be recognised regardless of the functions which the software is to realise – examples at the source code level could include division by zero, numeric overflow, or access via a null pointer. This class of errors typically cause run-time "crashes" and there are analytical tools which can help to eliminate such errors.

Use of pre-existing software elements (such as an operating system or communications software) is allowed, provided that sufficient evidence of reliable operation can be provided. This can include evidence from non-safety applications subject to certain conditions.

Although IEC 61508 Part 3 is based on the V lifecycle, alternative lifecycles can be used. For example, if code is generated automatically from a high-level requirements model (which is possible for some types of control applications) then the software design and coding stages can be omitted, although testing will still be required. The selection of tools to support each lifecycle stage and technique is specifically addressed in Part 3 – a tool which makes a direct contribution to the final software (such as a compiler) must be chosen and justified with greater care than one where the output of the tool is readily amenable to manual inspection and correction.

In common with Parts 1 and 2 of IEC 61508, independent functional safety assessment is required, the degree of independence depending on the SIL to be achieved. For example, a person independent of the designer is sufficient at SIL 1 whereas an independent organisation is required at SIL 4. The assessor should examine both the process itself (the selection of techniques and measures) and the products of the development (for example the test and analysis results and the backing evidence that the testing and analysis have been sufficiently thorough). It is interesting to note that independence between designer/implementer and verifier/tester is not required in IEC 61508, although it is required in other standards and guidelines for safe software.

All evidence resulting from test, analysis and field service experience must be recorded and kept under configuration management along with the design artefacts and software code – therefore, applying the standard should generate the evidence required to meet SW01 (see subsection 6.3.4.2).

#### **6.3.4.2 A regulatory perspective on software safety**

There are two main ways of gaining **assurance** that the safety requirements have been properly and completely implemented in an SRS: assurance which is obtained directly from the attributes of the product itself; and that which is obtained from the characteristics of the processes which gave rise to the product.

So what is an appropriate balance between product and process assurance, and what part should the various standards play in the achievement and demonstration of system safety? The SW01 section of the UK CAA's safety regulation CAP 670 [UK CAA] takes an objective-based approach that provides a sensible answer to these questions27. SW01 takes an approach to safety assurance which is deliberately non-prescriptive in terms of the development process; instead, it demands arguments and evidence of the achievement of five safety assurance objectives – ie to show that:

- the software safety requirements correctly state what is necessary and sufficient to achieve tolerable safety, in the system context
- the software satisfies its safety requirements
- each software safety requirement can be traced to the same level of design at which its satisfaction is demonstrated
- software implemented as a result of software safety requirements is not interfered with by other software [that is not safety related]
- all assurance relates to a known executable version of the software, a known range of configuration data and a known set of software products, data and descriptions that have been used in the production of that version.
SW01 defines seven behavioural *attributes* of safety-related software, which the safety assurance must address, with equal rigour (or a valid argument presented as to why a particular attribute has not been addressed), as follows: functionality, timing, accuracy, robustness, reliability, resource usage, and overload tolerance.

In the context of requirements satisfaction, SW01 allows assurance to be offered from three different sources – ie testing, analysis (of design), and field service experience (FSE). For each source of assurance, two forms of evidence are required, for each *attribute*, as follows:

- *Direct* evidence - that which provides actual measures of the product (software) attribute concerned and is the most direct and tangible way of showing that a particular assurance objective has been achieved
- *Backing* evidence - that which relates to the quality of the process by which those measures were obtained and provides information about the quality of the *direct* evidence, particularly the amount of confidence that can be placed in it.
*Testing* is restricted largely to tests of the final product (executable code) or a very close relation of it. *Direct* evidence is concerned with what tests were carried out and what the results showed in terms of satisfaction of the safety requirements. *Backing* evidence is concerned with showing that the tests were specified correctly and carried out adequately.

<sup>27</sup> Although SW01 covers specifically safety-related software, most of the principles in it can apply equally to the wider aspects of the system

*FSE* is based on previous operational use of the software. *Direct* evidence is concerned with analysis of data from FSE and what the results of that analysis showed in terms of satisfaction of the safety requirements. *Backing* evidence is concerned with showing that the environment from which the data was obtained is sufficiently similar to that to which the reused software will be subjected, that an adequate fault-recording process was in place when the software was originally deployed, and that the data-analysis process was adequate and properly carried out.

*Analysis* covers any proof of requirements satisfaction that is obtained from the design or other representation of the software, including models, prototypes, source code etc. *Direct* evidence is concerned with what the results of the particular *analysis* techniques showed in terms of satisfaction of the safety requirements. *Backing* evidence is concerned with showing that design and other representations of the software were appropriate and adequate and that the *analysis* was adequately specified and properly conducted; where *analysis* was carried out on source code, it is necessary also to show that the object code correctly translates the source code.

In general, the rigour demanded of the evidence increases as the integrity required of the software increases. However, SW01 defined this integrity only in terms of the consequence of failure – ie it does not take account of the probability that such a failure will occur, as is the case, for example, in IEC 61508.

The way in which evidence from the three sources can be used in combination varies according to the attribute for which evidence is offered, and also depends on the integrity required of the software. As the required integrity increases, SW01 allows less dependence on a single source of evidence, and places more emphasis on *analysis* of design as the primary source of evidence.

The advantage of testing over design analysis is that it is carried out on the end product rather than on a representation of that product. On the other hand, the effectiveness of testing can be limited by problems with test coverage and with confidence levels in respect of statistical attributes of the system. For these reasons, assurance from testing usually takes second place to design analysis for the more safety-critical systems, though it is interesting to note that SW01, for example, mandates some degree of testing even where the primary source of assurance is design analysis.

As a source of assurance, design analysis has one further advantage over testing – it is available much earlier in the lifecycle and can therefore make a major contribution to the reduction of programme risk. Iterative development techniques seek to bring forward the availability of test evidence but in so doing bring with them their own problems, including a possible reduction in effectiveness of design assurance unless specific measures are taken to avoid this.

#### **6.4 Transfer phase**

The Transfer-into-Operation Phase takes the assurance process up to the point that the system concerned is ready to be brought into operational service. In making an argument for Transfer into Operation, we need to show that:

- everything necessary has been done to prepare the new (or modified) system for operational service
- the process of bringing the system into service – ie transitioning from the current system to the full new system – will itself be as safe as reasonably practicable

Preparation for operational service is about showing that all the necessary operational procedures, trained personnel and technical resources are in place to operate and maintain the system and that, under conditions as close as possible to real operational service, the system / resources as a whole will meet the expectations of the service users (ie system validation). Typically, a safety case would be submitted for management approval and regulatory endorsement before transition from the current system to the new system begins. Therefore it is necessary to show beforehand that any risks associated with transition will be managed such that the AFARP criterion will be met throughout this process.

Especially in safety-critical industries that require uninterrupted operation for 365 days per year, live system testing and the subsequent transition from the old to the new system are often hazardous. Thus a Transition Plan needs to be drawn up and executed to ensure that:

- the hazards and risks associated with the transition process have been completely identified and correctly assessed
- measures have been put in place to reduce those risks *ALARP*
- contingency plans are in place to revert to a safe state if the transition is not entirely successful
- the old system components can be removed safely

#### **6.5 Operational phase**

The safety focus during the whole of the system's in-service life is on providing assurance of its continuing safe operation. This is vital for a number of reasons, including:

- the *a priori* safety assessment, covered in Arg 1 to 3, might not be complete and correct in every particular
- the system, including the equipment and human elements, might degrade in operational service
- the system will most probably be subject to changes at various times during its operational life
- the operational environment might change during the life of the system.

Thus we need to ensure first of all that the system (comprising equipment, people and procedures) will be supported so as to maintain the required level of safety. The evidence in support of this will be mainly in the form of SMS processes (and related operational and engineering procedures) and how it will be ensured that they will be properly applied, including the use of surveys and audits related to the application of those SMS processes.

Secondly, in order to provide assurance of actual safety achievement we need to show that:

- there is a culture to encourage full and accurate reporting of safety incidents
- the frequency of safety incidents will be measured against pre-defined indicators
- all reported incidents will be properly investigated
- appropriate corrective action will be taken to prevent incident recurrence.

Finally, we need to show that there are procedures in place to manage future changes to the system and / or its operational environment.

#### **7. Conclusions**

We started by asserting that safety is not a separate attribute of a system but is a property that depends on other attributes, and on the context in which the system is used. The misconception that adequate reliability and integrity are sufficient to ensure the safety of a system has been prevalent in, for example, the ATM sector [Fowler & Grand-Perret, 2007], but is dispelled in the specific context of software by the results of extensive research [Leveson, 2001] and more generally herein by rational argument using the example of a car airbag.

After introducing some basic safety concepts and principles, we then showed that safety is as much dependent on correct functionality & performance of the system as it is on system reliability & integrity - the former set of attributes being necessary for the mitigation of *pre-existing* risk (inherent in the operational environment) and the latter for controlling *system-generated* risk (caused by system failure). This led to the view that what was needed was a broader approach (what we called the *success & failure* approach) to system safety assessment, a view that was then shown to be consistent with the principles underlying the generic functional-safety standard IEC 61508 [IEC, 2010] - principles that, however, are not always captured in industry-specific instantiations of this standard.

We then turned to a vital aspect of systems engineering - ie requirements engineering, some important principles of which are advocated in [Jackson, 1995] - and found direct equivalence between the derivation of the required safety properties of a system and the derivation of its non-safety properties.

Finally, we introduced the principles of safety assurance and safety cases and showed how they should drive all the processes of a safety assessment, throughout the lifecycle. Whilst emphasising the importance of ensuring that the level of assurance is appropriate to the safety-criticality of the system, we now leave it to the knowledgeable systems engineer to recognise the common processes of a system development lifecycle in this chapter and to conclude for himself / herself that safety is (with the assurance proviso) actually just a viewpoint (albeit a very important one) on systems engineering!

#### **8. Acknowledgment**

The authors would like to record their appreciation for the many very helpful suggestions made by Drs Michael A Jackson and Anthony Hall during the development of this chapter.

#### **9. References**

Eurocontrol, 2007, *The Human Factors Case: Guidance for Human Factors Integration*

Eurocontrol, 2010, http://www.skybrary.aero/index.php/Risk\_Assessment

European Commission, 2005, Regulation (EC) No 2096/2005, *Common Requirements for the Provision of Air Navigation Services*, published in the Official Journal of the European Union

Fowler D and Grand-Perret S, 2007, *Penetrating the Fog of Safety Assessment - and Vice-versa*, Proceedings of the 2nd IET International Conference on System Safety, London, UK

Fowler D, Perrin E and Pierce R, 2009, *2020 Foresight - a Systems-engineering Approach to Assessing the Safety of the SESAR Operational Concept*, Proceedings of the 8th USA/Europe Air Traffic Management Research and Development Seminar, Napa, USA

Hull E, Jackson K and Dick J, 2005, *Requirements Engineering*, Springer, ISBN 1852338792

International Electrotechnical Commission, 2006a, IEC 60812, *Analysis Techniques for System Reliability – Procedure for Failure Mode and Effects Analysis (FMEA)*

International Electrotechnical Commission, 2006b, IEC 61025, *Fault Tree Analysis*

International Electrotechnical Commission, 2010, IEC 61508, *Functional Safety of Electrical/Electronic/Programmable Electronic Safety Related Systems*, V 2.0

ISO/IEC 12207, 2008a, *Systems and Software Engineering - Software Lifecycle Processes*, V 2.0

ISO/IEC 15288, 2008b, *Systems and Software Engineering - System Lifecycle Processes*, V 2.0

ISO/IEC 19501, 2005, *Information Technology, Open Distributed Processing — Unified Modelling Language (UML)*, V 1.4.2

Jackson M A, 1995, *The World and the Machine*, Proceedings of the 17th International Conference on Software Engineering, IEEE, pp283-292

Kelly T and Weaver R, 2004, *The Goal Structuring Notation – a Safety Argument Notation*, http://www-users.cs.york.ac.uk/~tpk/dsn2004.pdf

Ladkin P B, 2008, *An Overview of IEC 61508 on E/E/PE Functional Safety*, http://www.causalis.com/IEC61508FunctionalSafety.pdf

Leveson N G, 2001, *The Role of Software in Recent Aerospace Accidents*, 19th International System Safety Conference, Huntsville AL, USA

Mana P, De Rede J-M and Fowler D, 2007, *Assurance Levels for ATM System Elements: Human, Operational Procedure, and Software*, Proceedings of the 2nd IET International Conference on System Safety, London, UK

Perrin E and Kirwan B, 2009, *Predicting the Future: The Integrated Risk Picture*, Proceedings of the 4th IET International Conference on System Safety Engineering, London, UK

Pierce R and Fowler D, 2010, *Applying IEC 61508 to Air Traffic Management*, Proceedings of the Eighteenth Safety Critical Systems Symposium, Bristol, UK

Rail Safety and Standards Board - RSSB, 2007, *Engineering Safety Management (The Yellow Book), Volumes 1 and 2 - Fundamentals and Guidance*, Issue 4

Reason J, 2000, http://www.bmj.com/cgi/content/full/320/7237/768

RTCA, 1992, DO-178B, *Software Considerations in Airborne Systems and Equipment Certification*

Sandom C, 2002, *Human Factors Considerations for System Safety*, in Components of System Safety, Redmill F and Anderson T [Eds.], Proceedings of the 10th Safety Critical Systems Symposium, 5th-7th February 2002, Southampton, Springer-Verlag, UK

UK CAA, 2011, UK Civil Aviation Authority, CAP 670, *Air Traffic Services Safety Requirements*


### **Life Cycle Cost Considerations for Complex Systems**

 John V. Farr *United States Military Academy USA* 

#### **1. Introduction**


> Because of complexity and technology, the upfront costing of complex systems has become a tremendous challenge. We understand how to cost hardware and, to a lesser extent, software. However, we are still developing tools and processes for costing the integration and interfaces of complex systems. As we scale to larger and more complex systems, systems-of-systems (SoS), and enterprises, our ability to determine costs becomes less relevant and reliable. Our estimates can be off by an order of magnitude. Unfortunately, this is often the result of requirements creep as much as it is our inability to translate requirements to products.

> Cost estimation techniques can be divided into three categories: parametric cost estimates or PCEs, analogies, and detailed engineering builds. Figure 1 shows their applicability throughout a typical product life cycle. We chose to ignore accounting in this chapter. However, capturing expenses in a formal manner is certainly the best way to ascertain costs. Obviously, developing true costing amounts and utilizing good cost management requires good accounting practices and the tracking of expenses using activity based costing techniques. Table 1 summarizes the advantages and disadvantages of these various techniques.

> In this chapter we present some of the methods, processes, tools (MPTs) and other considerations for conducting analysis, estimation and managing the life cycle costs (LCCs) of complex systems.

#### **2. Life cycle considerations**

In today's global business environment, engineers, information technology professionals and practitioners, and other related product development professionals integrate hardware, software, people, and interfaces (i.e., complex systems) to produce economically viable and innovative applications while ensuring that all pieces of the enterprise are working together. No product or service is immune from cost, performance, schedule, quality, and risk tradeoffs. Yet engineers spend most of their formal education focused on performance and most of their professional careers worrying about resources and schedule. Too often we become fixated on the technical performance to meet the customer's expectations without worrying about the downstream costs that contribute to the total LCCs of a system. Unfortunately, in many cases the LCCs or total ownership costs (TOCs) are ignored because either the total costs would make the project untenable (especially for large government projects) or the increased acquisition costs needed to reduce the LCCs would make the project unacceptable.

Fig. 1. Cost estimation techniques throughout the life cycle (modified from NASA, 2008)


| Technique | Description | Advantages | Disadvantages |
|---|---|---|---|
| **Actual Costs / Extrapolation** | Use costs spent during prototyping, hardware engineering development models and early production items to project future costs for the identical system | Could provide detailed estimate; reliance on actual development data; various levels of detail involvement | Requires existing actual production data; development data may not reflect cost correctly; often mistakenly use contract prices to substitute for actual cost |
| **Analogy / Comparative / Case-based Reasoning** | Compare available data from similar projects previously completed and adjust estimates for the proposed project | Saves time; reliance on historical data; less complex than other methods | Subjective/bias may be involved; limited to mature technologies; reliance on a single data point; hard to identify an appropriate analog; software and hardware often do not scale linearly; not always possible to find programs of similar scope and complexity; higher uncertainty |

Table 1. Summary of LCCs estimating techniques (from Young et al., 2010)

We have an extensive array of economic techniques and tools at our disposal to predict and monitor LCCs and schedules, yet overruns are commonplace and in general are the rule and not the exception, especially for large software-enabled systems. Figure 2 shows some of the external and internal factors that we must tackle in conducting cost analysis and that must then be addressed when managing the program in the most effective manner.

Fig. 2. Some of the factors that can affect the cost of a system (modified from Stevens Institute of Technology, 2008)

The specific purposes of utilizing a LCCs perspective in acquisition management, product development, product upgrades, etc., include:

- Estimate the TOCs to the stakeholder,
- Understanding TOCs implications to determine whether to proceed to the next development phase,
- Reduce/capture TOCs through using LCCs tradeoffs in the systems engineering/product development process,
- Assist in day-to-day procurement decisions, and
- Control cost through using LCCs contractual provisions in procurements.

#### **3. Issues surrounding complex systems**

Figure 3 shows cost incurred and the ability to influence LCCs over a typical systems life cycle. The figure clearly shows the importance of upfront systems engineering and managing requirements. Because we do not allocate sufficient resources early in a program/project we often make bad engineering decisions that lead to unplanned downstream costs.

From a LCCs perspective, what is even more critical is that we are developing products and programming and committing funds when we simply do not have the techniques to estimate costs to a high degree of accuracy. The top down tools we use to estimate costs early in the product development cycle are gross rules of thumb at best. When combined with requirements creep, unstable funding, etc., cost estimates of ±100% are to be expected. As shown in Figure 4, many factors can contribute to cost and schedule overruns.

Fig. 3. Costs incurred and committed during our systems life cycle acquisition process (modified from Andrews, 2003)

Fig. 4. Challenges cost estimators typically face (modified from GAO, 2009)

The techniques for estimating systems costs vary depending upon where we are in the life cycle. Taking our seven-phase model of conceptual exploration; component advanced development; systems integration and preliminary design; systems demonstration and test and evaluation; production; and operations support and disposal, different techniques might be used to estimate costs. For example, early in conceptual exploration the only technique that might be satisfactory is some type of parametric cost estimation technique such as the Constructive Systems Engineering Cost Model (COSYSMO), which will be explained later in detail. As we move further into the product development cycle (say at the end of preliminary design), estimating will be conducted using a bottoms up approach/engineering build of the system. Finally, as we enter into production, we will modify our engineering bottoms up model to more accurately reflect the final design elements of hardware, software, and interfaces/integration, and track costs using formal accounting techniques. Table 2 demonstrates that very early in the product development cycle we simply do not know enough about the system to accurately develop costs. Unfortunately this is when budgets are allocated, bids developed, etc. In order for LCCs to become more accurate we must use software and other formal engineering tools sooner in the design.


| Baseline Created | Technical Work Products From Which Estimates Are Developed | Methodologies Used to Develop Cost Estimates |
|---|---|---|
| Customer | Customer Requirements; Capabilities; Characteristics; Concept of Operations or CONOPS | **Top Down**: based upon number/complexity of requirements, scenarios and external interfaces. **Analogous**: estimates based upon complexity of technical work products compared to similar complexity of similar projects. *Estimates are based on experience and historical data with a ±75% accuracy* |
| System | System Requirements; Preliminary Architecture | **Top Down**: based upon number/complexity of requirements and scenarios, technology maturity and architecture complexity. **Analogous**: estimate based on complexity of technical work products against known projects. *Estimates are based on experience and formal design and systems engineering (SE) tools with a ±50% accuracy* |
| Component (HW, SW, Process) | Component Requirements; Hardware (HW) and Software (SW) Systems Architecture Document; All HW, SW, Processes, and Interfaces | **Bottom Up**: estimates based upon architecture, technologies selected, testing plan, etc. *Estimates are based on formal design (Work Breakdown Structure, COCOMO, COSYSMO, Function Point, etc) and SE tools with a ±10% accuracy* |
| Design, Test, and Production | Solution Architecture; Test Architecture; Technical Work Products Delivered | **Bottoms Up**: estimates based upon detailed design, test schedules, implementation details, and other technical work products |
| System Into Production | HW, SW, and Processes Design and Test Strategy; Service Agreements | **Bottoms Up**: *estimates are detailed bottoms up based upon all technical work products* |

Table 2. Cost and schedule estimates as a function of technical baseline work products (modified from Barker, 2008)

#### **4. Hardware, software, systems engineering and management costs**

#### **4.1 Hardware costs**

If we use a hierarchical approach (a system of systems/enterprise is composed of systems, systems are composed of subsystems, and subsystems are composed of components), any of these levels can be the building block of a bottoms-up estimate. In its simplest form, hardware can be separated into the physical components that comprise these building blocks plus the labor, for estimating purposes. We can think of this as levels of our work breakdown structure or WBS. Note that the key when developing LCCs for any component of a system is to correctly develop the WBS, assigning hardware (HW), software (SW), integration, etc., for every phase.

As a first cut and if the WBS is developed correctly, we could use these categories as a way to classify costs. Unfortunately, depending upon where you are in the product life cycle we will need to adjust costs to account for technology maturity which might include readiness levels (Technology Readiness Levels or TRLs, Systems Readiness Levels or SRLs, Integration Readiness Levels or IRLs), learning curve issues, etc. NASA (2011) presents a tutorial on TRLs.

As you transition from a top down cost estimating relationship such as COSYSMO, you could use rough relations to estimate these costs over the product life cycle and refine them as the design becomes more final. The WBS and cost models developed must evolve as you move further down the life cycle.
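As a first-cut illustration of the maturity adjustment just described, the sketch below (Python) rolls up a small, entirely hypothetical WBS and inflates each element's baseline estimate by a factor tied to its assumed TRL; the factor table is illustrative only and is not taken from NASA or any published source.

```python
# Illustrative only: hypothetical WBS elements, costs and TRL-based adjustment
# factors; a real programme would calibrate these factors from historical data.
TRL_ADJUSTMENT = {9: 1.00, 8: 1.05, 7: 1.15, 6: 1.30, 5: 1.50, 4: 1.80}

wbs = [
    # (WBS id, element, baseline estimate in $K, assumed TRL)
    ("1.1", "Airframe structure",   5200, 8),
    ("1.2", "Propulsion",           3100, 6),
    ("1.3", "Avionics software",    2400, 5),
    ("1.4", "Integration and test", 1800, 4),
]

def adjusted_cost(baseline_k, trl):
    """Inflate a baseline estimate to reflect technology immaturity."""
    return baseline_k * TRL_ADJUSTMENT[trl]

total = 0.0
for wbs_id, name, cost_k, trl in wbs:
    adj = adjusted_cost(cost_k, trl)
    total += adj
    print(f"{wbs_id} {name:<22} TRL {trl}: {cost_k:>6.0f} -> {adj:>7.0f} $K")

print(f"Maturity-adjusted total: {total:,.0f} $K")
```

In practice the adjustment factors would be refined, and the WBS deepened, as the design matures and the estimate moves from top down relationships toward a true engineering build.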

#### **4.2 Software**

Software dominates most complex systems. The COnstructive COst Model or COCOMO family of models (see the Center for Systems and Software Engineering, 2011a) is the most widely used set of software estimation tools in industry. Most developers have validated models for translating lines of code into costs. The challenge for estimating software costs is translating requirements to some type of architecture/requirements to lines of code. Without experience in developing the product, software and integration costs are impossible to develop. The GAO (2009) presents a good overview of the challenges and techniques for estimating and costing software.

#### **4.3 Interfaces/Integration at the system level**

No overarching methodology exists for costing the integration of hardware, software, and developing the interfaces. Interfaces/integration challenges are the key reason why the costs of systems scale non-linearly. We know from the DoD, NASA, and other developers of large SoS that we do not know how to estimate their costs. The GAO (2009) found that current major DoD procurements had all experienced significant cost and schedule growth.

#### **4.4 Systems engineering/project management costs**

One area that has received significant attention, because it is often underfunded and has been connected to major cost overruns, is systems engineering and project management (SE/PM). Figure 5 shows some of the SE/PM functions that comprise this category. Stem et al. (2006) reported that the average SE/PM costs for major aircraft programs had increased from 8% of total development costs in the 1960s to about 16% in the 1990s. The SE/PM components are significant to controlling costs, schedule, and quality during product design. However, what are the SE/PM concerns post production? These also are significant for upgrades and supportability issues.

Fig. 5. SE/PM as a function of Integrated Logistics Support (ILS) for a typical Air Force program (from Stem, et al., 2006)

According to Stem et al. (2006) of RAND, there is roughly a 50/50 split between systems engineering and project management costs for most large defense programs. And as shown in Figure 6, these costs can be significant and, depending upon maturity, oversight, complexity, etc., can account for about 20% of the development costs. This figure uses lot numbers across the product line. Unfortunately, COSYSMO only provides a technique for estimating systems engineering cost during the development phase. Research is underway to identify quantitative means for estimating project management costs from a top down perspective (Young et al, 2011). For services based costing (SBC) to evolve this will be needed.

The COSYSMO is a model that can help people reason about the economic implications of systems engineering on projects. Similar to its predecessor, COCOMO II (Center for Systems and Software Engineering, 2011b), it was developed at the University of Southern California as a research project with the help of BAE Systems, General Dynamics, Lockheed Martin, Northrop Grumman, Raytheon, and SAIC. COSYSMO follows a parametric modeling approach used to estimate the quantity of systems engineering labor, in terms of person months, required for the conceptualization, design, test, and deployment of large-scale software and hardware projects. User objectives include the ability to make Proposal estimates, investment decisions, budget planning, project tracking, tradeoffs, risk management, strategy planning, and process improvement measurement (see Valerdi, 2005 and 2006).

Fig. 6. Average systems engineering and project management costs for 22 major Air Force programs (from Stem et al, 2006)

Each parameter in the COSYSMO algorithm is part of a Cost Estimating Relationship (CER) that was defined by systems engineering experts. COSYSMO is typically expressed as (Valerdi, 2005, 2006)

$$\text{PM}\_{\text{NS}} = A \left( \sum\_{k} (\alpha\_{e,k} \Phi\_{e,k} + \alpha\_{n,k} \Phi\_{n,k} + \alpha\_{d,k} \Phi\_{d,k}) \right)^{E} \prod\_{j=1}^{14} \text{EM}\_j \tag{1}$$

where:

- PM<sub>NS</sub> = effort in person-months (nominal schedule)
- A = calibration constant derived from historical project data
- k = {REQ, IF, ALG, SCN}, the four size drivers (requirements, interfaces, algorithms, operational scenarios)
- Φ<sub>e,k</sub>, Φ<sub>n,k</sub>, Φ<sub>d,k</sub> = number of easy, nominal and difficult instances of size driver k
- α<sub>e,k</sub>, α<sub>n,k</sub>, α<sub>d,k</sub> = the corresponding weights
- E = exponent representing economies/diseconomies of scale
- EM<sub>j</sub> = effort multiplier for the j-th of the 14 cost drivers.


The size of the system is the weighted sum of the system requirements (REQ), system interfaces (IF), algorithms (ALG), and operational scenarios (SCN) parameters and represents the additive part of the model while the EM factor is the product of the 14 effort multipliers.
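A minimal sketch of how an Equation 1 style estimate is evaluated is given below (Python). The driver counts, weights, calibration constant, exponent and effort multipliers are placeholders chosen for illustration, not the calibrated COSYSMO values published by Valerdi.

```python
# Sketch of the COSYSMO-style effort calculation in Equation 1.
# All numbers below are illustrative placeholders, not calibrated values.
from math import prod

A = 0.254      # hypothetical calibration constant
E = 1.06       # hypothetical economy/diseconomy-of-scale exponent

# For each size driver k in {REQ, IF, ALG, SCN}: (easy, nominal, difficult)
# counts and the corresponding weights alpha_{e,k}, alpha_{n,k}, alpha_{d,k}.
counts  = {"REQ": (60, 120, 30), "IF": (8, 12, 4), "ALG": (10, 15, 5), "SCN": (6, 10, 4)}
weights = {"REQ": (0.5, 1.0, 5.0), "IF": (1.1, 2.8, 6.3),
           "ALG": (2.2, 4.1, 11.5), "SCN": (6.2, 14.4, 30.0)}

effort_multipliers = [1.0] * 14      # nominal ratings for the 14 cost drivers

# Equivalent size: weighted sum over the four drivers (the additive part).
size = sum(w[0] * c[0] + w[1] * c[1] + w[2] * c[2]
           for w, c in ((weights[k], counts[k]) for k in counts))

pm_ns = A * size ** E * prod(effort_multipliers)
print(f"Equivalent size: {size:.1f}  ->  estimated effort: {pm_ns:.0f} person-months")
```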

Obviously there are some shortcomings to this type of approach that would be inherent in any top down model developed early in the life cycle; they include:

- The model is developed on historical data – unless you have significant experience in that domain the model should not be used; and
- Requirements are difficult to use for estimating in that it is difficult to correlate requirements and effort. COSYSMO does recognize this implicitly by distinguishing between pure and equivalent requirements.


#### **5. Methods and tools**

#### **5.1 Engineering economy**

Engineering economics/economy is a subset of economics for application to engineering projects. Engineering economics uses relatively simple mathematical techniques to make decisions about capital projects by making comparisons of various alternatives. Engineering economy techniques allow for comparisons by accounting for the time value of money. Most engineers are trained in engineering economy and it is the predominant collection of techniques used in support of LCCs analysis of complex systems.
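To make the comparison-of-alternatives point concrete, the short sketch below (Python) performs the kind of present-worth comparison of two alternatives that would normally occupy a few spreadsheet cells; the cash flows and the 8% discount rate are invented for illustration.

```python
# Present-worth comparison of two hypothetical alternatives.
# Cash flows and the 8% discount rate are illustrative only.
def npv(rate, cash_flows):
    """Net present value of end-of-year cash flows, year 0 first."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))

rate = 0.08
alternatives = {
    # year-0 acquisition cost, then ten years of operating/support costs
    "A (cheap to buy)": [-1_000_000] + [-250_000] * 10,
    "B (cheap to own)": [-1_600_000] + [-150_000] * 10,
}

for name, flows in alternatives.items():
    print(f"{name}: life cycle cost (present value) = {-npv(rate, flows):,.0f}")
```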

Spreadsheets have dramatically changed how we conduct economic analysis of alternatives. What once involved manipulation of equations and tables can now be modeled in a spreadsheet using only a few basic commands. The use of spreadsheets is ideal because:

- Most problems involve repetitive calculations that can be expressed as simple formulas as a function of time. Note that Excel has built-in functions for most engineering economy equations.
- Sensitivity analysis is key to conducting good analysis and, by properly designing a spreadsheet, the parameters can be changed and plots easily developed.
- Complex models can be rapidly and easily built and are for the most part self documenting.
- The user can develop professional reports and plots using the functionality in most spreadsheets.

#### **5.2 Simulation based costing**

Systems and enterprises at the most basic level are an integrated composition of elements or sub systems governed by processes that provide a capability to satisfy a stated need or objective. Thus, simulation is an ideal way to analyze these systems. To develop a system or enterprise successfully you must first define the problem that exists, identify the mission requirements (or business drivers) of the organization(s) needing the problem to be solved, evaluate high-level CONOPS for solving the problem, select the concept that makes the most sense in light of the product or mission requirements, develop an operational concept around the selected concept, create architectures and derived requirements for the subsystems, components, and configuration items consistent with the decomposition of the system, design the integration, test and evaluation process for the parts of the system, conduct the integration and test process for the parts of the system, manufacture/assemble the parts of the system, deploy the system, train operators and maintainers, operate/maintain the system, refine the system, and finally retire the system. Simulation can play a key role during each of these phases to assess risk for operational analysis and LCCs. Simulation can be used to prototype the systems, evaluate CONOPS, and used in determining the cost and associated risk.

For simulation based costing (SBC) analysis, constructive simulations are the primary analysis tool. Simulation is important for cost analysis because:

- the system can be prototyped,



Figure 7 demonstrates how simulation can be used throughout the life cycle to assess risk. Note how the distribution of the cost estimate (Y axis) and the inputs (triangles on the X axis) both have less variability as the product/project becomes more mature and defined.

Fig. 7. Cost risk as a function of product life cycle phases
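The behaviour shown in Figure 7 can be reproduced with a very simple Monte Carlo model. The sketch below (Python, standard library only) samples triangular distributions for three cost elements and reports percentiles of the total; the element names and ranges are purely illustrative.

```python
# Minimal simulation-based costing sketch: propagate triangular input
# uncertainties to a total-cost distribution. All ranges are illustrative.
import random
import statistics

# (low, most likely, high) cost per element, in $M
cost_elements = {
    "hardware":    (40, 55, 90),
    "software":    (30, 50, 120),
    "integration": (15, 25, 60),
}

def one_trial():
    return sum(random.triangular(lo, hi, mode)
               for lo, mode, hi in cost_elements.values())

samples = sorted(one_trial() for _ in range(20_000))
p50 = samples[len(samples) // 2]
p80 = samples[int(len(samples) * 0.8)]
print(f"mean={statistics.mean(samples):.1f}  P50={p50:.1f}  P80={p80:.1f} $M")
```

As the design matures the input ranges tighten, so the P50/P80 spread narrows, which is the effect the figure describes.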

#### **5.3 Parametric cost estimation**

The following definitions are used to describe parametric cost estimation (modified from NASA, 2008 and DoD, 1995):


- Parametric cost model - a model built from cost estimating relationships and associated mathematical functions, e.g., cost quantity relationships, inflation factors, staff skills, schedules, etc. Parametric cost models yield product or service costs at designated levels and may provide departmentalized breakdown of generic cost elements. A parametric cost model provides a logical and repeatable relationship between input variables and resultant costs.
- Cost Estimating Relationship or CER - an algorithm relating the cost of an element to physical or functional characteristics of that cost element or a separate cost element; or relating the cost of one cost element to the cost of another element. CERs can be a functional relationship between one variable and another and may represent a statistical relationship between some well-defined program element and some specific cost, etc. Many costs can be related to other costs or non-cost variables in some fashion but not all such relationships can be turned into CERs.

PCEs utilize CERs and associated mathematical algorithms, logic, and processes to establish cost estimates, and are probably the most widely used tool to capture experience. Figure 8 shows a process that can be used for developing CERs for PCEs. Like any mathematically based process, it should only be used for the range described by the "relationship" data.

Fig. 8. Process for determining parametric cost estimates (modified from DoD, 1995): data collection; data evaluation and normalization; selection of variables; regression and curve fit (e.g. C = aX, C = aX<sup>b</sup>, C = aX + b, C = aX + bY); data analysis and correlation; test relationships; select CER(s); validation, verification, certification; CER database.
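The regression and curve-fit step of that process is often just a least-squares fit of one of the candidate CER forms. The sketch below (Python) fits the power-law form C = aX^b by ordinary least squares in log space, using a small invented data set; as noted above, the resulting CER should only be applied within the range of that data.

```python
# Fit the CER form C = a * X**b by least squares on log-transformed data.
# The (X, C) pairs are invented, e.g. X = weight and C = cost in $K.
import math

data = [(120, 900), (250, 1_700), (400, 2_600), (650, 3_900), (900, 5_200)]

xs = [math.log(x) for x, _ in data]
ys = [math.log(c) for _, c in data]
n = len(data)
mean_x, mean_y = sum(xs) / n, sum(ys) / n

b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
    sum((x - mean_x) ** 2 for x in xs)
a = math.exp(mean_y - b * mean_x)

print(f"CER: C = {a:.2f} * X^{b:.3f}")
print(f"Prediction for X=500: {a * 500 ** b:,.0f} $K")  # stay inside the data range
```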

The techniques used in estimating software are much more mature than those for systems. At best the tools commonly used are estimates and analogies and have little mathematical basis. Whether purely services-centric or a physical system, most products now have a significant software element. The methodology for estimating software has been around for over 30 years and can be classed as a PCEs tool. However, because of new languages, hardware/software integration challenges, computer aided software tools, etc., techniques/algorithms must be continually updated. Software estimating is still dominated by experience supplemented with quantitative techniques. NASA (2002) has an online handbook describing in-depth parametric cost estimating.

#### **5.4 Analogy**


> Analogy estimates are performed on the basis of comparison and extrapolation using like items or efforts. In many instances this can be accomplished using simple relationships or equations representative of detailed engineering builds of past projects. Obviously, this is the preferred means to conduct a cost estimate based upon past programs that is technically representative of the program to be estimated. Cost data is then subjectively adjusted upward or downward, depending upon whether the subject system is felt to be more or less complex than the analogous program (from NASA, 2008).
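In its simplest form the adjustment described above is a reference cost scaled by subjective factors, as in the minimal sketch below (Python); the reference cost, complexity factor and escalation factor are entirely hypothetical.

```python
# Analogy estimate: scale a completed programme's cost by subjective adjustments.
# All figures are hypothetical.
reference_cost_m = 240.0   # actual cost of the analogous programme, $M
complexity_factor = 1.25   # subject system judged 25% more complex
escalation_factor = 1.10   # price escalation between the two programmes

estimate_m = reference_cost_m * complexity_factor * escalation_factor
print(f"Analogy-based estimate: {estimate_m:.0f} $M")
```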

#### **5.5 Engineering build or bottom up methodology**

The engineering build or bottom up methodology rolls up individual estimates for each element/item/component into the overall cost estimate. This can be accomplished at the WBS element or at the component level. This costing methodology involves the computation of the cost of a WBS element by estimating at the lowest level of detail and computing quantities and levels of effort to determine the total system cost. Obviously, this is the most accurate means to develop a cost estimate. The challenge is that early in the systems development a bottoms-up approach cannot be utilized because the system has not been fully designed. Ideally, you would like to take bottom-up estimates and scale them based upon experience. In order to improve our cost estimates we must conduct bottoms-up estimating sooner in the product life cycle. This requires good systems engineering to translate requirements to a physical architecture.
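A bottoms-up estimate is essentially a recursive roll-up of the WBS. The sketch below (Python) sums leaf-level labour and material estimates up through a toy WBS; the structure, hours, rates and material costs are invented for illustration.

```python
# Bottoms-up (engineering build) estimate: recursively roll up a hypothetical WBS.
# Leaf nodes carry (labour_hours, labour_rate, material_cost); parents hold children.
WBS = {
    "1 Air vehicle": {
        "1.1 Structure": (12_000, 110, 850_000),
        "1.2 Avionics": {
            "1.2.1 Flight software": (30_000, 130, 0),
            "1.2.2 Sensors":         (6_000, 120, 1_200_000),
        },
    },
    "2 Systems engineering / programme management": (15_000, 125, 0),
}

def rollup(node):
    """Return the total cost of a WBS node (leaf tuple or dict of children)."""
    if isinstance(node, tuple):
        hours, rate, material = node
        return hours * rate + material
    return sum(rollup(child) for child in node.values())

total = sum(rollup(node) for node in WBS.values())
print(f"Bottoms-up estimate: ${total:,.0f}")
```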

#### **6. From requirements to architectures**

From a set of system requirements or a CONOPS, a functional description is developed in which the system level requirements or "whats" are translated to "hows" using tools such as functional block diagrams. This functional hierarchy process and its interdependencies are shown in Figure 9. The functional description provides the basis for either a physical architecture or a WBS.

#### **7. Costing software**

Almost every aspect of our modern society is controlled by software. You can look no further than the defense industry to see how dramatic and pervasive software has become. Consider the following military examples:

- the F4 fighter had no digital computer and software (Early 70's),
- the F16A fighter had 50 digital processors and 135 thousand lines of code or KLOC (Late 70's),
- the F16D fighter had 300 digital processors and 236 KLOC (Late 80's),
- the B-2 bomber has over 200 digital processors and 5,000 KLOC (Late 90's), and
- the US Army's Future Combat Systems (FCS) will have over 16,000 to 50,000 KLOC (Late 00's).

Fig. 9. Role of functional and physical views of a system (from Stevens Institute of Technology, 2009)

Software requirements growth (the percentage of functionality provided by software) has gone from less than 10% in the 1980s to 80% in our current world (National Research Council, 2008).

Software is also redefining the consumer's world. Microprocessors embedded in today's automobiles require software to run, permitting major improvements in their performance, safety, reliability, maintainability, and fuel economy. According to Elektrobit (2007), today's high-end automobiles contain up to 70 electronic control units that control the vehicle's major functions. The average car in 1990 had one million lines of code; by 2010, the average car is expected to have up to 100 million lines of code, with software and electronics contributing to over one-third of the cost of a car. New devices in the consumer electronics sector have dramatically changed how we play and manage music, conduct personal computing, and manage our daily activities. As software becomes more deeply embedded in most goods and services, creating reliable and robust software is becoming an even more important challenge. Despite the pervasive use of software, and partly because of its relative immaturity especially with regards to integrating complex hardware and software applications, understanding the economics of software presents an extraordinary challenge.


Engineers typically know how to estimate hardware – we can simply count up the components. However, software and integration/interfaces continue to be the challenge in costing complex systems. Thus, we wrote this chapter to expose readers to the myriad of methods to estimate software. As you will see, historical analysis dominates software cost estimation.

Probably the most important tool in developing a software (or any) cost estimate is to develop some type of functional representation to capture all elements in the life cycle. This includes (modified from DoD, 2005):

A product-oriented family tree composed of hardware, software, services, data, and facilities. The family tree results from systems engineering efforts during the acquisition of a defense materiel item.

A WBS displays and defines the product, or products, to be developed and/or produced. It relates the elements of work to be accomplished to each other and to the end product. A WBS can be expressed down to any level of interest. However the top three levels are as far as any program or contract need go unless the items identified are high cost or high risk. Then, and only then, is it important to take the work breakdown structure to a lower level of definition.

Most models are a mix of expertise-based and hybrid because of the subjective nature of many of the inputs and algorithms. Expertise is nothing more than subjective human estimating combined with some simple heuristics. One large defense contractor uses expertise and an algorithm to estimate software costs:


This is one example of an experience-based algorithm combined with a mathematical model to produce a hybrid technique. Most companies use "rules of thumb" with hybrid techniques to estimate software development costs.

The original COCOMO is an algorithm-based model developed by Boehm (1981) and is used to predict the effort and schedule for a software product development. The model is based on inputs relating to the size of the software and a number of cost drivers that affect productivity, and drew on a study of about sixty projects with software ranging in size from 2,000 to 100,000 lines of code. Most companies even today use a modified version of one of the COCOMO family of models to estimate software development times and efforts.

The original COCOMO consists of a hierarchy of three increasingly detailed versions (modified from NASA, 2008):


- Basic COCOMO computes software development effort (and cost) as a function of program size, expressed in estimated thousands of delivered source instructions (KDSI).
- Intermediate COCOMO computes software development effort as a function of program size and a set of cost drivers that include subjective assessments of product, hardware, personnel, and project attributes.
- Detailed COCOMO incorporates all characteristics of the intermediate version with an assessment of the cost driver's impact on each step (analysis, design, etc.) of the software engineering process.

The basic COCOMO, which is also referred to as COCOMO 81 (Boehm, 1981), is a static model that utilizes a non-linear single valued input equation to compute software development effort (and cost) as a function of software program size. The main input into the model is estimated KDSI. The model takes the form:

$$E = aS^{b} \tag{2}$$

where:

- E = effort in person-months,
- S = size of the software development in KDSI, and
- a, b = values dependent on the development mode.
Note that COSYSMO and the other COCOMO-based models all use this type of exponential model. Typically they all follow the form presented in Equation 2, with additional multiplicative factors.
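The sketch below (Python) shows the basic COCOMO form of Equation 2. The (a, b) pairs are the commonly quoted COCOMO 81 coefficients for the three development modes; treat them as indicative only, since any practical use requires calibration against local historical data.

```python
# Basic COCOMO 81: effort E = a * S**b, with S in KDSI.
# (a, b) below are the commonly quoted COCOMO 81 values per development mode;
# serious use would recalibrate them against local historical data.
MODES = {
    "organic":       (2.4, 1.05),
    "semi-detached": (3.0, 1.12),
    "embedded":      (3.6, 1.20),
}

def basic_cocomo_effort(kdsi, mode="organic"):
    """Return estimated effort in person-months for a program of `kdsi` KDSI."""
    a, b = MODES[mode]
    return a * kdsi ** b

for mode in MODES:
    print(f"{mode:>13}: 50 KDSI -> {basic_cocomo_effort(50, mode):6.1f} person-months")
```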

#### **8. Cost management**

#### **8.1 Introduction**

Engineering cost management can be defined as the process to identify, allocate, and track resources needed to meet the stakeholder's requirements. An integrated, process-centered approach, backed with quantifiable data and documented processes, provides real and tangible benefits to all stakeholders. Engineering cost management can best be described as an integrated, process-centered, measurable, and disciplined approach to LCCs and management to make the tradeoffs between cost, performance, schedule, and risk. Good cost management practices, supported by sound analysis, can lead to (modified from NASA, 2008):


- Complete, unambiguous, and documented functional requirements in order to meet LCCs goals;
- Bounded and clearly defined product functional expectations and acceptance criteria, understood and agreed to by all stakeholders;
- More accurate, credible, and defensible scope, cost, and schedule estimates with realistic assessments of risk;
- More complete and timely risk identification, leading to more effective risk mitigation;
- A basis for properly quantifying, evaluating, and controlling the acceptance and timing of changes to requirements (i.e., precluding "scope creep");
- Final products that deliver better reliability, adaptability, usability, performance, maintainability, supportability, and functionality -- in short, higher quality and value;
- Insight into near, mid and long term technology, design, infrastructure and operational investment needs as they relate to different effects on the phases and trade-offs within the life-cycle;
- Earlier and more consistent visibility to problems (fewer surprises);
- Understanding the costs for each step in the development process;
- More efficient project management; and
- Organizational credibility and reputation.
Engineers play a critical role in corporate or business planning. Engineers are involved in cost management from top-level corporate planning to costing components and subsystems. All require the same basic understanding of the time value of money, risk, and a life cycle perspective.

Engineering cost management is employed as a means of balancing a project's scope and expectations of risk, quality, and technical performance to ensure that the most cost-effective solution is delivered, and consists of three steps:


The ability to use analysis techniques such as those discussed allows an engineer to conduct defensible and rigorous analysis that can not only provide representative costs but can also help scope a technical problem.

One important technique to help manage costs is cost as an independent variable (CAIV). Though used mainly by government, its underlying principles have utility in the commercial sector. Managing the costs of open-source and off-the-shelf technology presents a unique costing challenge because integration, not development, is the key cost driver, a complexity magnified by the amount of software in most modern systems. Lastly, formal tracking using project management techniques can be used to estimate, track, and manage costs; this is beyond the scope of this chapter but is an important and commonly used means of managing costs.

#### **8.2 Cost as an independent variable**

Cost as an Independent Variable (CAIV) is a formal methodology for reducing TOCs while maintaining performance and schedule objectives. It involves developing, setting, and refining cost objectives in a systematic method while meeting owner/user requirements. CAIV entails setting aggressive, realistic cost objectives for acquiring systems and managing program risks to obtain those objectives. Cost objectives must balance against market and budget realities with projected out-year resources, taking into account existing technologies as well as the high-confidence maturation of new technologies (from Kaye, et al, 2000). In essence the CAIV concept means that, once the system performance and objective costs are decided (on the basis of cost-performance trade-offs), then the acquisition process will make cost more of a constraint, and less of a variable, while obtaining the needed capability of the system. Figure 10 shows this graphically.

CAIV is founded upon two primary principles. First, LCCs are constrained. Unfortunately, this is all too often limited to development and production costs. Whereas some programs do obtain additional funding when needed, such funding is often at the expense of other business units, programs, or future modernization. Second, "trade space" is the foundation for smart decisions. Trade space is the range of alternatives available to the buyers. It is four-dimensional, comprising performance, TOCs, schedule, and risk impacts (from Kaye, et al., 2000). Many of the methods presented such as SBC can be used for this trade space analysis.

Fig. 10. CAIV representation (modified from Kaye, et al., 2000)
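To make the notion of a four-dimensional trade space with cost as a constraint more concrete, here is a small, purely hypothetical Python sketch: candidate solutions (invented data) are first screened against a cost ceiling and only then ranked on performance, schedule, and risk; the weighting is illustrative only and is not a prescribed CAIV procedure.

```python
# Hypothetical CAIV-style screening: cost acts as a constraint, and the
# remaining trade space (performance, schedule, risk) is ranked afterwards.
# Candidate data and weights are invented for illustration only.
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    total_ownership_cost: float  # e.g. $M over the life cycle
    performance: float           # higher is better (normalized 0-1)
    schedule_months: float       # lower is better
    risk: float                  # lower is better (normalized 0-1)

def caiv_screen(candidates, cost_ceiling):
    """Keep only candidates under the cost ceiling, then rank the survivors."""
    affordable = [c for c in candidates if c.total_ownership_cost <= cost_ceiling]
    # Simple illustrative ranking: reward performance, penalize schedule and risk.
    return sorted(affordable,
                  key=lambda c: c.performance - 0.01 * c.schedule_months - c.risk,
                  reverse=True)

if __name__ == "__main__":
    options = [
        Candidate("A", 120.0, 0.90, 48, 0.40),
        Candidate("B",  95.0, 0.75, 36, 0.20),
        Candidate("C",  80.0, 0.70, 30, 0.15),
    ]
    for c in caiv_screen(options, cost_ceiling=100.0):
        print(c.name, c.total_ownership_cost, c.performance)
```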

#### **8.3 Formal cost accounting**

Cost accounting is obviously the best way to track costs. The emergence of activity-based costing techniques has made the engineer's job easier when trying to ascertain true costs. Activity Based Costing (ABC) tracks costs—both direct and indirect—to their source. While traditional accounting practices have concentrated on evaluating inventory for asset-based reporting, ABC links the resources consumed to the activities performed and then links these activities directly to their products. As a result, ABC provides a basis for strategic product and service pricing by capturing the direct relationships between costs, activities, and products. This is particularly useful when the primary cost factors are directly traceable to individual products or traditional direct costs. Most costs in industrial companies today are indirect; when indirect costs are uniformly allocated across products, the result is invalid management support information. This is particularly true in a service organization—commercial or government—where an attempt to use traditional inventory accounting techniques for management support will inevitably lead to inappropriate decisions.
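The contrast between uniform overhead allocation and ABC can be shown with a deliberately simplified, hypothetical calculation; the products, cost pools, and driver counts in the Python sketch below are invented solely to illustrate how the two approaches can assign very different indirect costs to the same products.

```python
# Hypothetical comparison of uniform overhead allocation vs. activity-based
# costing (ABC). All figures and driver counts are invented for illustration.

indirect_pools = {              # overhead cost pools and their activity drivers
    "machine setups": (60_000, {"ProductX": 30, "ProductY": 10}),   # driver: number of setups
    "inspections":    (40_000, {"ProductX": 50, "ProductY": 150}),  # driver: number of inspections
}
units = {"ProductX": 1_000, "ProductY": 1_000}

# Uniform allocation: spread total overhead evenly over units produced.
total_overhead = sum(cost for cost, _ in indirect_pools.values())
uniform = {p: total_overhead * units[p] / sum(units.values()) for p in units}

# ABC allocation: each pool is traced to products via its own driver counts.
abc = {p: 0.0 for p in units}
for pool_cost, drivers in indirect_pools.values():
    total_driver = sum(drivers.values())
    for product, count in drivers.items():
        abc[product] += pool_cost * count / total_driver

for p in units:
    print(f"{p}: uniform {uniform[p]:>9,.0f}  vs  ABC {abc[p]:>9,.0f}")
```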

All engineers need to understand the basics of cost accounting. As systems become more complex, the role of the engineer has diminished in terms of developing detailed proposals. Most engineers now develop LCCs estimates for the system. Unless you are working at the senior management level, you do not need an in-depth accounting background.

#### **9. Summary**

Costing systems is complex and consists of a variety of techniques, including analogies, PCEs, and detailed bottom-up modeling. Unlike the mature knowledge encompassed by the traditional engineering disciplines, the techniques and tools for costing and managing complex systems are rapidly evolving and being driven mainly by the commercial sector. Also, the MPTs and techniques are often not presented in the open literature because of the competitive advantage afforded any company that can accurately estimate the LCCs of a product. Thus much of the MPTs presented were gleaned from government sources, especially the DoD and the National Aeronautics and Space Administration. Fortunately, the DoD and NASA are in many ways the intellectual thought leaders on costing and estimating of complex systems because of the sheer size and complexity of their projects/programs. There is probably no one-size-fits-all collection of MPTs, and certainly no substitute for experience, that is repeatable for LCCs estimation. However, much research, especially for techniques applicable early in the life cycle, is needed to better ascertain true LCCs.

Good engineers follow a disciplined and structured approach when developing a product/system. Costing hardware, software, and integration requires an understanding of many MPTs and terminology in which few engineers have received formal training. Once technical characteristics have been ascertained from the requirements, selecting the right MPTs is critical to accurately determining costs early in the development cycle and estimating realistic LCCs.

In the evaluation and reengineering of existing systems, the functional analysis serves as a basis for developing a WBS or CBS, leading to the collection of costs by functional area. Unfortunately, by the time you can develop architectures/WBS you have a well-understood system suitable for realistic cost estimates, which is often long after a budget has been established.

#### **10. References**


Andrews, Richard, "An Overview of Acquisition Logistics," Fort Belvoir, VA: Defense Acquisition University, 2003, available online at https://acc.dau.mil/CommunityBrowser.aspx?id=32720, accessed on April 2, 2007

Barker, Bruce, personal note, April, 2008

MIL-HDBK-881A, …/WebHelp3/MIL-HDBK-881A%20FOR%20PUBLICATION%20FINAL%2009AUG05.pdf, accessed on 30 July, 2005

Government Accounting Office (GAO), "Cost Estimating and Assessment Guide: Best Practices for Developing and Managing Capital Program Costs," GAO-09-3SP, March 2009, available online at http://www.gao.gov/new.items/d093sp.pdf, accessed 19 August 2011

IBM, "Software Estimation, Enterprise-Wide, Part I: Reasons and Means," available online at http://www.ibm.com/developerworks/rational/library/jun07/temnenco/index.html, accessed November 18, 2008

Kaye, M. A., Sobota, M. S., Graham, D. R., and Gotwald, A. L., "Cost as an Independent Variable: Principles and Implementation," Acquisition Review Quarterly, Fall, 2000, available online at http://www.dau.mil/pubs/arq/2000arq/kaye.pdf, accessed January 2008

National Aeronautics and Space Administration, "Cost Estimating Handbook," available online at www.nasa.gov/ceh_2008/2008.htm, accessed 28 January 2010

National Aeronautics and Space Administration, "Parametric Cost Estimating Handbook," 2009, available online at http://cost.jsc.nasa.gov/PCEHHTML/pceh.htm, accessed 19 August 2011

National Aeronautics and Space Administration, "Technology Readiness Levels Demystified," 20 August 2010, available online at http://www.nasa.gov/topics/aeronautics/features/trl_demystified.html, accessed 19 August 2011

National Research Council of the National Academies, "Pre-Milestone A Systems Engineering - A Retrospective Review and Benefits for Future Air Force Systems Acquisition," Air Force Studies Board, 2008

Stem, David E., Boito, Michael and Younossi, Obaid, "Systems Engineering and Program Management - Trends and Costs for Aircraft and Guided Weapons Programs," ISBN 0-8330-3872-9, Published by the RAND Corporation, Santa Monica, CA, 2006

Stevens Institute of Technology, "SYS 625 Fundamentals of Systems Engineering Class Notes," 2008

Stevens Institute of Technology, "SYS 650 System Architecture and Design," Course Notes, 2009

Valerdi, Ricardo, "The Constructive Systems Engineering Costing Model (COSYSMO)," A Dissertation Submitted in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy, University of Southern California, August 2005, available online at http://csse.usc.edu/csse/TECHRPTS/PhD_Dissertations/files/Valerdi_Dissertation.pdf, accessed 19 August 2011

Valerdi, Ricardo, "Academic COSYSMO User Manual - A Practical Guide for Industry and Government," Version 1.1, MIT Lean Aerospace Initiative, September 2006

Young, Leone Z., Farr, John V., Valerdi, Ricardo, and Kwak, Young Hoon, "A Framework for Evaluating Life Cycle Project Management Costs on Systems Centric Projects," 31st Annual American Society of Engineering Management Conference, Rogers, AK, October, 2010

Young, Leone Z., Wade, Jon, Farr, John V., Valerdi, Ricardo, and Kwak, Young Hoon, "An Approach to Estimate the Life Cycle Cost and Effort of Project Management for Systems Centric Projects," International Society of Parametric Analysts (ISPA) and the Society of Cost Estimating and Analysis (SCEA) 2011 ISPA/SCEA Joint Annual Conference and Training Workshop, Albuquerque, New Mexico, June 7 - 10, 2011

### **Integrated Product Service Engineering – Factors Influencing Environmental Performance**

Sofia Lingegård, Tomohiko Sakao\* and Mattias Lindahl
*Department of Management and Engineering, Linköping University, Sweden*

#### **1. Introduction**


In society today there is increased awareness about environmental problems, e.g. climate change and pollution. This, in combination with concern about future shortages of natural resources, has resulted in increased pressure to find innovative strategies that can tackle these problems. Simply put, the main reasons for these problems are tied to society's use of products, and in general caused by:


Clearly, strategies for tackling these problems need to be investigated. During the last two decades, industry and academia have proposed and tried to implement several strategies and solutions. From academia, these include Functional Economy (Stahel 1994) and the Integrated Product Service Engineering (IPSE) concept, also often called Product/Service Systems (PSS) (e.g. (Mont 2002; Tukker and Tischner 2006; Sakao and Lindahl 2009)). PSS is defined, for instance, as "a marketable set of products and services capable of jointly fulfilling a user's needs" (Goedkoop, van Halen et al. 1999). Service in this chapter includes operation, maintenance, repair, upgrade, take-back, and consultation. In addition to this definition, other authors (Tukker and Tischner 2006) regard PSS as a value proposition, one including its network and infrastructure. Another concept, named Total Care Products (Functional Products), has been developed as well with some connection to PSS. It comprises "combinations of hardware and support services". The economically efficient functioning of this concept should be achieved by the proposition of an "intimate business relationship" between the service provider and the customer. As a result, both the provider and the customer obtain benefits through sharing existing business risks (Alonso-Rasgado, Thompson et al. 2004; Alonso-Rasgado and Thompson 2006). Furthermore, the proposal of a "life cycle-oriented design" (Aurich, Fuchs et al. 2006) highlights an important step for the

<sup>\*</sup> Corresponding Author

"product and technical service design processes" integration. It is also interesting that Aurich et al. address designing products and services based on life cycle thinking. Furthermore, some specific engineering procedures and computer tools have been developed and validated with industrial cases (e.g. (Sakao and Shimomura 2007; Sakao, Birkhofer et al. 2009; Sakao, Shimomura et al. 2009)).

However, the research in this area is still in its infancy and a number of questions remain unanswered. Specifically, a general weakness in existing literature is that even though a large number of authors have stressed PSS' environmental and economic potential (e.g. (Roy 2000; Mont, Singhal et al. 2006)), very few studies have proved PSS' potential for changing environmental performance.

In the manufacturing industry, the trend of servicizing has been evident regardless of the environmental concern or the academic debate (e.g. (Sakao, Napolitano et al. 2008)). In much of the manufacturing industry today, numerous companies' business offerings are a combination of physical products and services. In fact, over 50% of the companies in the USA and Finland provide both physical products and services (Neely 2007). Some manufacturing firms are even strategically shifting from being a "product seller" towards becoming a "service provider" (Oliva and Kallenberg 2003). Namely, the industry possesses a driver for service integration, something which should be seen as an interesting opportunity for academia (Isaksson, Larsson et al. 2009).

As explained above, PSS is a likely solution for environmental problems from the theoretical and practical viewpoints. However, little is known scientifically about PSS' impact on environmental performance. It is the research community who should respond to this lack of knowledge, and this is the overall subject of this chapter.

There are two main questions to consider. One is under which conditions PSS is a suitable offering, since it is a prerequisite for PSS to work in business practice in order to realize its influence on environmental performance. In general, PSS approaches seem to work well if any of the following conditions apply (Tukker and Tischner 2006):

- complex products that require special competencies to design, operate, manage and/or maintain;
- products with high costs to operate and/or maintain;
- products with considerable consequences or costs if not used correctly or appropriately;
- products where operational failure or downtime is not tolerated;
- products with long life; or
- products with only a few major customers on the market.

In addition, recent research has reported on characteristics of products suitable for PSS. For instance, (Lay, Copani et al. 2010) argue that the innovativeness of products has positive influences on the integration of product and service. Theoretical investigation has also begun: for instance, property rights (Furubotn and Pejovich 1972) have gained attention as a key for PSS to be meaningful (Hockerts 2008; Dill, Birkhofer et al. 2011). Yet all this literature is insufficient, especially from a scientific viewpoint.

The other main question is which PSS factors influence the environmental performance in comparison with traditional product-sales type business. (Tukker 2004) is one of very few who have attempted to analyze the relation between PSS types and their influence on environmental impact, yet he fails to present a thorough background and reasons.

In sum, thus far there has been growing interest in PSS. Among other things, there has been relatively more work with the analytical approach (e.g. (Mont 2002)), and less work with PSS synthesis (e.g. (Sakao and Lindahl 2009)). Even with relatively more work available on analysis, analysis remains to be conducted on the PSS factors that make PSS meaningful as a business and influential on environmental impacts. PSS, with its certain level of complexity, is believed to be a good example of areas where Systems Engineering (Lindemann 2011) can contribute.

#### **2. Objective and method**


This chapter endeavours to lead the scientific discussion regarding which IPSE factors are expected to, in theory, lower the environmental impact of a life cycle compared to a traditional product sales business. To do so, the IPSE concept is introduced, first with an emphasis on engineering processes rather than an object such as PSS. In the following sections, four aspects from theory will be discussed: product development, information asymmetry, economies of scale, and risk. These sections discuss how environmental impacts are influenced from a product life cycle perspective, and highlight crucial factors theoretically. They are followed by an overall discussion and an examination of some promising future work. The chapter provides the research community with a first theoretical cornerstone regarding environmental performance by IPSE. To practitioners, it will be an eye opener for how they engineer.

#### **3. Redefining IPSE**

Our research group at Linköping University and KTH (The Royal Institute of Technology) in Sweden has developed what is termed Integrated Product Service Engineering (IPSE) (Lindahl, Sundin et al. 2006). IPSE has the following characteristics in relation to other existing concepts. First, and in common with PSS, IPSE looks at combinations of products and services. Second, IPSE is a type of engineering, which is different from PSS per se. In addition, it attempts holistic optimization from the environmental and economic perspectives throughout the life cycle. Third, IPSE consists not only of design as the most influential activity, but possibly other engineering activities such as maintenance, upgrade, remanufacturing, etc. Therefore, IPSE has to deal with the time dimension of the life cycle. Figure 1 depicts different interesting processes for IPSE, obviously showing various disciplines and different aspects to be addressed.

This section reveals additional characteristics of IPSE. An IPSO (Integrated Product Service Offering) is an offering that consists of a combination of products and services that, based on a life cycle perspective, have been integrated to fit targeted customer needs. Further, IPSO means that products and services have been developed in parallel and are mutually adapted to operate well together. This contrasts with the traditional product sale, where the provider transfers control and responsibility to the customer at the point of sales. An IPSO often creates close contact between the supplier and customer, leading e.g. to offers being customized and improved to better suit the customer. In many cases, the service provider retains responsibility for the physical products in the IPSO during the use phase. One example is when a client does not own the machines installed by the supplier, but only uses them and pays for the manufactured volumes; then, when the customer does not need them anymore, the supplier takes back the machines. Such cases increase the provider's interest to ensure that the customer uses machines installed as long as possible and that any disturbances, such as the need for repairs, are reduced. The increased responsibility by the IPSO supplier also potentially facilitates improvements identified and implemented in comparison to traditional sales. This could lead to a product lifetime extension.

Note: IPSO; Integrated Product Service Offering. EOL; end of life.

Fig. 1. Processes of IPSE's interest (Sakao, Berggren et al. 2011)

Based on (Sakao 2009), IPSE is explained in comparison to Ecodesign (environmentally conscious design) due to some commonality with Figure 2 (a) and (b), where different types of engineering activities are put on the identical graph. The graph depicts the environmental impact of a certain type of product with high impact from its usage phase, which holds true in many cases. The horizontal axis represents the time dimension on the life cycle. Bars represent the environmental impact from each phase such as production and usage (scaled with the left vertical axis). A dotted line represents the accumulated influence of the activity at each phase of the life cycle's environmental impact. It is shown that the design phase has by far the highest ratio (some 80%), which is generally known.

As seen by the dotted line, Ecodesign is obviously crucial, since it is the design activity with the dominant influence. However, is Ecodesign sufficient? The answer is no, since it leaves out control after the design phase. This is why IPSE is more effective, including the possible employment of other engineering activities such as maintenance. Naturally, company management must be committed if they are to carry out IPSE. IPSE includes a business issue, e.g. how to sell services.

What characteristics of IPSE are to be paid particular attention to in this chapter? The first is its length on the time dimension. It can be as long as 20 - 30 years in the case of an investment machine (e.g. aircraft engine) or facility (e.g. railway). Therefore, IPSE has to address much of this dimension with the fact that the earlier a certain action is taken the more effective its outcome is in general. It is actually realized by effective design. Thus, design is naturally a core of IPSE.

Then, what is design? A seminal work by (Pahl and Beitz 1996) states "design is an engineering activity that … provides the prerequisites for the physical realization of solution ideas" (originally in (Martyrer 1960)). It has a lot to do with the processing of information – information about needs and wants from stakeholders and through the product life cycle, as well as about function and structure of the product. Effective processing of information plays a central role in IPSE – this is the second characteristic.

Note: The environmental impact (shown by bars) is a rough estimation of active products. EOL and LC stand for end-of-life and life cycle, respectively.

Fig. 2. Comparison of IPSE and other activities.

Then, design of what? This is the next relevant question as discussed in (Cantamessa 2011), which points out that an artefact, i.e. an object to be designed, is today "integrated and systemic product-services linked in a high-level user experience". Also, acknowledging that co-creation of value by a provider and a customer/user is a strong idea behind servicizing (see e.g. (Vargo and Lusch 2004)), a provider cannot avoid the influence of its customer/user in creating the intended value. Thus, a provider can design something contributing to its value, but cannot design the value itself. This means that control of risks of the value creation process is crucial. Thus, this risk is the third characteristic.

In sum, IPSE can be defined as an engineering activity controlling risks of value creation through dealing with information originating from a wide window on the time dimension. These three characteristics are discussed in the following sections with their relevant theories: the time dimension and design with the theory of product development, information processing with the theory of information asymmetry, and risk. In addition to these, economies of scale are also discussed, since they are vital to business activities in general.

#### **4. Product development**

According to ENDREA† (ENDREA 2001), product development is defined as: "all activities in a company aiming at bringing a new product to the market. It normally involves design, marketing and manufacturing functions in the company". A product can in this context be both physical and non-physical. As is well known, when developing new products, designers typically follow a general procedure (sequence of activities), a so-called product development model. A product development model normally involves design, marketing and manufacturing activities. The current business model for many products, to get the customer to buy the product, implies that the focus is normally on cutting down the cost for manufacturing the product and delivering it to the customer. This is done in order to get a price that is accepted by the customer. It also implies that little focus is placed on later phases of the product's life cycle, e.g. the use phase (with activities such as use of energy and consumables, service and maintenance, and upgrading) and end-of-life. At the same time, life cycle cost studies and life cycle assessments have shown that for many products, it is during the use-phase (in reality often the longest phase of a product's life) and its related activities where the major costs and environmental impact for the product occur. Figure 2 shows, in a basic way (different products have different profiles), the environmental impact accumulation over the product's life cycle.

When developing IPSO, the basic principle is to consider all life cycle phases in order to optimize the offering from a life cycle perspective. The idea is to get the lowest total cost for the offering possible, not only the lowest cost for the product. This generates new conditions for the product development. Since the focus is expanded to cover more life cycle phases, e.g. the use phase, it implies that the number of potential offering solutions increases, which is good from an optimizing perspective. At the same time, costs are often associated with the use of materials and energy, which in turn provides a negative environmental impact, implying that more cost-optimized products usually have less environmental impact.

<sup>†</sup> Engineering Research and Education Agenda (ENDREA). ENDREA was a joint effort between four of the major Swedish institutes of technology: Chalmers University of Technology in Göteborg, the Royal Institute of Technology in Stockholm, Linköping Institute of Technology in Linköping and Luleå University of Technology in Luleå. Funding came from the Swedish board for strategic research, SSF, industry and the participating universities. The main idea behind ENDREA was to create a national cooperation in creating a new type of research in the engineering design area.

Figure 2 also illustrates the different phases' impact on the total environmental impact and how important the design phase is, especially the early part of it. This is at the same time logical, since it is in the early phases of product development that the product specification is defined, i.e. what parameters must/should be focused on. Examples of parameters are: how it will be used; how long it will work; what type of power it will use; what type and amount of consumables will be used during the normal use phase; what spare parts will be needed; and what is the lifetime of the product. Today, many companies' main concern in their product specifications is how to optimize and improve the production of their products, and how to develop products that are not too durable. This is important, since the predominant way of earning money is by selling products to customers.

At the same time, the initial product specification sets up boundaries for potential actions in the later phases. This is a well-known fact for people working with product development, often referred to as the "design paradox". When a new design project starts, very little is known about the final product, especially if the product is a new one for the designers. As the work on the product progresses, knowledge is increased. At the same time, the scope of freedom of action decreases for every product decision step taken, since time and cost drive most projects. Costs for later changes increase rapidly, since earlier work must be redone (Ullman 2002). The paradox is that when the general design information is needed, it is not accessible, and when it is accessible, the information is usually not needed.

Figure 3 shows the principal relation between freedom of action, product knowledge and modification cost‡. The figure is the author's further development of three figures: the design paradox (Ullman 2002), costs allocated early but used late in the project (Andreasen 1987) and the cost for design changes as a function of time during the planning and production process (Bergman and Klefsjö 2003).

Figures 2 and 3 illustrate the importance of the design phase as well as of getting relevant requirements in as early as possible in the development process. They also show the problem with traditional product development. Often, little care is taken in product development (and in its specification) for future services, maintenance, and end-of-life treatment. Traditionally, the initial focus is on developing the physical product; once that is done, a possible service (intangible product) is developed, but this is hindered by the limitations set up by the physical product. When developing IPSO, the development is accomplished in an integrated and parallel approach.

The rate of market and technological changes has accelerated in the past decade. This implies that companies must be pro-active in the sense that they must be able to rapidly respond to fluctuations in demand (Collaine, Lutz et al. 2002). Central to competitive success in the present highly-turbulent environment is: the company's capability to develop new products (Gonzalez and Palacios 2002); to improve, further develop and optimize old products; and to do so faster than competitors (Stalk and Hout 1990). Designers must develop and proceed faster, while at the same time covering an increased number of different demands on the product. A way to handle these challenges is to do more of the product development in a more parallel and concurrent way in order to e.g. shorten the calendar time (from start to stop) and increase the collaboration over competence disciplines. One concept in line with this is Integrated Product Development§ (IPD), whose basic idea is to increase the efficiency in product development by more parallel activities and a higher degree of co-operation between functions, levels and individuals in an enterprise (Olsson 1976; Andreasen 1980). Norell (1999) characterizes the performance of IPD as follows: parallel activities; cross-functional collaboration by multifunctional teams; structured processes; and front-loaded development. The four characteristics above are in line with what (Wheelwright and Clark 1992), (Cooper, Edgett et al. 1998), and (Wilson, Kennedy et al. 1995) regard as important features for successful product development.

<sup>‡</sup> This figure can also be found in the author's licentiate thesis: Lindahl, M. (2000), Environmental Effect Analysis - an Approach to Design for Environment, Licentiate Thesis, Royal Institute of Technology.

However, if a business model is changed from selling products to providing a function via IPSO, this also changes the conditions for development. When selling products, there is a need to constantly sell new ones in order to survive. In order to do so, the company must constantly come out with new models and/or features, and do so at an increased speed to keep competitors out. This also implies that a company may not want to offer all potential technical improvements in new products at once, but rather split them up over several versions in order to be able to sell more products over time.

Fig. 3. The relation between "Freedom of action", "Product knowledge" and "Modification cost" is shown (Lindahl and Tingström 2000).

However, if a company sells IPSO, this is changed since the focus is not on selling products but rather on selling functionality to the customer. In principle, once an IPSO is sold to a customer, the company wants him/her to use it for as long a time as it is economically interesting. If a company has technology that can e.g. cut down the energy consumption during use, it will implement the best technique at once instead of taking it in steps. Instead of spending time on developing different versions of a product, with IPSO the company in principle has more time for developing more optimized offerings - offerings that are more cost-efficient and effective, and therefore in general give a lower negative environmental impact. Nevertheless, it will still be relevant for shortening the calendar time (from start to stop).

<sup>§</sup> Other similar common terms which correspond to this concept are Concurrent Engineering (Söderved, 1991), (Prasad, 1997) and Lean Product Development (Mynott, 2001).

#### **5. Information asymmetry between a provider and a user**


In general, environmental impact of a product life cycle is determined by product characteristics themselves and processes on the product. The former includes the type and amount of materials in a product, while the latter includes how to treat the product at EOL (end of life). Thus, the environmental impact of a product can be decreased by changing either its characteristics or its processes. However, one has to own and apply appropriate information to do so. There are different types of such information about a product itself or processes along the life cycle phases such as design, manufacturing, usage, and EOL. In addition, the information may not be documented in such a way that it is easily transferrable to another actor as depicted in Figure 4.

Who owns the information on how to improve the environmental aspect of the product and processes at different stages of the life cycle? Information asymmetry exists in many cases between the OEM, who in many cases designs a product, and the user. For instance, how toxic the substances contained in a product are is not necessarily known to a user, but is to a designer. In addition, how to attain the best energy performance for the product in practice may be more hidden to a user than to a designer – the user simply does not know how to operate the given product for the best performance, or the provider has more knowledge of the best available technologies at the moment. There can be various reasons for this, such as a lack of user education in spite of the existence of the necessary information, or the strategy of a user, as a company, not to acquire the competence.

Fig. 4. General illustration of information owned by provider and user

Note that information asymmetry in the "market for lemons" addressed by (Akerlof 1970) is not the main issue of this chapter. In that case, the information possessed by a provider is about a product at a point of sale and is unchanged after the sale of the product, as it is based on a product-sales type business and the provider has no access to the product afterwards. This is shown with gray lines in Figure 5: the information of a user about the product increases along time and can surpass that of a provider. Note that variation of speed of the increase along time is not considered in this graph. In IPSE, on the other hand, a provider can obtain more information with access to the product during usage, and could maintain superiority regarding product information over the user. This is drawn as Cases 1 and 2 in Figure 5, to refer to the same and a higher speed as compared to the user, respectively. In Case 3, due to the lower speed than the user, the provider is surpassed by the user.

Information asymmetry can be a weapon for a provider to obtain payment in IPSE and makes IPSE meaningful as a business. For example, in the case where an OEM owns more information about usage or EOL of a product, there is potential for the OEM to provide IPSO so that the environmental impact is less than would be for product sales. It is also often reasonable for an OEM to be able to provide maintenance or upgrade service of its product. From the viewpoint of environmental performance, on the other hand, information asymmetry is a hindrance to improvement, since it is costly to transfer information to an actor who needs it.

Some regulations are effective so as to diminish the information asymmetry – a simple example is the symbol "not to be put in a dustbin" attached to an electronic product under the WEEE (Waste Electrical and Electronic Equipment Directive) (EU 2003). This symbol conveys effective information from a provider to a user: this product should not be disposed of in a regular dustbin from an environmental viewpoint. As is explained by Cerin (Cerin 2006), this type of information flow has potential to decrease the environmental impact. However, not everything is covered by regulations. A user may be willing to pay for information that contributes to the environmental performance of the product. This is where business opportunities for an OEM as an IPSO provider can be found.

Fig. 5. Transitions of amount of information about a product after sales

Summarizing the discussion above, three levels of information asymmetry are assumed to exist in this context. If there is no (or too little) information asymmetry, there will be no gain in environmental performance through IPSE and no IPSE activities. On the other hand, in case there is a high level of information asymmetry, i.e. enough to make IPSE meaningful, there would be economic activities as well as environmental gain. The rest is an intermediate level, where there are no IPSE activities and thus loss of environmental performance. Note that this discussion focuses on a single parameter, information asymmetry; there can be other influential parameters if IPSE is meaningful.

#### **6. Economies of scale**


Economies of scale are the result of an increased number of units produced or distributed, making it possible for the unit price to decrease (Chandler 2001; Cook, Bhamra et al. 2006). An IPSE provider has the possibility to attain economies of scale through several different aspects. To provide IPSE is, in some cases, equal to being responsible for all the life cycle costs of the offering, which provides incentives to optimize the total cost as well as to realize economic development, and potentially environmental development (Lindahl, Sundin et al. 2006; Tukker and Tischner 2006). The provider would be able to gain economies of scale for both the products and the services. Leverage in production and administration could be created by offering the same services to different customers (Morey and Pacheco 2003). Another way of decreasing costs and achieving economies of scale could be realized when answering customers' demands by constantly configuring the same technology and skills in different ways (Cook, Bhamra et al. 2006). For a certain industry the market capacity is limited, which means that a single company may not reach economies of scale since its market share is relatively fixed for a certain period of time. It is not possible to realize large-scale effects with only a few customers, since much information is needed before, during and after the delivery, which results in high transaction costs (Arnold 2000). If a number of companies outsourced their processes to one organization, this would aggregate the volume and the production efficiency would increase (Gao, Yao et al. 2009). This would also bring down the transaction costs, since they are created when transferring goods and services (Chandler 2001). If the transactions occur frequently they are better handled within one single organization, since hierarchical governance facilitates administrative control and coordinated adaptability (Toffel 2008). Furthermore, customers want to benefit from the knowledge of the supplier, and are reluctant to do business with several suppliers if they want an integrated and global offering (Mathieu 2001). However, the number of actors should be enough to make sure all the components of the offer are delivered by experts (Mont 2004).

Reduced transaction costs are not the only costs to consider. New costs for complementary products may also appear for the provider in the beginning, but will benefit from economies of scale after the transition (Toffel 2008). Even though IPSE offerings imply customized solutions, to achieve economies of scale they have to be combined with well-defined modular structures at the component level (Windahl, Andersson et al. 2004). If a company wants to profit from economies of scale, this standardization of components is the first step (Arnold 2000). This could also be useful when considering remanufacturing, since parts that are worn out quickly or require frequent upgrading should be placed in an accessible way (Sundin and Bras 2005). The remanufacturing process itself could also benefit from an economies of scale perspective. The IPSE approach would provide the manufacturer with the knowledge of how many products are entering the process, as well as when they would do so, which would provide the IPSE provider with a remanufacturing plan that is easier to manage (Sundin and Bras 2005).

When it comes to other steps in the life cycle of the offering, the IPSE provider can economically afford a high level of specialization and technological features due to economies of scale, and can thereby optimize resource consumption and waste production, leading to better eco-efficiency for the company. The provider also often gains a competitive advantage over the customer when it comes to experience and knowledge concerning the product. With this information, the provider can optimize maintenance routines and thereby minimize the cost (Toffel 2008). Furthermore, the provider can benefit from scale effects by observing how the equipment is repaired across the whole customer base and using this knowledge (Toffel 2008). Further increased knowledge and understanding will result in increased availability and reduced product failures (Alonso-Rasgado, Thompson et al. 2004). Economies of scale can also emerge when the provider is in charge of the operations at the site of the customer, where the provider's expertise in running the equipment can reduce lead times and yield scale effects (Lay, Schroeter et al. 2009).

In sum, there are economies of scale in IPSE as well. A major positive factor is carrying out similar services, so that an organization can learn from one service and apply the lessons to another. In the case of IPSE, in contrast to the case of selling physical products, exactly the same offering never exists, since a customer or user is involved in the service. This difference means that gaining economies of scale in IPSE requires more involvement and learning from the provider's staff. Another factor is market capacity, and transaction costs and complementary product costs must also be taken into account. Needs addressed by IPSE differ slightly from one offering to another. Therefore, modularization is key to gaining economies of scale, but service modularization needs more research than product modularization does (e.g. Simpson, Siddique et al. 2006).

#### **7. Risk**

There are various types of risk, understood here as possible negative consequences from the environmental viewpoint. One reason for such risk is an actor's lack of necessary information due to another actor's possession of that information, which was already discussed in the section on information asymmetry. There is another reason as well – non-existence of information.

Whether a product is better from an environmental standpoint for a given need is not necessarily certain at the time the product is first used. Different factors for this originate from the environment (not in the meaning of sustainability) and from users. The former includes the speed of progress of the technology used in the product (or product generations) (see e.g. Deng and Williams 2011). If a new product is more energy efficient than the original one, and it becomes available before the end of usage, it may be better environmentally to switch to the new product. The user factor includes discontinuity of the user's need for the chosen product (see different classical reasons for this in Hanson 1980). For instance, a change in demand may cause a user to stop using a product after a short time, and owning an additional product generates additional environmental impact.

How can these different types of uncertainty be better handled? A provider could do this. If a provider promises a user in a contract that the "best" available technology is provided within the contract period, the user can avoid the uncertainty of technological progress. For discontinuity of the user's need, a provider could give the user an option to return the product to the provider after a certain period of time. By doing so, a user can shorten the time of holding that risk. The "trick" behind this is economies of scale, which enable a provider to pool the different types of risk arising from its users: the variety of needs across a group of many customers cancels out.
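To make the pooling argument a little more concrete, the following minimal sketch (all numbers invented for illustration, not taken from the chapter or any cited source) shows the familiar statistical effect behind it: the demand a provider sees, averaged over many users with randomly varying need durations, is far less variable than the demand associated with any single user.

```python
import numpy as np

# Illustrative sketch only: hypothetical numbers, not data from the chapter.
# Each user needs the product for a random number of months (mean 24, std 12).
# A provider serving n users faces the *average* demand across them, whose
# relative variability shrinks roughly as 1/sqrt(n) - the pooling effect.
rng = np.random.default_rng(0)

def relative_variability(n_users, n_trials=10_000):
    months = rng.normal(loc=24, scale=12, size=(n_trials, n_users)).clip(min=1)
    avg_demand = months.mean(axis=1)  # per-user demand as seen by the provider
    return avg_demand.std() / avg_demand.mean()

for n in (1, 10, 100, 1000):
    print(f"{n:5d} users -> relative variability {relative_variability(n):.3f}")
```

The printed variability falls as the customer base grows, which is the sense in which a provider, unlike an individual owner, can absorb discontinuity of need.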

In sum, there are different types of uncertainty due to unavailable information. In the case of product sales, they generate risks of producing higher environmental impact than if this uncertainty and risk were managed through IPSE. Note that this is not merely an actor's lack of information; rather, the information is not available in spite of a willingness to obtain it. This is where business opportunities for IPSO exist, and existing research has not approached them from that viewpoint. For instance, uncertainty in PSS has been researched as an object to be reduced for more accurate cost estimation (Erkoyuncu, Roy et al. 2011). Note that, e.g., leasing by itself does not improve EOL management of leased products (Lifset and Lindhqvist 1999). If there is a high degree of uncertainty about technological progress or demand discontinuity, and if the risk can be pooled by an OEM, IPSO has the potential to decrease environmental impact.

#### **8. Concluding discussion**

This chapter endeavoured to lead a theoretical discussion of which IPSE factors can be expected to increase the environmental performance of a life cycle compared to a traditional product sales business. Four aspects from theory were discussed and their relevance was pointed out. In the theory of product development, information about a product is pointed out to be a crucial parameter, although the theory has to be adapted to the nature of the offering – IPSO as opposed to a physical, traditional product. Then, asymmetry of the information about a product between a provider and a user was identified as a key for IPSE to be meaningful, also through comparison with the product-sales type of business. Economies of scale were brought into the discussion; this remains an important issue for IPSE, but with different characteristics from the product-sales type of business. Finally, risk was discussed and pointed out to be a crucial parameter to be controlled after sale, and economies of scale were shown to be an enabler for controlling that risk in a better way. As shown in these four sections, these aspects are interlinked with each other (see Figure 6) and need to be further investigated. Nevertheless, the chapter has provided a first theoretical cornerstone regarding the conditions for IPSE to be a meaningful business style and the factors through which IPSE influences environmental performance.

Fig. 6. Relations between different issues (economic, manufacturing, quality, design, marketing, environmental, et cetera) at each phase of a life cycle (production, use and end-of-life treatment phases)

#### **9. Acknowledgment**

This research was partially supported by a Trafikverket (the Swedish Transport Administration)-funded project "Integrated Product Service Offerings of the Railway Infrastructure System".

#### **10. References**


Akerlof, G. (1970). "The market for lemons: quality uncertainty and the market mechanism." Quarterly Journal of Economics 84: 488-500.

Alonso-Rasgado, T. and G. Thompson (2006). "A rapid design process for Total Care Product creation." Journal of Engineering Design 17(6): 509-531.

Alonso-Rasgado, T., G. Thompson, et al. (2004). "The design of functional (total care) products." Journal of Engineering Design 15(6): 515-540.

Andreasen, M. (1987). Integrated Product Development. Berlin, Springer.

Andreasen, M. M. (1980). Machine Design Methods Based on a Systematic Approach. Lund, University of Lund. Ph.D. Thesis.

Arnold, U. (2000). "New dimensions of outsourcing: a combination of transaction cost economics and the core competencies concept." European Journal of Purchasing & Supply Management 6(1): 23-29.

Aurich, J. C., C. Fuchs, et al. (2006). "Life cycle oriented design of technical Product-Service Systems." Journal of Cleaner Production 14(17): 1480-1494.

Bergman, B. and B. Klefsjö (2003). Quality from Customer Needs to Customer Satisfaction. Lund, Studentlitteratur AB.

Cantamessa, M. (2011). Design ... but of What. The Future of Design Methodology. H. Birkhofer. London, Springer: 229-237.

Cerin, P. (2006). "Bringing economic opportunity into line with environmental influence: A discussion on the Coase theorem and the Porter and van der Linde hypothesis." Ecological Economics 56: 209-225.

Lay, G., M. Schroeter, et al. (2009). "Service-Based Business Concepts: A Typology for Business-to-Business Markets." European Management Journal 27(6): 442-455.

Lifset, R. and T. Lindhqvist (1999). "Does Leasing Improve End of Product Life Management?" Journal of Industrial Ecology 3(4): 10-13.

Lindahl, M., E. Sundin, et al. (2006). Integrated Product and Service Engineering – the IPSE project. Changes to Sustainable Consumption, Workshop of the Sustainable Consumption Research Exchange (SCORE!) Network, supported by the EU's 6th Framework Programme, Copenhagen, Denmark.

Lindahl, M. and J. Tingström (2000). A Small Textbook on Environmental Effect Analysis. Kalmar, Department of Technology, University of Kalmar.

Lindemann, U. (2011). Systems Engineering versus Design Methodology. The Future of Design Methodology. H. Birkhofer. London, Springer: 157-167.

Martyrer, E. (1960). "Der Ingenieur und das Konstruieren." Konstruktion 12: 1-4.

Mathieu, V. (2001). "Service strategies within the manufacturing sector: benefits, costs and partnership." International Journal of Service Industry Management 12(5): 451-475.

Mont, O. (2004). "Institutionalisation of sustainable consumption patterns based on shared use." Ecological Economics 50(1-2): 135-153.

Mont, O., P. Singhal, et al. (2006). "Chemical Management Services in Sweden and Europe: Lessons for the Future." Journal of Industrial Ecology 10(1/2): 279-292.

Mont, O. K. (2002). "Clarifying the concept of product–service system." Journal of Cleaner Production 10(3): 237-245.

Morey, E. and D. Pacheco (2003). "Product service systems: Exploring the potential for economic and environmental efficiency."

Mynott, C. (2001). Lean product development: the manager's guide to organising, running and controlling the complete business process of developing products. Northampton, UK, Westfield Publ.

Neely, A. (2007). The servitization of manufacturing: an analysis of global trends. 14th EurOMA Conference, Ankara.

Oliva, R. and R. Kallenberg (2003). "Managing the transition from products to services." International Journal of Service Industry Management 14(2): 160-172.

Olsson, F. (1976). Systematic Design. Lund, University of Lund. Doctoral Thesis.

Pahl, G. and W. Beitz (1996). Engineering Design: A Systematic Approach. London, Springer-Verlag.

Prasad, B. (1997). Concurrent Engineering Fundamentals - Integrated Product Development.

Sakao, T. (2009). A View of Service, Quality, and Value for Sustainability. 12th International QMOD Conference, Verona.

Sakao, T., C. Berggren, et al. (2011). Research on Services in the Manufacturing Industry based on a Holistic Viewpoint and Interdisciplinary Approach. CIRP International Conference on Industrial Product-Service Systems, Braunschweig.

Windahl, C., P. Andersson, et al. (2004). "Manufacturing firms and integrated solutions: characteristics and implications." European Journal of Innovation Management 7(3): 218-228.

### **Leveraging Neural Engineering in the Post-Factum Analysis of Complex Systems**

Jason Sherwin<sup>1</sup> and Dimitri Mavris<sup>2</sup>

*<sup>1</sup>Columbia University in the City of New York, <sup>2</sup>Georgia Institute of Technology, USA*

#### **1. Introduction**

This chapter is about the pressing problem of data deluge in the analysis of complex systems, and our proposed response to it. We begin by illustrating the problem in certain systems engineering examples, primarily focusing on aerospace-related systems but pointing out the generality of this problem in other data-intensive design problems (Section 2). Having established the need to address this problem, we then propose a solution based on current advances in the intersecting fields of neuroscience and computer engineering, increasingly being called *neural engineering* (Section 3). With a proposed solution in mind, we carry out a case study in which we utilize certain results and algorithms from neural engineering (Section 4). Though this case study gives credible results, we find that we can improve our neural-based models of complex systems data with more direct neuroscience experiments on expertise (Section 5). Finally, we draw conclusions on the current state of the art for leveraging neural engineering results and algorithms on the problem of complex systems post-factum data analysis (Section 6).

#### **2. A problem in systems engineering: Data deluge**

The need to engineer within and for both increasingly complex and sophisticated systems is continually growing. In tandem with this problem is the need to analyze ever-increasing amounts of data that describe these systems. In short, the post-factum analysis of an already-built system is a key step in the analysis and, consequently, the design processes.

For instance, within the aerospace community, this problem is not unfamiliar. In that field, the perennial aim has been to balance the various sub-disciplines (e.g., acoustics, aerodynamics, propulsion, structures) to deliver an aircraft or spacecraft that meets a set of pre-defined criteria. But each added sub-system of an aircraft contributes an added degree of complexity to the design process.

This phenomenon is not unique to aerospace systems design. More generally, with the explosion of remote sensing capabilities in recent years, there has been a deluge of data made available about many other complex and intricate systems. But the means to fully analyze this data and to extract a useful comprehension of its content can be a challenge.

Both of these problems – one being a subset of the other – share a common thread: there is a plethora of computation needed to arrive at a design solution. In aircraft design, there is a potential deluge of possible designs as new sub-systems are added to the analysis. Similarly, in the mining of data from complex systems, there is likewise a deluge of possible data interpretations; and no specific interpretation is more 'correct' than any other (via the 'No Free Lunch Theorem', Ho & Pepyne, 2002).

In the midst of this deluge, it is potentially easier to approach the data holistically and to provide a subjective analysis of its content. Not only does this approach allow the data's full scope to be considered, but it also allows comprehension to be communicated rapidly because of its approximate – and therefore, simpler – nature. For instance, many systems engineering techniques have been devised to simplify the potentially overwhelming aspects of a complex system's analysis. Some examples of these are the analytical hierarchy process (Saaty, 2000 and Saaty, 2008), quality function deployment (Chan & Wu, 2002 and Akao, 1990) and other quasi-quantitative methods of subjective evaluation. While these methods have proven to be rapid, their transparency is lacking due to the expert-driven nature of the processing schema. For instance, in quality function deployment (QFD), experts in a particular discipline create subjective mappings from requirements to characteristics for a given product. There is no physics-based model that determines the product's design. Rather, a graded scale is used to map design requirements to characteristics based on a subjective assessment done by an expert. Necessarily, in this and other techniques like it, there is a crucial role for an expert's input to such analysis.
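As a small illustration of the kind of subjective, graded-scale mapping QFD relies on (a minimal sketch with invented requirements, characteristics, weights and scores, not taken from any of the cited sources), the snippet below weights expert-assigned relationship scores by requirement importance to rank product characteristics.

```python
import numpy as np

# Hypothetical QFD fragment: requirement weights and expert-assigned
# relationship scores (1 = weak, 3 = medium, 9 = strong) are subjective inputs.
requirements = ["low operating cost", "high availability", "easy maintenance"]
importance = np.array([5, 4, 3])  # expert-assigned importance weights

characteristics = ["fuel burn", "MTBF", "part commonality"]
relationship = np.array([  # rows: requirements, columns: characteristics
    [9, 1, 3],
    [1, 9, 3],
    [1, 3, 9],
])

# Technical priority of each characteristic = importance-weighted column sum.
priority = importance @ relationship
for name, score in sorted(zip(characteristics, priority), key=lambda p: -p[1]):
    print(f"{name:18s} {score}")
```

The point of the sketch is that every number in the relationship matrix is an expert judgment, which is exactly the transparency issue raised above.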

There has also been an opposite response to the data deluge in system analysis: utilize the increasingly available computation power to process the excessive amounts of data. In other words, rather than resign to the need for subjective analysis (e.g., in QFD) due to the problem's complexity, the availability of greater amounts of computing power in recent years has made it somewhat possible to navigate the deluge. For example, this has been the mentality behind the approach of multi-disciplinary optimization (Vanderplaats, 2007), which is used with great success in aircraft design. In multi-disciplinary optimization (MDO), numerical optimization techniques are applied to sets of objective functions whose dependent variables must satisfy various constraints (i.e., inequality, equality and side constraints). The ultimate aim, though, is not to yield a design that is optimal in any one system, but rather one that is optimal with regard to all systems. Necessarily, such a process is computationally quite costly, and as the number of variables grows it becomes infeasible.
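The following toy sketch (not the chapter's or Vanderplaats' formulation; the two "sub-discipline" models are invented) shows the shape of such a problem: a weighted-sum objective built from two disciplinary models, minimized under an inequality constraint and side constraints.

```python
import numpy as np
from scipy.optimize import minimize

# Toy MDO-style sketch: hypothetical sub-discipline models traded off via a
# weighted-sum objective, subject to an inequality constraint and bounds.
def mass(x):
    return 10.0 * x[0] ** 2 + x[1]          # invented structural model

def drag(x):
    return 5.0 / (x[0] + 0.1) + 2.0 * x[1]  # invented aerodynamic model

def objective(x, w=0.5):
    return w * mass(x) + (1 - w) * drag(x)

constraints = [{"type": "ineq", "fun": lambda x: 4.0 - x[0] - x[1]}]  # x0 + x1 <= 4
bounds = [(0.1, 3.0), (0.1, 3.0)]                                     # side constraints

result = minimize(objective, x0=np.array([1.0, 1.0]),
                  method="SLSQP", bounds=bounds, constraints=constraints)
print(result.x, result.fun)
```

With two variables this is trivial; the cost argument in the paragraph above is about what happens when the number of variables, disciplines and constraints grows by orders of magnitude.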

Similar examples exist for the analysis of data obtained remotely. For instance, the American military increasingly relies on remote sensing for many of its activities (e.g., the MQ-1 Predator drones, National Commission, 2004). But the exponentially-increasing amounts of data leave the analysts "swimming in sensors and drowning in data" (Drew 2010). In other words, the analytic tools to comprehend the data are well behind the means to gather it.

Such an analysis problem is an inherent precursor to engineering a complex system. For instance, it exists in the case where the system has been human-constructed from many parts (e.g., an aircraft). And it also exists when the system is not human-constructed, i.e., in nature. It is this latter situation that is of most interest to us now because, in reality, it is a superset of the first: whether man-made or not, it is a difficult engineering analysis problem to figure out how complex systems work. In particular, although the human-constructed parts may behave in predictable ways in many situations, there are always new interactions arising between component sub-systems that reveal a previously unknown system-level behavior. Therefore, while reductionist approaches to system construction can be successful in most cases, we have still not obtained the hoped-for deterministic prediction of behavior.

Instead, it seems that a degree of uncertainty is inherent in the analysis and consequently the engineering of all systems. This does not mean that we throw the previously successful reductionist approaches to the wind though. But for the analysis and engineering of those systems for which such approaches are impossible (e.g., not accurately quantifiable, not modelled within computational constraints), the only mechanism for design choices thus far has been the aforementioned systems engineering techniques (QFD, AHP, MDO, etc.). A new approach is needed.

#### **3. A new path for handling data deluge in analysis: Neural engineering**

As a result of this problem, we suggest here to consider the breadth of results and techniques emerging from neural engineering to bolster systems analysis for engineering purposes. In particular, instead of relying on an inconsistent mapping made by human experts for design analysis (e.g., as in QFD), why not understand some cognitive elements of expertise and, in turn, apply that comprehension to both systems analysis and manipulation? Of course, these are both monumental tasks to perform, considering not only the breadth of cognitive abilities that comprise expertise but also the difficulty of determining how to implement them in real engineering contexts.

Despite the seemingly daunting nature of these endeavors, certain elements of expert decision-making, human situational awareness and cortical biology can inform some of the details as to how we can understand and, in turn fine tune, the ways by which we as engineers collect observations and integrate them into a cohesive analysis; such an analysis is then the foundation of ensuing engineering choices. Nowhere is this need as great as it is in the analysis of complex and large-scale systems. Therefore, a true test as to the utility of neural engineering for systems purposes would be to implement these ideas within a complex or large-scale analysis and engineering task. In this chapter, we will demonstrate a simulated use of such an application.

As a demonstration, we discuss the application of neural engineering to the analysis of Iraq's stability during 2003-2008. This application was never used in a real context; however, we frame the problem within the context of its utility to a decision-maker whose actions influence the outcome of such a system. In other words, he/she must analyze and then manipulate this system. Our assumption is that the decision-maker only has access to a stream of data that measures certain conditions related to Iraq's stability. More importantly, we assume that there is no possibility of developing an analytic model to describe the time-evolution of Iraq during these years. Rather, we cast aside that futile aim and attempt to glean useful patterns directly from the data. As this is a paraphrase of the full demonstration (the full-blown analysis comprises a Ph.D. thesis and several papers), we emphasize the importance of learning algorithms for doing so. Specifically, we consider algorithms based on some anatomical and behavioral features of the human cortex. Built into the rationale behind using these algorithms is the assumption that many of the cognitive faculties comprising the development and use of expertise reside in the cortex.

Building on the hope (and shortcomings) provided by the Iraq example, we then review some of the latest developments in neural engineering that have possible applications in the analysis of other large-scale systems. As we will see, it is important to maintain an awareness of how the biological hardware (e.g., a neuronal network) "computes" its analysis of complex, time-evolving, often self-conflicting and/or occluded data. We will offer the results of an experiment from audio cognition in which the expertise of subjects is tracked directly from neural data. Finally, we will consider how these insights would translate back to the actual solid-state computations done in modern computers to drive systems engineering analysis.

#### **4. Case study of neural engineering-assisted analysis: Iraq war, 2003-2008**

The focus of this case study is to create a computational situational awareness (SA) usable by the Department of Defense to gauge Iraq's stability during 2003-2008. Situational awareness is a term from psychology used to describe elemental steps of an expert's mental processes (Endsley, 1995). In other words, situational awareness is the description of the cognitive processes involved in an expert's analysis of a situation. So if we can design an appropriate computational SA for the Iraq context then it is equivalent to developing a means to analyze that context for driving it to a desired state – in other words, to engineer it.

As a theoretical construct, SA was developed to analyze the decision-making processes of aircraft pilots, yet its general usage has extended into many other areas in which expertise is employed by a human controller. In general, an SA can be particular to a given scenario or context. For example, pilots have SA for flying airplanes, pianists have SA for playing music, etc. In our case study, the SA of interest applies to the stability of Iraq during the war from 2003-2008, which henceforth will be called the 'Iraq context'.

To develop this SA computationally, we implement a method that is summarized by the information flow of Fig. 1.

Fig. 1. Method for building/maintaining computational SA with HTM

This method maps the neurally-inspired machine learning algorithms to be used here (Hierarchical Temporal Memory, or HTM, see Section 3.2.1 for details, George & Hawkins, 2009 and George, 2008) to the three levels of SA first hypothesized by Endsley (Endsley, 1995). These three levels are Level 1 (perception of relevant elements), Level 2 (comprehension of those elements) and Level 3 (prediction). Here, we focus on Levels 1 and 2, since they are a necessary antecedent to Level 3. In particular, we present here a way to implement Levels 1 and 2 for this problem via data preprocessing and HTM training/testing.

#### **4.1 Why use such a high-level representation of mental processes?**

While this approach enhances awareness of trends in the data of the Iraq context, it also mimics the basic tenets of what constitutes SA in actual decision-makers. In particular, Endsley and others have shown that the selection of a goal is crucial to SA formation. In other words, the collection of data, its analysis and the engineering goal are all inextricable in forming SA. Here, we assume the criteria for success established by the U.S. Department of Defense: to bring Iraq to political, economic and social stability between 2003-2008 (United States House of Representatives, 2005). Consequently, we rely on data related to these aspects of the Iraq context, so that not only do we enhance a decision-maker's own SA of the Iraq context but we also create one – albeit, a rudimentary one – with a computer. By starting from such a high-level representation of expert mental processes, we can then specialize the computational tools used to find problem-relevant patterns in the data.

#### **4.2 Deploying these processes computationally**

Once the collection of data is focused onto problem-relevant elements, the analysis of that data becomes a learning problem. By conceiving of the expert's assessment of data as a learning problem (due to Endsley), we are in a position to mimic some of these processes computationally.

However, there is a question to consider before doing so: What is the right answer? In particular, no matter what machine learning approach is used, it is difficult to validate the SA learned about this context, since we do not know the right answer a priori. In other words, we do not know if our assessment of Iraq's stability is 'correct'. Although this is possible in other learning problems, such as invariant visual pattern recognition (i.e., the pattern is either object A or object B), we cannot do this here.

So to verify the accuracy of the computational SA formed in this context, another method will be introduced that has influence from system dynamics: we call it extreme-case bounding. This method has assumptions built into it that create fictitious extreme cases of stability, either extremely unstable or extremely stable. With these fictitious bounds used for HTM network training/testing (e.g., extreme dystopia or utopia based on the data), some insight into the actual progression of events in Iraq during 2003-2008 can be obtained. Needless to say, this method is not perfect, and it is somewhat arbitrary because we select a peg against which to measure Iraq's stability. Nevertheless, it provides an intriguing foothold in an avenue of computational SA that has thus far been difficult to probe concretely.

#### **4.2.1 The final computational piece: A neurally-inspired machine learning algorithm**

Thus far, Hierarchical Temporal Memory (HTM) algorithms have been mentioned as a way to execute the various stages of SA accounted for by Endsley and have been adapted into Fig. 1. However, we have not yet discussed why these algorithms in particular are of interest. In what follows, we will argue that the hierarchical storage of spatio-temporal data, and the way by which temporally adjacent data points are related to each other, lend themselves well to the steps of SA laid out in Fig. 1. To make these points clear, Fig. 1 includes the learning and inference steps involved in HTM training and testing as well.

An HTM attempts to mimic two crucial aspects of cortical function and anatomy. These are particularly of use for determining better ways to handle highly-varied data, so their potential utility in forming SA is apparent. First, these algorithms rely on the temporal adjacency of observed events when storing spatial patterns. The anatomical inspiration for this procedure comes from observations of cortical function. In particular, there are numerous cell groups and types in the cortex that have been identified as 'sequence detectors' (e.g., PPA, Broca's Area, FFA). Secondly, and in relation to the first aspect, the algorithms store these sequences in a hierarchical arrangement across both space and time. The result of this division of space-time is that local regions' spatio-temporal patterns are encoded first, from which more global regions' patterns are then encoded, and so on. The anatomical inspiration for this compartmentalization of information comes directly from the different hierarchical functional areas observed in the human cortex (e.g., visual cortex, audio cortex, etc.).

Since an HTM is implemented on a computer, it is perhaps useful to consider a mathematical description of HTM. For starters, a trained HTM is in fact a Bayesian inference network. In particular, it is a network of nodes, each of which solves the same problem: learning spatio-temporal sequences. On a network level, the goal behind the algorithms is to learn a schema ($S$) that describes data related to a given problem. That problem exists locally for each node in a vector space ($v_i$) and, upon grouping all nodes, exists on a global level that concerns all data related to the problem ($V$). Considering the local version of the problem, each vector ($x_k$) in $v_i$ is an observation of an aspect of a given complex phenomenon at the $k$-th time step. The HTM node's goal then is to create $q$ Markov chains to which any one of the vectors in $v_i$ can be assigned. For compression purposes, it is highly desirable that $q \ll k$. By collecting sets of observations in this way, each node's Markov chain ($m_q$) corresponds to some spatio-temporal high-level feature of the local complex phenomenon. Consequently, the set of all $M$ Markov chains constitutes a schema ($S$) of the global phenomenon. In other words, $S$ is a reduction of the phenomenon witnessed in each node's vector space, $v_i$. In particular, by using HTM, the aim is to use learning mechanisms akin to certain aspects of neural coding to develop a schema, i.e., a situational awareness of the data ($V$).
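As a rough illustration of what a single node does (a simplified sketch under stated assumptions, not the NuPIC implementation and not the authors' code), spatial observations can be quantized into coincidence patterns, e.g. with K-means, and temporally adjacent coincidences then linked through first-order transition counts, from which the node's Markov chains (temporal groups) would be formed.

```python
import numpy as np
from sklearn.cluster import KMeans

# Sketch of one HTM-like node (simplified; input stream is synthetic).
# observations: x_k drawn from the node's local vector space v_i over time.
rng = np.random.default_rng(1)
observations = rng.normal(size=(500, 4))

# 1. Spatial pooling: quantize observations into q coincidence patterns.
q = 8
coincidences = KMeans(n_clusters=q, n_init=10, random_state=0).fit_predict(observations)

# 2. Temporal pooling: count first-order transitions between temporally
#    adjacent coincidence patterns.
transitions = np.zeros((q, q))
for a, b in zip(coincidences[:-1], coincidences[1:]):
    transitions[a, b] += 1
transition_probs = transitions / transitions.sum(axis=1, keepdims=True)

# Coincidences that frequently follow one another would then be grouped into
# Markov chains (temporal groups); the set of groups is the node's share of
# the schema S passed up the hierarchy.
print(np.round(transition_probs, 2))
```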

#### **4.3 Implementation for the Iraq context**

For the Iraq context, the phenomenon is the Iraq War during 2003-2008. The goal of the HTM then is to create a schema ($S_{Iraq}$) that is a suitable description of the Iraq context. This schema is based on what kind of data is used to describe the Iraq War ($V_{Iraq}$). Recalling that the analysis and collection of data are inextricably linked in forming SA, and due to the DoD goal of achieving political, economic and security stability, we have chosen metrics of these aspects of Iraq's stability to track during 2003-2008. In the following sub-sections, we show what specific data is used (part of forming Level 1 SA) and how that data is 'comprehended' using a trained HTM for inference (Level 2 SA).

#### **4.3.1 Level 1 SA: Data preparation**

Entering the information flow of Fig. 1, the first task (represented by an oval) is to prepare the data. Before doing so, we must address the Boolean question (represented by a triangle) about whether data is available. For the Iraq context, we actually have data. But some effort is needed to prepare this data into standard form.

There are four issues we must confront in doing so. First, the primary source (United States Department of Defense, 2005, 2006, 2007, 2008, and O'Hanlon & Campbell, 2008) from which the data is extracted contains many blanks in the data, depending on how many metrics are used. So a set of metrics must be selected from the actual data that exhibits a minimal number of blanks.

Second, the primary source has not prepared the data in a temporally structured format suitable for HTM learning. Dileep George pioneered work on HTM algorithms and he gives guidelines for generalizing their usage in other domains. In particular, George writes, "if there is no temporal structure in the data, application of an HTM to that data need not give any generalization advantage." So the data must be arranged in this fashion if HTM is to be of use. Specifically, observations at specific time intervals should follow one another.

Third, the relative magnitudes of the chosen metrics will be necessary to consider. Consequently, a transformation of the data may be necessary before training/testing.

Fourth and finally, one of the metrics we use to describe the Iraq context is only known within given bounds at each time step. Consequently, we must select a technique to get only one value at each time step, rather than a range.
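A minimal preprocessing sketch addressing the four issues above is given below. It is illustrative only: the file name, column names and thresholds are hypothetical, not the actual Brookings/DoD metric names used in the study.

```python
import pandas as pd

# Illustrative preprocessing sketch (hypothetical file and column names).
raw = pd.read_csv("iraq_metrics_monthly.csv", index_col="month")

# Issue 1: keep only metrics with few missing values.
metrics = raw.loc[:, raw.isna().mean() < 0.1]

# Issue 2: preserve temporal structure - one row per month, in order,
# with remaining gaps interpolated.
metrics = metrics.sort_index().interpolate(limit_direction="both")

# Issue 3: put metrics of very different magnitudes on a comparable scale.
metrics = (metrics - metrics.min()) / (metrics.max() - metrics.min())

# Issue 4: a metric reported only as a range is collapsed to its midpoint
# (assumes hypothetical columns 'unemployment_low' / 'unemployment_high').
if {"unemployment_low", "unemployment_high"} <= set(raw.columns):
    metrics["unemployment"] = (raw["unemployment_low"] + raw["unemployment_high"]) / 2
```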

Considering all of these points, it is possible to pick a subset of metrics from the primary source that we can use to describe the Iraq context in a data-driven fashion related to our goal of tracking stability. These selected metrics are shown in Table 1 with identifying numbers next to each of them.


Table 1. Sixteen metrics to describe Iraq context

While it is possible to select fewer metrics, a drop-off in performance was seen when this was done. We have shown this in other works (Sherwin & Mavris, 2011 and Sherwin, 2010). We believe that this occurs because the degree of stability in an operational theater already lacks a clear definition amongst stakeholders. Consequently, the more metrics that are incorporated into the analysis, the more complete the description of stability will be. Inversely, the fewer metrics that are incorporated, the narrower the description will be. Here, we stopped at sixteen because this number approaches the upper limit of what was publicly available, although more metrics might make the analysis that much richer.

Finally, to give the HTM a baseline for stability and instability, artificial data was generated from a rudimentary system dynamics model based on the selected metrics. In the stable case, for instance, the number of troop deaths due to car bombs fell off to zero over time (roughly the same 60 months for which actual data exists). Alternatively, in the unstable case, the nationwide electricity supply would flatten out to zero. In general, metrics associated with stable or unstable situations would be driven monotonically to an extreme maximum or minimum over the course of 60 months, starting from a real data point. In other words, we use extreme cases to bound the reality observed in actuality – and this reality is more of a complex mix of certain features of instability and/or stability along different avenues (such as politically, socially, or economically).
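A sketch of this extreme-case bounding idea is shown below. The two metric names echo the examples in the text, but the starting values and extremes are invented for illustration; the actual study used the sixteen selected metrics and real starting points.

```python
import numpy as np

# Sketch of extreme-case bounding: from a realistic starting value, drive each
# metric monotonically to a "stable" or "unstable" extreme over 60 months.
# Starting values and extremes below are hypothetical.
MONTHS = 60

def monotonic_ramp(start, extreme, n=MONTHS):
    """Linearly (hence monotonically) interpolate from start to extreme."""
    return np.linspace(start, extreme, n)

# metric name: (starting value, stable-case extreme, unstable-case extreme)
metrics = {
    "car_bomb_deaths_per_month": (80.0, 0.0, 400.0),
    "electricity_mw_nationwide": (4000.0, 9000.0, 0.0),
}

stable_case = {m: monotonic_ramp(v[0], v[1]) for m, v in metrics.items()}
unstable_case = {m: monotonic_ramp(v[0], v[2]) for m, v in metrics.items()}
```

These two synthetic series play the role of the utopian and dystopian bounds used for HTM training, while the real data falls somewhere in between.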

#### **4.3.2 Level 2 SA: HTM-aided comprehension of the data**

With the ability to generate data for both progressively stable and unstable situations, as well as the actual time series of data on the Iraq context, it is possible to attempt HTM as an unsupervised machine learning mechanism. Recall that an HTM network is trained to find a schema, $S$, that describes the Iraq context. This is based on each vector observed in each node's local vector space, $v_i$, all of which considered together constitute $V$. To aid the spatio-temporal grouping, these vectors are first grouped with K-means clustering before temporal adjacency is learned and grouped into the network's Markov chains, $M$. These Markov chains are then used to perform evidence-based Bayesian inference on novel data.

With HTM, the aim now is to fuse the data and to extract possibly implicit meaning from it pertinent to the Iraq context. We emphasize the unsupervised nature of the learning here because our goal is to extract implicit meaning and not to impose our possibly biased judgments. Furthermore, we attempt to extract this meaning from a system that is not ergodic (i.e., there is no end-state), not separable into components (and hence not model-able) and not completely observable (i.e., uncertain data describes the system).

It has been found more effective to first train the HTM on the extreme-case data and then to test its inference capabilities on the actual data (Sherwin & Mavris, 2011). Therefore, we implement this approach so that the HTM can learn from the extreme-case boundaries and then use them to classify the reality in between.<sup>1</sup>

The evaluation of this computational SA is not entirely straightforward, so additional techniques were employed to probe the SA formed about the Iraq context. These studies will not be reviewed in great depth here, but a summary of them is important for our purposes.

<sup>1</sup> The former method has been tried and has proven unsuccessful (Sherwin, 2010).

For starters, we do not know if too many or too few metrics are being used here to describe the Iraq context. Therefore, studies were done to see what effects there are from reducing the number of metrics used to train and infer with the network. It was found that fewer metrics reduce the semantic richness of the data, thereby causing certain volatile metrics to dominate the learning and subsequent inference.

Also, we examined the degree to which information is hierarchically stored in intermediate levels of the networks. We found that, true to the promise of HTM, this was the case. Finally, we considered alternative ways of feeding the data into the networks to see what effects – if any – there are on the simulated SA.

Why would we use these techniques in particular though? For instance, what purpose could there be in probing the hierarchical storage of information in an HTM? It is necessary to recall that our purpose in using HTM for computational SA has been its declared ability to condense information into hierarchies of both space and time. For the Iraq context, we test hierarchical storage directly because it is not clear what the top-level node's output should be, as it might be for simpler recognition tasks (e.g., invariant visual pattern recognition, George & Hawkins, 2009, and George, 2008). One possible outcome of this analysis is that it might in turn help us to identify what aspects of the Iraq context are not well observed. This would then provide the beginnings of a feedback mechanism with Level 1 SA to search for more data.

In fact, in the course of this research, it was one possible feedback mechanism between Levels 1 & 2 SA that informed us to improve our extreme-case model. This resulted in the monotonic extreme-case models used below. Necessarily, this is not the only possible feedback mechanism, but as we will see, it helps to strengthen the credibility of the Level 2 SA formed here computationally. If we use data for training that becomes monotonically extreme from a realistic starting condition then we would expect an HTM network to learn to recognize clear progressions towards stability/instability.

We employ an evolutionary approach to network design here and modify a network used in an HTM demonstration example (see Numenta, 2008).<sup>2</sup> In order to exploit the HTM network's temporal learning algorithms, we modify the network parameters to accommodate how the metrics' values change in time. The complete network parameters employed for this computational Level 2 SA can be seen in another work (see appendix C.9 in Sherwin, 2010).<sup>3</sup>

From training the network, we can survey the resulting schema created by the network. As for all HTM nodes, this is described in terms of coincidence patterns (i.e., distinct spatial patterns) that form the schema's Markov-chains. Here, we find that there are sixty-one coincidence patterns (*C3,1*) and fifty-nine Markov-chains (*G3,1*) in the top-level node.4 The coincidence patterns are the result of the K-means clustering in this top-level node, while the Markov-chains are the result of first-order transition probabilities between these coincidence patterns. This is a standard calculation for each of the nodes in an HTM network (see George & Hawkins, 2009).

<sup>2</sup> All analysis has been done with Vitamin D Toolkit 1.3.0 as a graphical user interface to NuPIC 1.6.1, which is run on Python 2.5.2. It should be noted that slightly different results are obtained if networks are created, trained and tested directly in NuPIC. See appendix D in Sherwin, 2010 for more information on this topic.

<sup>3</sup> Even though this work says that these parameters are for a progressively trained network, they are the same ones used for the monotonically trained one.
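The two-step calculation just described (K-means clustering into coincidence patterns, then first-order transition statistics from which the Markov-chains are formed) can be sketched outside of NuPIC. The following is a simplified illustration using scikit-learn, not Numenta's implementation; the input shape and the sixty-one-cluster setting are assumptions taken from the text.

```python
import numpy as np
from sklearn.cluster import KMeans

def coincidences_and_transitions(inputs, n_coincidences=61, seed=0):
    """Cluster a node's input sequence into coincidence patterns (K-means)
    and estimate first-order transition probabilities between them.

    inputs : (T, d) array of the node's bottom-up input over time.
    Returns (cluster centres, row-normalised transition matrix).
    """
    km = KMeans(n_clusters=n_coincidences, n_init=10, random_state=seed).fit(inputs)
    labels = km.labels_                       # coincidence index at each time step
    counts = np.zeros((n_coincidences, n_coincidences))
    for a, b in zip(labels[:-1], labels[1:]):
        counts[a, b] += 1.0                   # first-order transition counts
    row_sums = counts.sum(axis=1, keepdims=True)
    transitions = np.divide(counts, row_sums, out=np.zeros_like(counts),
                            where=row_sums > 0)
    return km.cluster_centers_, transitions
```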

After proper training, an HTM network is most valuable as an inference tool, so now we evaluate its performance. We will start simply by looking at inference on the training data, moving on to novel data later.

When we perform inference on the monotonic training data, we see a clear progression of Markov-chains as instability increases, but stability is still not clear. We can see this by following the probability distribution over the top-level node's Markov-chains, *P*(*g* | *–et*), given the bottom-up evidence (*–et*) at each *t*. In particular, we examine the maximum of this distribution to see which stability state is most likely. What we find is that the progression indicated by the most likely Markov-chain is *g0*, *g1*, *g2*, ..., *g58*. Since we know the data is monotonic towards instability, we can reasonably claim that the Markov-chain labels are monotonic towards instability as well. For example, the bottom-up evidence when *g45* is most likely in the top level indicates a situation that is less stable than when *g5* is most likely.

Having trained the network to recognize progressions in instability, it would be useful now to test this ability on novel data. In particular, we feed into the network real data of the Iraq context. When we look for instability gradations in the actual data, we see some interesting results (Fig. 2). In Fig. 2, as in similar figures to follow, each row is a time point. The first, second, third, etc. columns indicate the groups (i.e., Markov-chains) that are most likely, second most likely, third most likely, etc., given the bottom-up evidence. The significance of this ordering of group numbers at each time point is that we can quantitatively say how unstable Iraq is at each of them. Note throughout that time points *t* ∈ [0, 60] correspond to each month from May 2003 to April 2008. In particular, at *t = 11*, *12*, the entire probability distribution over top-level Markov-chains shifts towards higher-numbered Markov-chains. At *t = 11*, *g25*, *g24*, *g23* are in the top three (see Fig. 2).
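The ordering of groups shown in Fig. 2 amounts to sorting, at each time point, the top-level belief over Markov-chains. A minimal sketch of that post-processing step follows; the belief matrix, group count and numbers below are hypothetical.

```python
import numpy as np

def rank_markov_chains(belief_over_groups, top_k=3):
    """For each time point, order the top-level Markov-chains from most to
    least likely given the bottom-up evidence, as in the Fig. 2-style tables.

    belief_over_groups : (T, n_groups) array; row t holds P(g | -e_t).
    Returns a (T, top_k) array of group indices.
    """
    order = np.argsort(-belief_over_groups, axis=1)   # descending probability
    return order[:, :top_k]

# Hypothetical example: three time steps, five groups.
beliefs = np.array([[0.05, 0.60, 0.20, 0.10, 0.05],
                    [0.10, 0.15, 0.50, 0.20, 0.05],
                    [0.40, 0.30, 0.15, 0.10, 0.05]])
print(rank_markov_chains(beliefs))   # rows such as [1, 2, 3]
```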

At *t = 18*, the probability distribution shifts as well, indicating *g12, g13, g14* in the top three. In light of our results from inference on the monotonic-extreme-case instability data, it would seem that the Iraq context is increasingly unstable during these months. Furthermore, the actual data during these months indicates this in comparison to those months that come before them.

Let us expand our purview to those time points leading up to and coming out of *t* ∈ [41, 49], another region of heightened instability according to the network. If we consider the top seven Markov-chains of the top level for *t* ∈ [36, 60] then we see something quite interesting. For *t* ∈ [36, 41], the distribution shifts increasingly towards *g12*, *g13*, *g14*, *g15*, *g16*, *g17*, *g18*. Also, we can see the demotion of *g0* over these time steps (Fig. 3), indicating increasing instability.

<sup>4</sup> Here and throughout the remainder of the paper, we follow George's notation for HTM theory (George & Hawkins, 2009 and George, 2008).

Fig. 2. Instability recognition of real data


Fig. 3. Probability Distribution Shifts Towards Higher-Number Markov-chains

Then for *t* ∈ [41, 49], these seven Markov-chains are the most likely, given the bottom-up evidence (Fig. 4).


Fig. 4. Complete Shift Towards Markov-chains Indicating Instability

As we know from the actual data, there were peaks in violence and other attacks, sagging economic metrics, etc. during this time. But there were also metric trends that favored stability. Consequently, this conflicting data makes it difficult to characterize the stability level during this time period. But here we see the probability distribution shift for the entire time period towards these mid-grade unstable states.

The situation in the Iraq context changes though with time. For *t* ∈ [50, 51], the probability distribution begins to shift back. Finally, for *t* ∈ [52, 60], *g0*, *g1*, *g2*, *g3* are the top four most likely Markov-chains (see Fig. 5).

Even though this does not indicate stability, it does indicate a dramatic drop in instability, according to how we trained the network. So we see here how the monotonic training data has provided a peg against which to categorize evidence that trends towards instability. But what about stability recognition?

As mentioned earlier, direct stability recognition is less clear, even with the monotonic training data. Rather, we can only infer stability recognition with the network. Why is this so? If we consider the types of metrics used here then we notice that only four of them increase with stability. So, as a more stable situation is reached, the remaining twelve metrics drop close to zero. Consequently, the bottom-up evidence does not provide enough magnitude to propagate through the network. All the change comes from the four metrics that increase with stability. In the current permutation of the data, one of them is in the receptive field of the third node in level one (*N1,3*) and the others are in the field of *N1,4*. The entire left receptive field (covered by *N1,1* and *N1,2*) therefore produces blank recognition. This is because there is simply not enough bottom-up evidence coming up through this side of the network. So we are not able to determine gradations of stability because the utility function of these metrics can be assumed to be inversely proportional to stability. Consequently, as stability is reached, the magnitude of *–et* goes to zero. In future implementations, it might be possible to alleviate this problem by transforming the data by an inverse or offsetting the zero. We have not done this though because we have devised an approach to recognize degrees of instability in real data, as judged against the extreme-case baseline. Furthermore, these results imply stability recognition due to the monotonic utility function of stability/instability.


Fig. 5. Complete Shift Back Towards Markov-chains Indicating Less Instability
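The transformation suggested above (taking an inverse of the instability-tracking metrics, or offsetting their zero) could be prototyped as a simple pre-processing step. The sketch below is only a hedged illustration of those two options, not something implemented in the study; the metric values are invented.

```python
import numpy as np

def offset_zero(metrics, floor=0.1):
    """Offset instability-tracking metrics away from zero so that stable
    periods still produce non-vanishing bottom-up evidence."""
    return np.asarray(metrics, dtype=float) + floor

def inverse_transform(metrics, eps=1e-3):
    """Invert instability-tracking metrics so their magnitude grows as
    stability is approached (values assumed normalised to [0, 1])."""
    m = np.asarray(metrics, dtype=float)
    return 1.0 / (m + eps)

# Hypothetical column of a single metric approaching stability over time.
violence_metric = np.array([0.9, 0.5, 0.2, 0.05, 0.0])
print(offset_zero(violence_metric))
print(inverse_transform(violence_metric))
```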

#### **4.4 Consequences of the Iraq context implementation**

We should be very clear at the outset: the schema formed with the HTM-based computational SA created here is not how a human brain of an expert decision-maker would function. Rather, it is an alternative analysis of the data at hand that can be used by a decision-maker in such a scenario. The key element to its utility though is the fact that the computational SA functions in analogous ways to some neuro-anatomical processes, such as coincidence detection and spatio-temporal pattern condensation into a flexible schema. In particular, the HTM-based SA learns from its experiences to infer about uncertain and novel situations. To be implemented on a computer, it does so with the aforementioned hierarchical and temporal breakdown of its 'experiences' (in the form of spatio-temporal vectors).

Furthermore, this is not a perfect implementation of neural engineering to analyze complex systems. But as the preceding sections demonstrate, it provides a quantifiable analysis of a system that is otherwise left in the hands of subjective analyses whose justifications are missing or obfuscating. Perhaps with better insight into human experts' mental processes the results would have stronger impact in systems engineering analysis.

#### **5. Recent neuroscientific results on expertise**

It should be clear from the preceding implementation of computational SA that the following is true:

1. Situational awareness (SA) is a nebulous term used to define an equally nebulous ability of humans
2. The machine learning-based approximation of this process with HTM is imperfect
3. More details on neural markers of expertise would inform any future computerizations of human mental processes
Considering these points in succession, it is clear that a refined perspective on human-borne expertise can add tremendous value to our first attempt of forming SA computationally. In particular, we aim to highlight some recent advances in the neuroscience of expertise that can ultimately be of use in how we analyze and design complex systems in engineering.

#### **5.1 What happens when you poke an expert's brain?**

The most insightful way to examine an expert's brain is to subject him/her to stimuli that violate their expertise-borne predictions. This experimental paradigm has seen tremendous success in tracking the neural markers of unexpected stimuli (most notably in analysis of the P300, a positivity that emerges around 300ms after a repeatedly unexpected stimulus is observed). By tracking neural signatures like the P300 and others, it is possible to see how a human brain – experienced in a certain stimulus domain – responds to errant stimuli.

Although research in many modalities has been done (e.g., functional magnetic resonance imaging (fMRI) and magnetoencephalography (MEG)), we focus here on electroencephalography (EEG) measurements of neural data. We do this for two reasons: 1) single-trial classification of neural data from EEG is generally more robust than it is from fMRI (and not very developed for MEG), 2) EEG neural data has a much higher temporal resolution than fMRI (and slightly higher than MEG), making it an ideal candidate for more immediate integration into systems engineering problems.

#### **5.1.2 A simple experiment in error-detection**

To illustrate the kinds of neural processes observable with EEG systems, we will summarize some experimental work on expectation violation. While this experiment may seem removed in the specific sense from analyzing expertise for the purposes of systems engineering analysis, the abstract concept at the heart of this experiment could not be more on target. In particular, subjects are asked to listen to an audio stimulus with which they are quite familiar. In this case, American popular songs, such as "Eye of the Tiger" or "Sweet Home Alabama," are used because the subjects all have strong prior expectations for the course of these songs once they start listening to them. In other words, they are experts at how these songs unfold. In this experiment, we analyze the subject's expertise about these audio stimuli.

The aim of this experiment is to alter the song in such a way that the following occurrences are true: 1) a subject with normal hearing should be able to discern the alteration, 2) the subject should be able to reorient his/her expectations after the alteration has taken place. The latter condition is a requirement if multiple alterations are to be performed within one hearing of the song. To balance these two requirements, it was chosen that the song's key should be altered either up or down by a semi-tone at various points in the recording.

After analyzing the neural data with learning algorithms based on logistic regression classification (Parra et al., 2002 and Parra et al., 2005), it is found that we can distinguish from neural data alone when the subject perceived an alteration and when he/she did not, regardless of where they were in the song. In other words, we are able to distinguish times in the experiment when the subject's expertise (and associated predictions) were at a conflict with the data (i.e., the audio stimuli) in front of him/her. An example of this phenomenon can be seen in Fig. 6.

Fig. 6. Individual subject and subject-average discriminations of expertise violation

What this plot shows is the classification performance in blue (measured by Az, or the area under the receiver-operator characteristic curve, see Green & Swets, 1966) vs. the time when this classification is performed relative to when the alteration happened (in the case of the alteration) or did not happen (in the case of the corresponding condition in a control listening). The alteration is shown in the dotted-black line. The green and red solid lines in the left figure indicate the 99% and 95% confidence lines. In the right figure, the subject-averaged significance lines are differentiated from the individual lines (because they are computed differently) by being dashed. The red-dashed line is the 95% and the black-dashed line is the 99% significance line. Finally, the dashed-green line is the 5%-standard deviation from the averaging. As the average plot shows, similar plots exist for other subjects in the experiment than the one shown on the left.
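A minimal sketch of the kind of single-trial analysis described here: logistic regression discriminating altered from control epochs, scored by Az (the area under the ROC curve). It uses scikit-learn rather than the authors' pipeline (Parra et al., 2002, 2005), and the trial count, channel count and data are invented.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_predict

def single_window_az(epochs, labels, seed=0):
    """Estimate Az (area under the ROC curve) for discriminating altered
    from unaltered trials in one post-stimulus time window.

    epochs : (n_trials, n_channels) array of EEG features for the window.
    labels : (n_trials,) array, 1 = key alteration present, 0 = control.
    """
    clf = LogisticRegression(max_iter=1000, random_state=seed)
    # Cross-validated probabilities keep the Az estimate honest.
    scores = cross_val_predict(clf, epochs, labels, cv=5,
                               method="predict_proba")[:, 1]
    return roc_auc_score(labels, scores)

# Hypothetical data: 40 trials, 32 channels of random noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 32))
y = rng.integers(0, 2, size=40)
print(single_window_az(X, y))   # ~0.5 for random data
```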

This experiment shows that there is a measurable neural process happening when an expert witnesses something unexpected. In particular, just as with HTM, the key component to deciphering the incoming phenomenon (here, it is audio, *VAudio* ) is the subject's schema ( *SAudio* ) and how it predicts the evolution of *VAudio* in time. In other words, we are playing with the subjects' Level 3 SA (i.e., prediction). If this result is to be used in the analysis of complex systems then it must involve a dynamic system. Luckily, this challenge in complex system analysis is actually a boon for an expert's assessment of a phenomenon, *V* .

#### **5.2 What does this mean for systems engineering?**

As described earlier, systems engineering is a complex process that is largely driven by expert insight into a range of problems. But we see both from the computational SA demonstration and one result (of many related others) from neuroscience that a new way to think about expertise is developing. Furthermore, both results depend on a temporal evolution of data as a means by which SA is created, either computationally or by a human.

It may be possible to integrate this understanding technologically with the needs of systems engineering today. For instance, why not handle the data deluge with an approach that couples the speedy perception of cortical circuitry to the vast computational power of modern computers? This is already being done in remote sensing in the form of cortically-coupled computer vision (Sajda, 2010). Similar versions of this problem exist in aircraft analysis and design, as described earlier. As we become more familiar with the insights on expertise provided by neuroscience, it could only be a matter of time before neural engineering is a needed integrative enabler of analyzing and manipulating increasingly complex and dynamic systems.

#### **6. Conclusion**

The systems engineering task is only getting more difficult. As systems become more complex and demands for reliability increase, there is a growing need – already being met in certain ways – to build appropriate analytic tools to engineer within this context. But systems engineers are increasingly required to analyze an already existing system before the design process begins. In other words, post-factum analysis is a key aspect of the systems engineering process. Consequently, subject matter expertise becomes a key enabler for the design process. However, the elusive nature of expertise remains an intractable aspect of that process. Here, we have shown a computational approach built on insights into expertise and how it was used in a decision-making context. In other words, we showed how the analysis of the provided data enabled justifiable conclusions on an otherwise unpredictable and already established complex system (e.g., the Iraq War, 2003-2008). However, we noticed some shortcomings in this approach and so turned our focus to more basic neuroscience questions about what neural processes occur as expertise is used. In particular, we discussed one recent experiment as an example of current work on the neural markers of expertise. Since such markers are stimulus driven (in particular, the expert reacts to data being either in line with expectations or not), we forecast a potential role for neural engineering in the analysis, and consequent design, of complex systems. Not only does this need exist but such an approach would also complement the current techniques used in this endeavor.

#### **7. Acknowledgment**

We want to thank the Aerospace Systems Design Laboratory at Georgia Institute of Technology for supporting this research.

#### **8. References**

United States Department of Defense (2005). Report to Congress: Measuring Stability and Security in Iraq, July 2005

United States Department of Defense (2005). Report to Congress: Measuring Stability and Security in Iraq, October 2005

United States Department of Defense (2006). Report to Congress: Measuring Stability and Security in Iraq, February 2006

United States Department of Defense (2006). Report to Congress: Measuring Stability and Security in Iraq, May 2006

United States Department of Defense (2006). Report to Congress: Measuring Stability and Security in Iraq, August 2006

United States Department of Defense (2006). Report to Congress: Measuring Stability and Security in Iraq, November 2006

United States Department of Defense (2007). Report to Congress: Measuring Stability and Security in Iraq, March 2007

United States Department of Defense (2007). Report to Congress: Measuring Stability and Security in Iraq, June 2007

United States Department of Defense (2007). Report to Congress: Measuring Stability and Security in Iraq, September 2007

United States Department of Defense (2007). Report to Congress: Measuring Stability and Security in Iraq, December 2007

United States Department of Defense (2008). Report to Congress: Measuring Stability and Security in Iraq, March 2008

United States Department of Defense (2008). Report to Congress: Measuring Stability and Security in Iraq, June 2008

United States House of Representatives (2005). Making Emergency Supplemental Appropriations for the Fiscal Year Ending September 30, 2005, and for Other Purposes, *Conference Report 109-72*, May 2005

Vanderplaats, G. N. (2007). *Multidiscipline Design Optimization*, Vanderplaats R&D, Inc., ISBN 0-944956-04-1, New York, USA


### **An Abstracted and Effective Capabilities Portfolio Management Methodology Using Enterprise or System of Systems Level Architecture**

Joongyoon Lee and Youngwon Park

*Ajou University/SE Technology Ltd., Republic of Korea*

#### **1. Introduction**


The purpose of this chapter is to provide an abstracted methodology for executing Capabilities Portfolio Management (hereafter, CPM) effectively based on the Department of Defense Architecture Framework version 2.0 (hereafter, DoDAF V2.0)1. A methodology is the specification of the process to follow together with the work products to be used and generated, plus the consideration of the people and tools involved, during a development effort. Based on the definition of methodology in ISO 24744 (ISO, 2007), this chapter provides the process, products and modeling-related technology, with consideration of the people and tools involved in CPM. According to DoDAF V2.0, the purpose of developing an architecture is its beneficial use; a good set of architectural artifacts facilitates their manipulation and use in meeting the purposes for which the architecture was developed. Systems engineering methodologies evolve to deal with problems at the enterprise level, the system of systems (hereafter, SoS) level and the family of systems (hereafter, FoS) level, and the CPM of the United States Department of Defense (hereafter, DoD) is a good example of enterprise or SoS level problems. However, the complexity of the metamodel of DoDAF makes it difficult to develop and use the architecture models and their associated artifacts. DoDAF states that it was established to guide the development of architectures and to satisfy the demand for a structured and repeatable method for evaluating alternatives which add value to decisions and management practices. One of the objectives of DoDAF V2.0 is to define concepts and models usable in DoD's six core processes, and DoDAF V2.0 provides a particular methodology for the architecture development process. However, DoDAF, as well as other guidelines, states requirements for CPM (one of DoD's six core processes) rather than how to perform it. This chapter provides an abstracted methodology for CPM which includes the process, abstracted products and tailored meta-models based on the DoDAF Meta Model (hereafter, DM2).

<sup>1</sup> The Department of Defense Architecture Framework (DoDAF) is an architecture framework for the United States Department of Defense that provides structure for a specific stakeholder concern through viewpoints organized by various views (quoted from http://en.wikipedia.org/wiki/DODAF).

#### **2. Current issues on system of systems problems**

The definition of system in DoDAF V2.0 (DoD, Aug. 2010) has been changed from that of DoDAF V1.5. A system is not just computer hardware and software. A system is now defined in the general sense of an assemblage of components (machine or human) that perform activities (since they are subtypes of Performer) and interact or become interdependent. The Federal Enterprise Architecture Practice Guidance (Federal Government, 2007) has defined three types of architecture: enterprise architecture, segment architecture, and solution architecture. "Enterprise architecture" is fundamentally concerned with identifying common or shared assets – whether they are strategies, business processes, investments, data, systems, or technologies. By contrast, "segment architecture" defines a simple roadmap for a core mission area, business service, or enterprise service. "Solution architecture" defines agency IT assets such as applications or components used to automate and improve individual agency business functions. The scope of solution architecture is typically limited to a single project and is used to implement all or part of a system or business solution. From the viewpoint of a system hierarchy, the solution architecture addresses system level problems whereas enterprise architecture and segment architecture address SoS and FoS problems respectively. Systems engineering methodologies have evolved to deal with enterprise or SoS level problems.

The purpose of DoDAF V2.0 is to define concepts and models usable in DoD's six core processes:

1. Capabilities Integration and Development (JCIDS)
2. Planning, Programming, Budgeting, and Execution (PPBE)
3. Acquisition System (DAS)
4. Systems Engineering (SE)
5. Operations Planning
6. Capabilities Portfolio Management (CPM)
The DoD's six core processes are good examples of addressing SoS level problems. However, DoDAF V2.0 and other guidelines state requirements rather than how to perform these processes. This chapter provides a methodology for CPM which contains detailed processes, methods, artifacts and tailored meta-model of DM2.

#### **3. Capability Portfolio Management methodology development guide**

ISO/IEC 24744 (ISO, 2007) defines that a methodology specifies the process to be executed, usually as a set of related activities, tasks and/or techniques, together with what work products must be manipulated (created, used or changed) at each occasion possibly including models, documents and other inputs and outputs. So a methodology is the specification of the process to follow together with the work products to be used and generated, plus techniques which are the consideration of people and tools involved, during a development effort.

#### **3.1 Methodology development requirements**

#### **3.1.1 Process, methods, tools, and environment concept of methodology element**

According to Martin (Martin, 1997), it is important to have a proper balance among process, methods, tools, and environment (PMTE) when performing systems engineering tasks. He defines a process as a logical sequence of tasks performed to achieve a particular objective, a method as consisting of techniques for performing a task, and a tool as an instrument applied to a particular method. While, in ISO/IEC 24744, a method is used as a synonym of methodology, this chapter adopts Martin's PMTE paradigm. So this chapter provides the CPM methodology which has its own process, method (technique), and tool (model or artifacts).

ISO/IEC 24744 (ISO, 2007) also states that a methodology element is a simple component of a methodology. Usually, methodology elements include the specification of what tasks, activities, techniques, models, documents, languages and/or notations can or must be used when applying the methodology. Methodology elements are related to each other, comprising a network of abstract concepts. Typical methodology elements are Capture Requirements, Write Code for Methods (kinds of tasks), Requirements Engineering, High-Level Modelling (kinds of activities), Pseudo-code, Dependency Graphs (notations), Class, Attribute (kinds of model building blocks), Class Model, Class Diagram, Requirements Specification (kinds of work products), etc. From this concept, the elements of the CPM methodology in this chapter are Capture Requirements (top level CPM requirements), High-Level Model of CPM process (kinds of activities), metamodel (Class Diagram), and Attribute.

#### **3.1.2 Metamodel development requirements**

A metamodel is the specification of the concepts, relations and rules that are used to define a methodology. This metamodel should be simple and consistent with the analysis methodology. And the metamodel is a schema for semantic data and a language that supports a particular process, method (technique), and tool (model or artifacts).

Probability and set theory have axioms of mutually exclusive and collectively exhaustive (hereafter, MECE) concepts, and decomposition concepts: no overlap, no omission of a concept, and complete decomposition of a concept. Axiomatic design theory (Suh, 1990) states that design axiom No. 1 is the independence axiom, "Maintain the independence of functions (not affecting other functions)", and design axiom No. 2 is the information axiom, "Minimize the information content of the design (functionally uncoupled design)." These are the same MECE principle seen from different viewpoints, one a set viewpoint and the other a functional design viewpoint. A past study (Lee and Park, 2009) adopted this concept for the metamodel design. The study pointed out that if the metamodel design satisfies the MECE principle, the classes within the metamodel are distinguished from each other clearly, the model composes a complete semantic set, and the classes relate to each other clearly. The metamodel requirements of this study are summarized in Table 1.


Table 1. Metamodel requirements
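The MECE principle invoked above (no overlap across data groups, no omission of concepts) can be checked mechanically for a candidate partition of metamodel concepts. The sketch below is only an illustration of that check; the group and concept names are invented, not taken from DM2.

```python
def is_mece(groups, universe):
    """Check that the proposed data groups are mutually exclusive and
    collectively exhaustive with respect to the concepts they must cover.

    groups   : dict mapping group name -> set of concepts it owns.
    universe : set of all concepts the metamodel must cover.
    """
    names = list(groups)
    # Mutually exclusive: no concept may appear in two groups.
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if groups[a] & groups[b]:
                return False
    # Collectively exhaustive: the union must cover the whole universe.
    return set().union(*groups.values()) == set(universe)

# Hypothetical toy check.
concepts = {"Activity", "Capability", "Resource", "Measure"}
proposal = {"Behaviour": {"Activity"},
            "Ends": {"Capability", "Measure"},
            "Means": {"Resource"}}
print(is_mece(proposal, concepts))   # True
```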

And the study (Lee & Park, 2009) also proposed five rules for developing metamodel and those metamodel development requirements are presented in Table 2.


| No. | Metamodel development requirements |
|-----|------------------------------------|
| 1 | Create the minimum number of data groups |
| 2 | Do not overlap concept across data groups |
| 3 | Make the relation names among groups clear and meaningful |
| 4 | Make the relations among the groups represent systems engineering methodology |
| 5 | Include the operational viewpoint and system viewpoint category while creating groups |

Table 2. Metamodel development requirements

The currently proposed DM2 shows many similar types of classes, which violates Lee & Park's metamodel requirement No. 1.

As mentioned before, the metamodel must be consistent, integrated and balanced between process and methods to achieve the greatest benefits from good systems engineering practice. The systems engineering method teaches that the requirement space and the solution space shall be strictly divided. These attributes of the metamodel result in effective benefits from the viewpoint of building the architecture (e.g. SoS architecting) and of its usage (e.g. CPM).

#### **3.2 Capability Portfolio Management methodology requirements**

#### **3.2.1 Capability Portfolio Management requirements**

DoDD 7045.20 (DoD, Sep. 2008) defines capability portfolio management (CPM) as the process of integrating, synchronizing, and coordinating Department of Defense capability needs with current and planned DOTMLPF2 investments within a capability portfolio to better inform decision making and optimize defense resources; a capability portfolio is a collection of grouped capabilities as defined by JCAs3 and the associated DOTMLPF programs, initiatives, and activities. The top level requirement of CPM is that CPMs shall provide recommendations regarding capability requirements to capability investments. The other requirements, for recommending capability requirements to the Heads of the DoD Components and to the Deputy's Advisory Working Group (DAWG), are that the CPM should evaluate capability demand against resource constraints, identify and assess risks, and suggest capability trade-offs within its capability portfolio. DoDD 7045.20 (DoD, Sep. 2008) provides CPM requirements and responsibilities but does not provide a process and method.

<sup>2</sup> DOTMLPF is an acronym used by the United States Department of Defense. DOTMLPF is defined in the Joint Capabilities Integration Development System (JCIDS) Process. The JCIDS process provides a solution space that considers solutions involving any combination of doctrine, organization, training, materiel, leadership and education, personnel and facilities (DOTMLPF).

<sup>3</sup> Joint Capability Area (JCA) - Collections of like DOD capabilities functionally grouped to support capability analysis, strategy development, investment decision making, capability portfolio management, and capabilities-based force development and operational planning. (http://www.dtic.mil/futurejointwarfare/cap\_areas.htm).


#### **3.2.2 Current status of Capability Portfolio Management methodology**

DM2 of DoDAF V2.0 provides a Conceptual Data Model (hereafter, CDM), a Logical Data Model (LDM), and a Physical Exchange Specification (PES). The LDM provides data groups (classes) and their usage in DoD's six core processes, including CPM. DoDAF V2.0 provides a metamodel which supports the method but does not itself provide a process and methods for CPM. Table 3 shows the DM2 CDM core concepts, which represent the relation among the DM2 Data Groups and DoD's six core processes. Table 3 also shows the twenty-five data groups that are used to develop architectures across DoD's six core processes, including CPM.


Table 3. Relation among DM2 Data Groups and DoD's six core processes

The study (Lee & Park, 2009) points out that the DoDAF metamodel is too complex to use and proposes a more simplified metamodel to enhance usability. Fig. 1 shows the many classes used in DM2. There are still many classes which generate complexity when architecting.

Fig. 1. Notional class hierarchy of DM2

In order to overcome complexity and enhance the usability of the metamodel, Lee & Park proposed another metamodel based on the DoDAF 2.0 JCIDS overlay protocol. Fig. 2 shows Lee & Park's proposal for the CDM. The study articulates that the proposed metamodel is the product of an integrating effort that combines the MECE principles and systems engineering principles. The study also demonstrates that it gives a simple and effective process to develop and use the artifacts of an architecture.

The CDM of the current DM2 of DoDAF V2.0 is similar to Lee & Park's proposed metamodel. Fig. 3 shows the DM2 CDM overlaid with Lee & Park's proposed metamodel, and Table 4 shows the relation between the classes of the DM2 CDM and the Lee & Park proposed metamodel. From the contents viewpoint, a total of eighteen classes of the DM2 CDM are matched with classes of Lee & Park's proposed metamodel. The seven classes of the DM2 CDM unmatched with Lee & Park's are: Data, Information, Agreement, Location, Skill, MeasureType, and PersonType. To maintain consistency with the DM2 CDM, Lee & Park's metamodel was complemented with these seven classes: the three classes Data, Information, and Location are added; the two classes Agreement and Skill are excluded because they are not directly related to CPM; and the other two classes, MeasureType and PersonType, become attributes of Measure and Person. Based on these analysis results, the metamodel of the CPM methodology maintains conceptual consistency with the DM2 CDM.
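The reconciliation just described (three classes added, two excluded, two demoted to attributes) can be captured as a small lookup. The sketch below simply encodes the decisions stated in the text; the function name and structure are illustrative and not part of DM2 or the Lee & Park metamodel.

```python
# A sketch encoding the reconciliation described above between the unmatched
# DM2 CDM classes and the CPM metamodel (class names taken from the text).
ADDED_AS_CLASSES = {"Data", "Information", "Location"}
EXCLUDED = {"Agreement", "Skill"}            # not directly related to CPM
DEMOTED_TO_ATTRIBUTES = {"MeasureType": "Measure", "PersonType": "Person"}

def reconcile(unmatched_dm2_classes):
    """Classify each unmatched DM2 CDM class by how the CPM metamodel
    absorbs it: new class, excluded, or attribute of an existing class."""
    decisions = {}
    for cls in unmatched_dm2_classes:
        if cls in ADDED_AS_CLASSES:
            decisions[cls] = "add as class"
        elif cls in EXCLUDED:
            decisions[cls] = "exclude"
        elif cls in DEMOTED_TO_ATTRIBUTES:
            decisions[cls] = "attribute of " + DEMOTED_TO_ATTRIBUTES[cls]
        else:
            decisions[cls] = "unresolved"
    return decisions

print(reconcile(["Data", "Information", "Agreement", "Location",
                 "Skill", "MeasureType", "PersonType"]))
```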


Fig. 2. Lee & Park proposed metamodel for capability based assessment (CBA) methodology

Fig. 3. DM2 CDM overlay with Lee & Park proposed metamodel


Table 4. Relation between classes of DM2 CDM and Lee & Park proposed metamodel

From the process viewpoint of the methodology status, the CBA guides (DoD, Dec. 2006) have relatively detailed information about the CBA process and methods. The CBA process and related information could be used to perform CPM, but that is not sufficient as a CPM method. The following part provides the CPM process, products and methods, which manipulate the information of the products, and the supporting metamodel.

#### **4. Proposal of Capability Portfolio Management methodology**

As mentioned before, a methodology specifies the process to be executed, usually as a set of related activities, tasks and/or techniques, together with work products, possibly including models and documents. The CPM methodology has its own process, method (technique), and product (model or artifacts) as the tool category of Martin's PMTE. According to these requirements, the CPM methodology of this chapter comprises the CPM process, products and model-related techniques. The CPM process consists of a set of activities/tasks. Each activity step has a corresponding output product and a model-related technique which is used to build the model and/or generate the output products.

In order to facilitate further discussions, key terms quoted from DoDD 7045.20, capability portfolio management (DoD, Sep. 2008), are defined as follows. A capability portfolio is a collection of grouped capabilities as defined by JCAs and the associated DOTMLPF programs, initiatives, and activities. CPM is the process of integrating, synchronizing, and coordinating capability requirements with current and planned DOTMLPF investments within a capability portfolio to better inform decision making and optimize defense resources. From this definition, CPM can produce balanced capability requirements to maximize mission effects within limited resources, and the capability requirements originate from a group of capabilities defined by JCAs.

#### **4.1 Capability Portfolio Management process**

The CPM requirement is to provide recommendations regarding capability requirements to capability investments. So the CPM process has to generate balanced capability requirements. The capability requirements should be generated with DOTMLPF investments within a capability portfolio (a collection of grouped capabilities as defined by JCAs).

To achieve these CPM requirements, the proposed process is composed of the following five activities (more detailed tasks are listed in Table 5 below):

1. Define top level missions and develop scenarios
2. Build trace relations among the elements of JCA, the universal joint task list (hereafter, UJTL) and activities, and identify the mission essential task list (hereafter, METL) of DoD
3. Develop capabilities and the related conditions and resources
4. Analyze mission effectiveness and derive (transform) capability requirements
5. Derive integrated & balanced capability requirements

| Activities of CPM process | Tasks of CPM process |
|---|---|
| A.1 Define top level missions and develop scenarios | T.1 Define top level missions |
| | T.2 Define states & modes for each mission |
| | T.3 Develop mission threads for each states & modes |
| | T.4 Design operational scenarios for each mission |
| A.2 Build trace relation among JCA, UJTL and activity and identify METL | T.5 Trace each activity to UJTL |
| | T.6 Check alignment of JCA, UJTL and allocated activity |
| | T.7 Identify METLs for each mission scenario |
| A.3 Develop capabilities and related conditions and resources | T.8 Develop capability instances aligned to activities (attributed in METLs) |
| | T.9 Develop condition instances for each activity (attributed in METLs) |
| | T.10 Develop resource instances for each activity (attributed in METLs) |
| A.4 Analyze mission effectiveness and derive (transform) capability requirements | T.11 Analyze operational effectiveness (MOEs) for each operational mission (e.g. Joint Operating Concepts, hereafter, JOC) |
| | T.12 Analyze operational effectiveness (MOEs) for functional missions (e.g. Joint Functional Concepts, hereafter, JFC) |
| | T.13 Allocate operational elements to supporting systems elements |
| | T.14 Synthesize operational performances to system performances |
| A.5 Derive integrated & balanced capability requirements | T.15 Optimize resources to maximize MOEs for a capability |
| | T.16 Define integrated capability requirements |

Table 5. Activities and tasks of proposed CPM process

#### **4.2 Capability Portfolio Management method and product**

This part provides, for each task of the CPM process, a description, the product (which could be a model or an artifact) and the model-related technique.



| Activities of CPM process | Tasks of CPM process |
|---|---|
| A.1 Define top level missions and develop scenarios | T.1 Define top level missions |
| | T.2 Define states & modes for each mission |
| | T.3 Develop mission threads for each states & modes |
| | T.4 Design operational scenarios for each mission |
| A.2 Build trace relation among JCA, UJTL and activity and identify METL | T.5 Trace each activity to UJTL |
| | T.6 Check alignment of JCA, UJTL and allocated activity |
| | T.7 Identify METLs for each mission scenario |
| A.3 Develop capabilities and related conditions and resources | T.8 Develop capability instance aligned to activity (attributed in METLs) |
| | T.9 Develop condition instances for each activity (attributed in METLs) |
| | T.10 Develop resource instances for each activity (attributed in METLs) |
| A.4 Analyze mission effectiveness and derive (transform) capability requirements | T.11 Analyze operational effectiveness (MOEs) for each operational mission (e.g. Joint Operating Concepts, hereafter JOC) |
| | T.12 Analyze operational effectiveness (MOEs) for functional missions (e.g. Joint Functional Concepts, hereafter JFC) |
| | T.13 Allocate operational element to supporting systems element |
| | T.14 Synthesize operational performances to system performances |
| | T.15 Optimize resources to maximize MOEs for a capability |
| A.5 Derive integrated & balanced capability requirements | T.16 Define integrated capability requirements |

Table 5. Activities and tasks of proposed CPM process

#### **4.2.1 Define top level missions**

- Description: Defining top level missions is a process of defining the top level missions of an enterprise, in order to provide the point of reference or direction which CPM aims to attain.
- Product: Top level mission statement of an enterprise.
- Model related technique: A mission is a kind of activity, and the mission activities are the top level activities of the activity hierarchy. The level attribute of a mission activity should be set to 'Mission level'.

#### **4.2.2 Define states & modes for each mission**

- Model related technique: States & modes are a kind of activity whose abstraction level lies below the mission activities of the activity hierarchy. Thus the level attribute of a states & modes activity should be set to 'States & modes'.
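The level attribute described in 4.2.1 and 4.2.2 can be pictured with a small sketch; the Python representation and the example mission and states & modes names below are assumptions for illustration only, not taken from the DoD sources.

```python
# Minimal sketch: an activity hierarchy whose nodes carry the 'level' attribute
# ('Mission level' at the top, 'States & modes' one level below), as suggested
# by the model related techniques of 4.2.1 and 4.2.2. Example names are invented.
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ActivityNode:
    name: str
    level: str                                   # e.g. 'Mission level', 'States & modes'
    parent: Optional["ActivityNode"] = None
    children: List["ActivityNode"] = field(default_factory=list)

    def add_child(self, name: str, level: str) -> "ActivityNode":
        child = ActivityNode(name, level, parent=self)
        self.children.append(child)
        return child


# The mission activity is the root of the hierarchy.
mission = ActivityNode("Illustrative top level mission", level="Mission level")
# States & modes activities sit at the abstraction level just below the mission.
mission.add_child("Peacetime state", level="States & modes")
mission.add_child("Crisis response mode", level="States & modes")

for child in mission.children:
    print(f"{mission.name} [{mission.level}] -> {child.name} [{child.level}]")
```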

#### **4.2.3 Develop mission threads for each states & modes**


#### **4.2.4 Design operational scenarios for each mission**


#### **4.2.5 Trace each activity to Universal Joint Task List**


#### **4.2.6 Check alignment of Joint Capability Area, Universal Joint Task List and allocated activity**

- Description: The Joint Capability Area Management System (JCAMS) of DoD provides JCA linkages to Universal Joint Tasks. The activities allocated to the UJTL should be checked for alignment with the JCA from the viewpoint of semantics; that is, the tracing relation activity-UJTL-JCA should be meaningful.
- Product: Traceability table of Activity-UJTL-JCA.
- Model related technique: A JCA class is required, and the elements (contents) of the JCA can be traced to the UJTL. The traceability from a leaf-node activity via the UJTL to the JCA is thereby established.

#### **4.2.7 Identify Mission Essential Task Lists for each mission scenario**

- Description: METLs are decided through a process that identifies the key tasks which directly contribute to achieving mission effectiveness among the leaf-node level activities of a mission scenario. The activities designated as METL have the role of developing capability requirements and are the base activities for the following analysis of the CPM methodology.
- Product: METL list.
- Model related technique: The activity class needs an importance attribute, and the importance attribute of a METL activity should be set to 'METL'.

#### **4.2.8 Develop capability instance aligned to activity**

- Description: The activities identified as METLs should each undergo CBA separately, and appropriate capabilities should be developed in the light of the JCAs. A developed capability is an instance of the capability class, traced to activity instances. The developed capabilities will be traced to the functions of systems or to other requirements of DOTMLPF.
- Product: Traceability table of Activity - Capability.
- Model related technique: A capability class is required aside from the JCA class, and the capability class should have relations with the JCA and activity classes.

#### **4.2.9 Develop condition instances for each activity**

- Description: For the purpose of carrying out CBA, proper conditions for the missions are developed and allocated to the activities identified as METLs. The developed conditions are instances of the conditions of the UJTL.
- Product: Traceability table of Activity - Condition.
- Model related technique: A condition class is required aside from the UJTL class, and the UJTL class has a 'provide' relation with the condition class.

#### **4.2.10 Develop resource instances for each activity**

- Description: Required resources (DOTMLPF) are defined to fulfill the relevant activities. Those resources realize capabilities to support the related activities.
- Product: Traceability table of Activity - Resource - Capability.
- Model related technique: A resource class is required separately from the other performer type classes, e.g. organization and system. The resource class has relations with the activity and capability classes and has a resource type of DOTMLPF. In particular, a resource of type organization is equivalent to the organization class, and a resource of type materiel is equivalent to the system class.
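The classes and traceability relations of 4.2.5 to 4.2.10 can be summarized in a small sketch; the Python classes, attributes and the trace helper below are assumptions made for illustration and are not the DM2 CDM or JCAMS definitions.

```python
# Minimal sketch of the traceability described in 4.2.5-4.2.10:
# activity -> UJTL -> JCA, capability related to JCA and activity,
# conditions instantiated from UJTL, resources of DOTMLPF type realizing capabilities.
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class JCA:                 # Joint Capability Area item
    name: str


@dataclass
class UJTL:                # Universal Joint Task List item
    name: str
    jcas: List[JCA] = field(default_factory=list)        # JCA linkage (cf. JCAMS)


@dataclass
class Condition:
    name: str
    source_ujtl: Optional[UJTL] = None                    # instances of UJTL conditions


@dataclass
class Activity:
    name: str
    importance: str = ""                                   # set to 'METL' for mission essential tasks
    ujtls: List[UJTL] = field(default_factory=list)        # T.5: trace activity to UJTL
    conditions: List[Condition] = field(default_factory=list)  # T.9


@dataclass
class Capability:
    name: str
    jca: Optional[JCA] = None                              # relation with the JCA class
    activities: List[Activity] = field(default_factory=list)   # T.8: aligned activities


@dataclass
class Resource:
    name: str
    dotmlpf_type: str                                      # e.g. 'Materiel', 'Organization'
    realizes: List[Capability] = field(default_factory=list)   # T.10: resources realize capabilities


def trace_activity_to_jca(activity: Activity) -> List[JCA]:
    """Traceability from a leaf-node activity via UJTL to JCA (product of T.5/T.6)."""
    return [jca for ujtl in activity.ujtls for jca in ujtl.jcas]


jca = JCA("Illustrative JCA item")
ujtl = UJTL("Illustrative universal joint task", jcas=[jca])
act = Activity("Illustrative METL activity", importance="METL", ujtls=[ujtl])
print([j.name for j in trace_activity_to_jca(act)])
```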

#### **4.2.11 Analyze operational effectiveness for each operational mission**


#### **4.2.12 Analyze operational effectiveness for functional missions**


#### **4.2.13 Allocate operational element to supporting systems element**

- Description: This phase changes the operational viewpoint to the system viewpoint: the defined organizations, operational nodes, activities and input/output information are allocated to systems, system nodes, system functions and input/output data. Lessons learned from systems engineering imply that system elements should not be considered before this step; instead, the requirements are first defined in the operational viewpoint and are then converted into the system viewpoint in order to support the operational requirements.
- Product: Organization vs. system relation table, operational node vs. system node relation table, operational activity vs. system function relation table, and operational information vs. system data relation table.
- Model related technique: To reflect the systems engineering principle that strictly divides the requirement space from the solution space, the following relations are built: the organization class is supported by the system class; the operational node class is supported by the system node class; the activity class is supported by the function class; the operational information class is supported by the system data class.

#### **4.2.14 Synthesize operational performances to system performances**

- Description: This step ensures that the operational performances, which are derived from operational activities, are changed into system performances, which are derived from system functions. A system function may be employed to support several operational activities; those operational performances are synthesized into an optimized system performance from the viewpoint of cost-effectiveness.
- Product: Operational performances vs. system performances matrix.
- Model related technique: The measure class of operational performance type is traced to the measure class of system performance type.

#### **4.2.15 Optimize resources to maximize operational effectiveness for a capability**

- Description: This is the most distinctive phase of the CPM process. Cost-effectiveness analysis is performed repeatedly to achieve maximum effectiveness under the condition of limited resources, and the capability requirements are defined as the requirements for all resources encompassing DOTMLPF, with those resources traced to one capability under certain items of the JCA. The resulting performances of the resources are synthesized so as to maximize the return on investment (ROI) for the relevant capability.
- Product: Capability vs. Resources matrix.
- Model related technique: The capability class is realized by a relation with resources of DOTMLPF type.
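The repeated cost-effectiveness analysis of 4.2.15 can be pictured with a deliberately crude stand-in; the chapter does not prescribe a particular algorithm, and the greedy budget-constrained selection, the function name and the numbers below are invented for illustration only.

```python
# Minimal sketch: greedily pick DOTMLPF resource options by MOE gain per unit cost
# until the budget is exhausted, as a crude proxy for repeated cost-effectiveness
# analysis (T.15). Data and algorithm are illustrative assumptions, not the method
# proposed in the chapter.
from typing import List, Tuple


def optimize_resources(options: List[Tuple[str, float, float]],
                       budget: float) -> Tuple[List[str], float]:
    """options: (resource name, cost, MOE contribution); returns (selection, total MOE)."""
    selected, spent, total_moe = [], 0.0, 0.0
    # Rank options by effectiveness per cost, a simple ROI-style figure of merit.
    for name, cost, moe in sorted(options, key=lambda o: o[2] / o[1], reverse=True):
        if spent + cost <= budget:
            selected.append(name)
            spent += cost
            total_moe += moe
    return selected, total_moe


options = [
    ("Training programme (T)", 3.0, 4.5),
    ("New sensor materiel (M)", 5.0, 6.0),
    ("Doctrine update (D)", 1.0, 1.2),
    ("Additional personnel (P)", 4.0, 3.5),
]
print(optimize_resources(options, budget=8.0))
```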

#### **4.2.16 Define integrated capability requirements**

- Description: According to the definition of capability, the capability elements, e.g. the desired effects of the various missions, a set of tasks and the combination of means & ways, are defined for a capability using the performance measures of activities, functions and resources.
- Product: Capability recommendation document.
- Model related technique: A 'capability decisive element' attribute is required for the resource, activity and function classes. The elements contributing critically to the resulting capability should be marked.

#### **4.3 Metamodel for Capability Portfolio Management**

From the proposed CPM process, and based on the DM2 CDM and Lee & Park's metamodel, the additionally required classes (entity types), class attributes and relations for each task are identified. These additionally identified classes, relations and attributes are used to complement the metamodel for the CPM methodology. Table 6 shows the additionally required classes, relations and attributes.


Table 6. Proposed CPM Process and required classes and attributes for CPM Process

Like the metamodel proposed in the study (Lee & Park, 2009), the CPM metamodel should be developed in accordance with the metamodel requirements and the metamodel development requirements of Tables 1 and 2. The CPM metamodel should also be aligned with the DM2 CDM for interoperability with DoDAF V2.0. Table 7 shows the proposed CDM classes for CPM, which are aligned with the classes of the DM2 CDM, together with the additional classes originating from Lee & Park's metamodel and the CPM task analysis. The additional JCA and UJTL classes come from the CPM task analysis of Table 6, and the System Node and Function classes reflect the systems engineering concept of strict separation of the requirement space and the solution space.


| DM2 No. | Classes of DM2 CDM core concepts | CPM usage level | Relation with proposed classes | Proposed CDM classes for CPM |
|---|---|---|---|---|
| 1 | Activity | 5 | correspond to | Activity |
| 2 | Agreement | 0 | correspond to | Guidance |
| 3 | Capability | 6 | correspond to | Capability |
| 4 | Condition | 2 | correspond to | Condition |
| 5 | Data | 3 | correspond to | Data |
| 6 | DesiredEffect | 4 | correspond to | Measure (Effect attributed) |
| 7 | Guidance | 2 | correspond to | Guidance |
| 8 | Information | 3 | correspond to | Information |
| 9 | Location | 2 | correspond to | Location |
| 10 | Materiel | 2 | correspond to | Resource (Materiel attributed) |
| 11 | Measure | 4 | correspond to | Measure |
| 12 | MeasureType | 4 | correspond to | Measure (MeasureType attributed) |
| 13 | Organization | 2 | correspond to | Resource (Organization attributed) |
| 14 | Performer | 4 | correspond to | Operational Node |
| 15 | PersonType | 2 | correspond to | Resource (Person attributed) |
| 16 | Project | 3 | correspond to | Project |
| 17 | Resource | 3 | correspond to | Resource |
| 18 | Rule | 2 | correspond to | Guidance |
| 19 | Service | 3 | correspond to | Activity (Service attributed) |
| 20 | Skill | 2 | correspond to | Resource (Skill attributed) |
| 21 | Standard | 2 | correspond to | Guidance |
| 22 | System | 5 | correspond to | System |
| 23 | Vision | 3 | correspond to | Measure (Vision attributed) |
| 24 | Architectural Description | 4 | correspond to | Architecture |
| 25 | Constraint | 2 | correspond to | Condition |
| - | N/A | - | - | System Node |
| - | N/A | - | - | Function |
| - | N/A | - | - | JCA |
| - | N/A | - | - | UJTL |

Table 7. Relation between DM2 CDM core concepts and Lee's CDM classes for CPM

In accordance with the metamodel development requirements, the classes are related and named meaningfully and reflect the operational requirement space and the system solution space. Fig. 4 shows the resulting CDM for the CPM methodology.

Fig. 4. Proposed CDM for CPM methodology
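To make the separation of the operational (requirement) space from the system (solution) space in the proposed CDM more concrete, the 'supported by' relations of 4.2.13 can be sketched as follows; the Python classes and the example names are assumptions for illustration and are not the published CDM of Fig. 4.

```python
# Minimal sketch of the strict separation of requirement and solution spaces:
# operational elements are linked to system elements only through 'supported by'
# relations (cf. 4.2.13 and the proposed CDM). Names are illustrative assumptions.
from dataclasses import dataclass, field
from typing import List


# System (solution) space
@dataclass
class SystemFunction:
    name: str


@dataclass
class SystemNode:
    name: str


# Operational (requirement) space
@dataclass
class OperationalActivity:
    name: str
    supported_by: List[SystemFunction] = field(default_factory=list)


@dataclass
class OperationalNode:
    name: str
    supported_by: List[SystemNode] = field(default_factory=list)


# T.13: allocate operational elements to the supporting system elements.
activity = OperationalActivity("Illustrative operational activity")
activity.supported_by.append(SystemFunction("Illustrative system function"))
node = OperationalNode("Illustrative operational node")
node.supported_by.append(SystemNode("Illustrative system node"))

print(activity.name, "is supported by", [f.name for f in activity.supported_by])
print(node.name, "is supported by", [n.name for n in node.supported_by])
```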

#### **5. Conclusion**


The purpose of this chapter is to provide an abstracted metamodel for performing CPM effectively on the basis of DoDAF V2.0. The proposed CPM methodology provides a process, the tasks of that process, products, and model related techniques which support the generation of the products, in accordance with the methodology definition of ISO/IEC 24744. To promote usability, the proposed methodology suggests a detailed CPM process. Additionally, in order to be an effective and efficient methodology, the CPM metamodel is developed in accordance with the MECE principles and the systems engineering principles proposed earlier as Lee & Park's metamodel requirements. To obtain interoperability with DoDAF V2.0, the proposed CPM methodology is developed in accordance with the DM2 CDM.

However, the currently proposed abstracted metamodel remains at a theoretical and logical level and requires validation in experiments or field applications. In the near future, the proposed metamodel must be validated in real applications. Nevertheless, the proposed CPM methodology is expected to be helpful in practice in the field.

#### **6. References**

