embedded systems. An important concern is *soundness* (i.e., whether the obtained timing results are guaranteed to generalize to all system executions). Timing analysis via executing the actual system or a model thereof cannot give guarantees (i.e., the approach is unsound), but heuristics to effectively guide the runs can be used to improve the confidence in the obtained results.<sup>14</sup> A sound approach operates under the assumption that the underlying model is valid. For WCET the tool is trusted to provide a valid hardware model; for model checking the timing automata are (semi-automatically) synthesized from the system and thus model validation is highly desirable.<sup>15</sup>

<sup>14</sup> This is similar to the problems of general software testing; the method can only be used to show the presence of errors, not to prove the absence of errors. Nonetheless, a simulation-based analysis can identify extreme scenarios, e.g., very high response-times which may violate the system requirements, even though worst case scenarios are not identified.

<sup>15</sup> The simulation community has long recognized the need for model validation, while the model checking community has mostly neglected this issue.

Since both model synthesis and validation involve manual effort, scalability to large systems is a major concern for both model checking and simulation. However, at this time simulation offers better tool support and less manual effort. Another scalability concern for model checking is the state-space explosion problem. One can argue that improvements in model checking techniques and faster hardware alleviate this concern, but this is at least partially countered by the increasing complexity of embedded systems. Simulation, in contrast, avoids the state-space explosion problem by sacrificing the guaranteed safety of the result. In a simulation, the state space of the model is sampled rather than searched exhaustively.

With respect to applicability, execution time analysis (both static and hybrid) is not suitable for complex embedded systems and it appears this will be the case for the foreseeable future. The static approach is restricted to smaller systems with simple hardware; the hybrid approach does overcome the problem of modeling the hardware, but is still prohibitive for systems with nontrivial scheduling regimes and data/control dependencies between tasks. Model checking is increasingly viable for model-driven approaches, but mature tool support is lacking to synthesize models from source code. Thus, model checking may be applicable in principle, but costs are significant and as a result a more favorable cost-to-benefit ratio likely can be obtained by redirecting effort elsewhere. Simulation arguably is the most attractive approach for industry, but because it is unsound a key concern is quality assurance. Since industry is very familiar with another unsound technique, testing, expertise from testing can be relatively easily transferred to simulation. Also, synthesis of models seems feasible with reasonable effort even though mature tool support is still lacking.

**5. Discussion**

Based on the review of reverse engineering literature (cf. Section 3) and our own expertise in the domain of complex embedded systems, we try to establish the current state-of-the-art/practice and identify research challenges.

It appears that industry is starting to realize that approaches are needed that enable them to maintain and evolve their complex embedded "legacy" systems in a more effective and predictable manner. There is also the realization that reverse engineering techniques are one important enabling factor to reach this goal. An indication of this trend is the Darwin project (van de Laar et al., 2011), which was supported by Philips and has developed reverse engineering tools and techniques for complex embedded systems using a Philips MRI scanner (8 million lines of code) as a real-world case study. Another example is the E-CARES project, which was conducted in cooperation with Ericsson Eurolab and looked at the AXE10 telecommunications system (approximately 10 million lines of PLEX code developed over about 40 years) (Marburger & Herzberg, 2001; Marburger & Westfechtel, 2010).

In the following we structure the discussion into static/dynamic fact extraction, followed by static and dynamic analyses.

#### **5.1 Fact extraction**

Obtaining facts from the source code or the running system is the first step of any reverse engineering effort. Extracting static facts from complex embedded systems is challenging because they often use C/C++, which is difficult to parse and analyze. While C is already challenging to parse (e.g., due to the C preprocessor), C++ poses additional hurdles (e.g., due to templates and namespaces). Edison Design Group (EDG) offers a full front-end for C/C++, which is very mature and able to handle a number of different standards and dialects. Maintaining such a front-end is complex; according to EDG it has more than half a million lines of C code, of which one-third are comments. EDG's front-end is used by many compiler vendors and static analysis tools (e.g., Coverity, CodeSurfer, Axivion's Bauhaus Suite, and the ROSE compiler infrastructure). Coverity's developers believe that the EDG front-end "probably resides near the limit of what a profitable company can do in terms of front-end gyrations," but also that it "still regularly meets defeat when trying to parse real-world large code bases" (Bessey et al., 2010). Other languages that one can encounter in the embedded systems domain, ranging from assembly to PLEX and Erlang, all have their own idiosyncratic challenges. For instance, the Erlang language has many dynamic features that make it difficult to obtain precise and meaningful static information.
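As a small illustration of why the C preprocessor complicates fact extraction, the following sketch evaluates a deliberately minimal subset of conditional compilation (`#ifdef`, `#ifndef`, `#else`, `#endif`) and shows that the declarations an extractor sees depend on which macros are defined. The function `preprocess`, the macro `TARGET_ARM`, and the declarations are hypothetical and chosen only for this example; a real preprocessor also handles `#if` expressions, macro expansion, and includes.

```python
def preprocess(lines, defined):
    """Return the source lines that survive for a given set of defined macros.

    Only #ifdef, #ifndef, #else and #endif are handled (illustrative subset).
    """
    out = []
    stack = []  # one entry per open conditional: True if its current branch is active
    for line in lines:
        stripped = line.strip()
        if stripped.startswith("#ifdef "):
            stack.append(stripped.split()[1] in defined)
        elif stripped.startswith("#ifndef "):
            stack.append(stripped.split()[1] not in defined)
        elif stripped == "#else":
            stack[-1] = not stack[-1]
        elif stripped == "#endif":
            stack.pop()
        elif all(stack):
            # A line is kept only if every enclosing branch is active.
            out.append(line)
    return out


# Hypothetical source with two mutually exclusive declarations.
SOURCE = [
    "#ifdef TARGET_ARM",
    "void irq_enable(void);",
    "#else",
    "void irq_enable(int mask);",
    "#endif",
    "int main(void);",
]

arm_view = preprocess(SOURCE, {"TARGET_ARM"})
default_view = preprocess(SOURCE, set())
print(arm_view)
print(default_view)
```

An extractor that analyzes only one configuration records only one of the two `irq_enable` signatures, which is one reason facts extracted from preprocessed code may not hold for the code base as a whole.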

Extractors have to be robust and scalable. For C there are now a number of tools available with fact extractors that are suitable for complex embedded systems. Examples of tools with fine-grained fact bases are Coverity, CodeSurfer, Columbus (www.frontendart.com), Bauhaus (www.axivion.com/), and the Clang Static Analyzer (clang-analyzer.llvm.org/); an example of a commercial tool with a coarse-grained fact base is Understand (www.scitools.com). For fine-grained extractors, scalability is still a concern for larger systems of more than half a million lines of code; coarse-grained extractors can be quite fast while handling very large systems. For example, in a case study the Understand tool extracted facts from a system with more than one million lines of C code in less than 2 minutes (Kraft, 2010, page 144). In another case study, it took CodeSurfer about 132 seconds to process about 100,000 lines of C code (Yazdanshenas & Moonen, 2011).
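To put the two case-study figures side by side, they can be converted to lines of code per second. The sketch below only performs this arithmetic; the two studies used different systems and hardware, so the resulting ratio is indicative at best. Because the text reports "more than one million" lines in "less than 2 minutes" for Understand, its value is a lower bound.

```python
def loc_per_second(lines_of_code: int, seconds: float) -> float:
    """Return extraction throughput in lines of code per second."""
    return lines_of_code / seconds


# Figures as reported in the text above.
understand = loc_per_second(1_000_000, 120)  # coarse-grained extractor (lower bound)
codesurfer = loc_per_second(100_000, 132)    # fine-grained extractor (approximate)

print(f"Understand: at least {understand:.0f} LOC/s")
print(f"CodeSurfer: about {codesurfer:.0f} LOC/s")
print(f"Ratio: at least {understand / codesurfer:.0f}x")
```

Under these figures the coarse-grained extraction is roughly an order of magnitude faster per line, which is consistent with the scalability concern raised for fine-grained extractors.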

Fact extractors typically focus on a certain programming language per se, neglecting the (heterogeneous) environment that the code interacts with. In particular, fact extractors do not accommodate the underlying hardware (e.g., ports and interrupts), which is mapped to programming constructs or idioms in some form. Consequently, it is difficult or impossible for downstream analyses to realize domain-specific analyses. In C code for embedded systems one can often find embedded assembly. Depending on the C dialect, different constructs are used.<sup>16</sup> Robust extractors can recognize embedded assembly, but analyzing it is beyond their capabilities (Balakrishnan & Reps, 2010).

Extracting facts from the running system has the advantage that generic monitoring functionality is typically provided by the hardware and the real-time operating system. However, obtaining finer-grained facts of the system's behavior is often prohibitive because of the monitoring overhead and the probing effect. The amount of tracing data is restricted by the hardware resources. For instance, for ABB robots around 10 seconds (100,000 events) of history are available, which are kept in a ring buffer (Kraft et al., 2010). For the Darwin project, Arias et al. (2011) say "we observed that practitioners developing large and complex software systems desire minimal changes in the source code [and] minimal overhead in the system response time." In the E-CARES project, tracing data could be collected within an emulator (using a virtual time mode); since tracing jobs have highest priority, in the real environment the system could experience timing problems (Marburger & Herzberg, 2001).
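The ring buffer mentioned above can be sketched as a fixed-capacity event store that silently overwrites the oldest entries. This is a minimal illustration, not the ABB implementation: the class `TraceRingBuffer`, the `TraceEvent` fields, and the `dropped` counter are assumptions made for the example.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class TraceEvent:
    timestamp_us: int  # hypothetical field: time of the event in microseconds
    task: str          # hypothetical field: name of the task involved
    kind: str          # hypothetical field: e.g. "start", "preempt", "finish"


class TraceRingBuffer:
    """Keep only the most recent `capacity` events, overwriting the oldest."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._events = deque(maxlen=capacity)
        self.dropped = 0  # number of events overwritten so far

    def record(self, event: TraceEvent) -> None:
        if len(self._events) == self._events.maxlen:
            self.dropped += 1  # the oldest event is about to be discarded
        self._events.append(event)

    def snapshot(self) -> list:
        """Return the retained history, oldest event first."""
        return list(self._events)


# Usage: a buffer of three events that receives five.
buf = TraceRingBuffer(capacity=3)
for t in range(5):
    buf.record(TraceEvent(timestamp_us=t, task="control", kind="start"))
history = [e.timestamp_us for e in buf.snapshot()]
print(history, buf.dropped)
```

The design choice is the one described in the text: memory use is bounded by the capacity, at the price that only a limited window of history is available for later analysis.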

For finer-grained tracing data, strategic decisions on what information needs to be traced have to be made. Thus, data extraction and data use (analysis and visualization) have to be coordinated. Also, to obtain certain events the source code may have to be selectively instrumented in some form. As a result, tracing solutions cannot exclusively rely on generic approaches, but need to be tailored to fit a particular goal. The Darwin project proposes a tailorable architecture reconstruction approach based on logging and run-time information. The approach makes "opportunistic" use of existing logging information based on the assumption that "logging is a feature often implemented as part of large software systems to record and store information of their specific activities into dedicated files" (Arias et al., 2011).
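One way to make opportunistic use of existing logs is to describe, per analysis goal, which log lines map to which events. The following sketch assumes a hypothetical log format and a hypothetical rule set; it is not the Darwin project's implementation, only an illustration of tailoring extraction to a goal (here, task execution times).

```python
import re
from typing import NamedTuple, Optional


class Event(NamedTuple):
    time_ms: int
    kind: str
    subject: str


# Hypothetical mapping rules: each pattern must define the groups "time" and "subject".
RULES = [
    ("task_start", re.compile(r"^(?P<time>\d+) ms .*starting (?P<subject>\w+)$")),
    ("task_end", re.compile(r"^(?P<time>\d+) ms .*finished (?P<subject>\w+)$")),
]


def extract_event(line: str) -> Optional[Event]:
    """Map one log line to an event, or None if no rule applies."""
    for kind, pattern in RULES:
        match = pattern.match(line.strip())
        if match:
            return Event(int(match.group("time")), kind, match.group("subject"))
    return None


def durations(events):
    """Pair start and end events per subject and return elapsed times in ms."""
    started, result = {}, {}
    for event in events:
        if event.kind == "task_start":
            started[event.subject] = event.time_ms
        elif event.kind == "task_end" and event.subject in started:
            result[event.subject] = event.time_ms - started.pop(event.subject)
    return result


# Hypothetical log excerpt; the second line matches no rule and is ignored.
LOG = [
    "100 ms INFO starting Calibration",
    "180 ms DEBUG heartbeat",
    "250 ms INFO finished Calibration",
]
events = [e for e in map(extract_event, LOG) if e is not None]
task_durations = durations(events)
print(task_durations)
```

Changing the analysis goal then amounts to changing the rule set rather than the logging code, which matches the desire for minimal changes to the system under study.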

After many years of research on scalable and robust static fact extractors, mature tools have finally emerged for C, but they are still challenged by the idiosyncrasies of complex embedded systems. For C++ we are not aware of solutions that have reached a level of maturity that matches C, especially considering the latest iteration of the standard, C++11. Extraction of dynamic information is also more challenging for complex embedded systems compared to desktop applications, but it is attractive because for many systems it is relatively easy to realize while providing valuable information to better understand and evolve the system.

#### **5.2 Static analyses**

Industry is using static analysis tools for the evolution of embedded systems and there is a broad range of them. Examples of common static checks include stack space analysis, memory leakage, race conditions, and data/control coupling. Examples of tools are PC-lint (Gimpel Software), CodeSurfer, and Coverity Static Analysis. While these checkers are not strictly reverse engineering analyses, they can aid program understanding.

Static checkers for complex embedded systems face several adoption hurdles. Introducing them for an existing large system produces a huge amount of diagnostic messages, many of which are false positives. Processing these messages requires manual effort and is often prohibitively expensive. (For instance, Boogerd & Moonen (2009) report on a study where 30% of the lines of code in an industrial system triggered non-conformance warnings with respect to MISRA C rules.) For complex embedded systems, analyses for concurrency bugs are most desirable. Unfortunately, Ornburn & Rugaber (1992) "have observed that because of the flexibility multiprocessing affords, there is an especially strong temptation to use ad hoc solutions to design problems when developing real-time systems." Analyses have a high rate of false positives and it is difficult to produce succinct diagnostic messages that can be easily confirmed or refuted by programmers. In fact, Coverity's developers say that "for many years we gave up on checkers that flagged concurrency errors; while finding such errors was not too difficult, explaining them to many users was" (Bessey et al., 2010).

<sup>16</sup> The developers of the Coverity tool say (Bessey et al., 2010): "Assembly is the most consistently troublesome construct. It's already non-portable, so compilers seem to almost deliberately use weird syntax, making it difficult to handle in a general way."

Generally, compared to Java and C#, the features and complexity of C—and even more so of C++—make it very difficult or impossible to realize robust and precise static analyses that are applicable across all kinds of code bases. For example, analysis of pointer arithmetic in C/C++ is a prerequisite to obtain precise static information, but in practice pointer analysis is a difficult problem and consequently there are many approaches that exhibit different trade-offs depending on context-sensitivity, heap modeling, aggregate modeling, etc. (Hind, 2001). For C++ there are additional challenges such as dynamic dispatch and template metaprogramming. In summary, while these general approaches to static code analysis can be valuable, we believe that they should be augmented with more dedicated (reverse engineering) analyses that take into account specifically the target system's peculiarities (Kienle et al., 2011).
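To make one of these trade-offs concrete, the following sketch implements a very small inclusion-based (Andersen-style), flow- and context-insensitive points-to analysis. It is an illustration chosen for this discussion, not a technique taken from the cited survey, and it ignores pointer arithmetic, the heap, and aggregates entirely. The statement encoding and the function `points_to` are assumptions made for the example.

```python
from collections import defaultdict


def points_to(statements):
    """Compute may-point-to sets for a list of (op, lhs, rhs) statements.

    op is one of: "addr"  (lhs = &rhs), "copy"  (lhs = rhs),
                  "load"  (lhs = *rhs), "store" (*lhs = rhs).
    Statement order is ignored (flow-insensitive); iterate to a fixed point.
    """
    pts = defaultdict(set)

    def add(var, targets):
        before = len(pts[var])
        pts[var] |= targets
        return len(pts[var]) != before

    changed = True
    while changed:
        changed = False
        for op, lhs, rhs in statements:
            if op == "addr":
                changed |= add(lhs, {rhs})
            elif op == "copy":
                changed |= add(lhs, set(pts[rhs]))
            elif op == "load":
                for target in list(pts[rhs]):
                    changed |= add(lhs, set(pts[target]))
            elif op == "store":
                for target in list(pts[lhs]):
                    changed |= add(target, set(pts[rhs]))
            else:
                raise ValueError(f"unknown statement kind: {op}")
    return {var: set(targets) for var, targets in pts.items() if targets}


# Corresponds to: p = &a; q = &b; p = q; r = &p; s = *r;
PROGRAM = [
    ("addr", "p", "a"),
    ("addr", "q", "b"),
    ("copy", "p", "q"),
    ("addr", "r", "p"),
    ("load", "s", "r"),
]
result = points_to(PROGRAM)
print(result)
```

After the assignment `p = q` the pointer `p` can only refer to `b` at run time, yet the analysis reports that it may refer to `a` or `b`. This loss of precision is the price of ignoring statement order, and it is the kind of trade-off between cost and precision that the different approaches make.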

Architecture and design recovery is a promising reverse engineering approach for system understanding and evolution (Koschke, 2009; Pollet et al., 2007). While there are many tools and techniques, very few are targeted at, or applied to, complex embedded systems. Choi & Jang (2010) describe a method to recursively synthesize components from embedded software. At the lowest level components have to be identified manually. The resulting component model can then be validated using model simulation or model checking techniques. Marburger & Westfechtel (2010) present a tool to analyze PLEX code, recovering architectural information. The static analysis identifies blocks and signaling between blocks, both being key concepts of PLEX. Based on this PLEX-specific model, a higher-level description is synthesized, which is described in the ROOM modeling language. The authors state that Ericsson's "experts were more interested in the coarse-grained structure of the system under study rather than in detailed code analysis." Research has identified the need to construct architectural viewpoints that address communication protocols and concurrency as well as timing properties such as deadlines and throughput of tasks (e.g., Eixelsberger et al., 1998; Stoermer et al., 2003), but concrete techniques to recover them are missing.
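The interest in coarse-grained structure can be illustrated by lifting fine-grained facts to the component level. The sketch below aggregates function-level call facts into component-level dependencies; the function `lift_calls`, the component names, and the call facts are hypothetical and do not reproduce any of the cited tools.

```python
from collections import Counter


def lift_calls(calls, owner):
    """Lift (caller, callee) function pairs to component-level dependencies.

    `owner` maps each function to its component. The result maps
    (source component, target component) to the number of underlying calls;
    calls within a single component are omitted.
    """
    dependencies = Counter()
    for caller, callee in calls:
        source, target = owner[caller], owner[callee]
        if source != target:
            dependencies[(source, target)] += 1
    return dependencies


# Hypothetical fact base: which component owns which function, and who calls whom.
OWNER = {
    "read_sensor": "Drivers",
    "filter": "Control",
    "update_pid": "Control",
    "log_msg": "Logging",
}
CALLS = [
    ("update_pid", "read_sensor"),
    ("update_pid", "filter"),   # intra-component call, not lifted
    ("filter", "log_msg"),
    ("update_pid", "log_msg"),
]
component_view = lift_calls(CALLS, OWNER)
print(dict(component_view))
```

In practice the hard part is obtaining the `owner` mapping, which, as noted above for the lowest level of components, may have to be identified manually.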

Static analyses are often geared towards a single programming language. However, complex embedded system can be heterogenous. The Philips MRI scanner uses many languages, among them C, C++/STL, C#, VisualBasic and Perl (Arias et al., 2011); the AXE10 system's PLEX code is augmented with C++ code (Marburger & Westfechtel, 2010); Kettu et al. (2008) talk about a complex embedded system that "is based on C/C++/Microsoft COM technology and has started to move towards C#/.NET technology, with still the major and core parts of the codebase remaining in old technologies." The reverse engineering community has neglected (in general) multi-language analyses, but they would be desirable—or are often necessary—for complex embedded systems (e.g., recovery of communication among tasks implemented in different languages). One approach to accommodate heterogenous systems with less tooling effort could be to focus on binaries and intermediate representations rather

embedded systems. We also believe that research into hybrid analyses that augment static

Software Reverse Engineering in the Domain of Complex Embedded Systems 23

Runtime verification and monitoring is a domain that to our knowledge has not been explored for complex embedded systems yet. While most work in this area addresses Java, Havelund (2008) presents the RMOR framework for monitoring of C systems. The idea of runtime verification is to specify dynamic system behavior in a modeling language, which can then be checked against the running system. (Thus, the approach is not sound because conformance is always established with respect to a single run.) In RMOR, expected behavior is described as state machines (which can express safety and liveness properties). RMOR then instruments the system and links it with the synthesized monitor. The development of RMOR has been driven in the context of NASA embedded systems, and two case studies are briefly presented, one of them showing "the need for augmenting RMOR with the ability to express time

This chapter has reviewed reverse engineering techniques and tools that are applicable for complex embedded systems. From a research perspective, it is unfortunate that the research communities of reverse engineering and embedded and real-time systems are practically disconnected. As we have argued before, embedded systems are an important target for reverse engineering, offering unique challenges compared to desktop and business

Since industry is dealing with *complex* embedded systems, reverse engineering tools and techniques have to scale to larger code bases, handle the idiosyncracies of industrial code (e.g., C dialects with embedded assembly), and provide domain-specific solutions (e.g., synthesis of timing properties). For industrial practitioners, adoption of research techniques and tools has many hurdles because it is very difficult to assess the applicability and suitability of proposed techniques and the quality of existing tools. There are huge differences in quality of both commercial and research tools and different tools often fail in satisfying different industrial requirements so that no tool meets all of the minimum requirements. Previously, we have argued that the reverse engineering community should elevate adoptability of their tools as a key requirement for success (Kienle & Müller, 2010). However, this needs to go hand in hand with a change in research methodology towards more academic-industrial collaboration as

Just as in other domains, reverse engineering for complex embedded systems is facing adoption hurdles because tools have to show results in a short time-frame and have to integrate smoothly into the existing development process. Ebert & Salecker (2009) observe that for embedded systems "research today is fragmented and divided into technology, application, and process domains. It must provide a consistent, systems-driven framework for systematic modeling, analysis, development, test, and maintenance of embedded software in line with embedded systems engineering." Along with other software engineering areas,

Reverse engineering may be able to profit from, and contribute to, research that recognizes the growing need to analyze systems with multi-threading and multi-core. Static analyses and model checking techniques for such systems may be applicable to complex embedded systems

information with dynamic timing properties is needed.

well as a change in the academic rewards structure.

reverse engineering research should take up this challenge.

constraints."

**6. Conclusion**

applications.

than source code (Kettu et al., 2008). This approach is most promising if source code is transformed to an underlying intermediate representation or virtual machine (e.g., Java bytecode or .NET CIL code) because in this case higher-level information is often preserved. In contrast, if source code is translated to machine-executable binaries, which is typically the case for C/C++, then most of the higher-level information is lost. For example, for C++ the binaries often do not allow to reconstruct all classes and their inheritance relationships (Fokin et al., 2010).

Many complex embedded systems have features of a product line (because the software supports a portfolio of different devices). Reverse engineering different configurations and variablity points would be highly desirable. A challenge is that often ad hoc techniques are used to realize product lines. For instance, Kettu et al. (2008) describe a C/C++ system that uses a number different techniques such as conditional compilation, different source files and linkages for different configurations, and scripting. Generally, there is research addressing product lines (e.g., (Alonso et al., 1998; Obbink et al., 1998; Stoermer et al., 2003)), but there are no mature techniques or tools of broader applicability.

#### **5.3 Dynamic analyses**

Research into dynamic analyses have increasingly received more attention in the reverse engineering community. There are also increasingly hybrid approaches that combine both static and dynamic techniques. Dynamic approaches typically provide information about a single execution of the system, but can also accumulate information of multiple runs.

Generally, since dynamic analyses naturally produce (time-stamped) event sequences, they are attractive for understanding of timing properties in complex embedded systems. The Tracealyzer is an example of a visualization tool for embedded systems focusing on high-level runtime behavior, such as scheduling, resource usage and operating system calls (Kraft et al., 2010). It displays task traces using a novel visualization technique that focuses on the task preemption nesting and only shows active tasks at a given point in time. The Tracealyzer is used systematically at ABB Robotics and its approach to visualization has proven useful for troubleshooting and performance analysis. The E-CARES project found that "structural [i.e., static] analysis . . . is not sufficient to understand telecommunication systems" because they are highly dynamic, flexible and reactive (Marburger & Westfechtel, 2003). E-CARES uses tracing that is configurable and records events that relate to signals and assignments to selected state variables. Based on this information UML collaboration and sequence diagrams are constructed that can be shown and animated in a visualizer. The Darwin project relies on dynamic analyses and visualization for reverse engineering of MRI scanners. Customizable mapping rules are used to extract events from logging and run-time measurements to construct so-called execution viewpoints. For example, there are visualizations that show with different granularity the system's resource usage and start-up behavior in terms of execution times of various tasks or components in the system (Arias et al., 2009; 2011).

Cornelissen et al. (2009) provide a detailed review of existing research in dynamic analyses for program comprehension. They found that most research focuses on object-oriented software and that there is little research that targets distributed and multi-threaded applications. Refocusing research more towards these neglected areas would greatly benefit complex embedded systems. We also believe that research into hybrid analyses that augment static information with dynamic timing properties is needed.

Runtime verification and monitoring is a domain that to our knowledge has not been explored for complex embedded systems yet. While most work in this area addresses Java, Havelund (2008) presents the RMOR framework for monitoring of C systems. The idea of runtime verification is to specify dynamic system behavior in a modeling language, which can then be checked against the running system. (Thus, the approach is not sound because conformance is always established with respect to a single run.) In RMOR, expected behavior is described as state machines (which can express safety and liveness properties). RMOR then instruments the system and links it with the synthesized monitor. The development of RMOR has been driven in the context of NASA embedded systems, and two case studies are briefly presented, one of them showing "the need for augmenting RMOR with the ability to express time constraints."

#### **6. Conclusion**





This chapter has reviewed reverse engineering techniques and tools that are applicable for complex embedded systems. From a research perspective, it is unfortunate that the research communities of reverse engineering and embedded and real-time systems are practically disconnected. As we have argued before, embedded systems are an important target for reverse engineering, offering unique challenges compared to desktop and business applications.

Since industry is dealing with *complex* embedded systems, reverse engineering tools and techniques have to scale to larger code bases, handle the idiosyncrasies of industrial code (e.g., C dialects with embedded assembly), and provide domain-specific solutions (e.g., synthesis of timing properties). For industrial practitioners, adoption of research techniques and tools has many hurdles because it is very difficult to assess the applicability and suitability of proposed techniques and the quality of existing tools. There are huge differences in the quality of both commercial and research tools, and different tools often fail to satisfy different industrial requirements, so that no tool meets all of the minimum requirements. Previously, we have argued that the reverse engineering community should elevate adoptability of their tools as a key requirement for success (Kienle & Müller, 2010). However, this needs to go hand in hand with a change in research methodology towards more academic-industrial collaboration as well as a change in the academic rewards structure.

Just as in other domains, reverse engineering for complex embedded systems is facing adoption hurdles because tools have to show results in a short time-frame and have to integrate smoothly into the existing development process. Ebert & Salecker (2009) observe that for embedded systems "research today is fragmented and divided into technology, application, and process domains. It must provide a consistent, systems-driven framework for systematic modeling, analysis, development, test, and maintenance of embedded software in line with embedded systems engineering." Along with other software engineering areas, reverse engineering research should take up this challenge.

Reverse engineering may be able to profit from, and contribute to, research that recognizes the growing need to analyze systems with multi-threading and multi-core. Static analyses and model checking techniques for such systems may be applicable to complex embedded systems as well. Similarly, research in runtime-monitoring/verification and in the visualization of streaming applications may be applicable to certain kinds of complex embedded systems.

Lastly, reverse engineering for complex embedded systems is facing an expansion of system boundaries. For instance, medical equipment is no longer a stand-alone system, but a node in the hospital network, which in turn is connected to the Internet. Car navigation and driver assistance can be expected to be increasingly networked. Similar developments are underway for other application areas. Thus, research will have to broaden its view towards software-intensive systems and even towards systems of systems.

#### **7. References**

Abdelzaher, L. S. T., Arzen, K.-E., Cervin, A., Baker, T., Burns, A., Buttazzo, G., Caccamo, M., Lehoczky, J. & Mok, A. K. (2004). Real time scheduling theory: A historical perspective, *Real-Time Systems* 28(2–3): 101–155.

Ackermann, C., Cleaveland, R., Huang, S., Ray, A., Shelton, C. & Latronico, E. (2010). *1st International Conference on Runtime Verification (RV 2010)*, Vol. 6418 of *Lecture Notes in Computer Science*, Springer-Verlag, chapter Automatic Requirements Extraction from Test Cases, pp. 1–15.

Adnan, R., Graaf, B., van Deursen, A. & Zonneveld, J. (2008). Using cluster analysis to improve the design of component interfaces, *23rd IEEE/ACM International Conference on Automated Software Engineering (ASE'08)* pp. 383–386.

Åkerholm, M., Carlson, J., Fredriksson, J., Hansson, H., Håkansson, J., Möller, A., Pettersson, P. & Tivoli, M. (2007). The SAVE approach to component-based development of vehicular systems, *Journal of Systems and Software* 80(5): 655–667.

Åkerholm, M., Land, R. & Strzyz, C. (2009). Can you afford not to certify your control system?, *iVTinternational* p. 16. http://www.ivtinternational.com/legislative\_focus\_nov.php.

Alonso, A., Garcia-Valls, M. & de la Puente, J. A. (1998). Assessment of timing properties of family products, *Development and Evolution of Software Architectures for Product Families, Second International ESPRIT ARES Workshop*, Vol. 1429 of *Lecture Notes in Computer Science*, Springer-Verlag, pp. 161–169.

Alur, R., Courcoubetis, C. & Dill, D. L. (1993). Model-checking in dense real-time, *Information and Computation* 104(1): 2–34. http://citeseer.ist.psu.edu/viewdoc/versions?doi=10.1.1.26.7610.

Andersson, J., Huselius, J., Norström, C. & Wall, A. (2006). Extracting simulation models from complex embedded real-time systems, *1st International Conference on Software Engineering Advances (ICSEA 2006)*.

Arias, T. B. C., Avgeriou, P. & America, P. (2008). Analyzing the actual execution of a large software-intensive system for determining dependencies, *15th IEEE Working Conference on Reverse Engineering (WCRE'08)* pp. 49–58.

Arias, T. B. C., Avgeriou, P. & America, P. (2009). Constructing a resource usage view of a large and complex software-intensive system, *16th IEEE Working Conference on Reverse Engineering (WCRE'09)* pp. 247–255.

Arias, T. B. C., Avgeriou, P., America, P., Blom, K. & Bachynskyyc, S. (2011). A top-down strategy to reverse architecting execution views for a large and complex software-intensive system: An experience report, *Science of Computer Programming* 76(12): 1098–1112.

Arts, T. & Fredlund, L.-A. (2002). Trace analysis of Erlang programs, *ACM SIGPLAN Erlang Workshop (ERLANG'02)*.

Audsley, N. C., Burns, A., Davis, R. I., Tindell, K. W. & Wellings, A. J. (1995). Fixed priority pre-emptive scheduling: An historical perspective, *Real-Time Systems* 8(2–3): 173–198.

Avery, D. (2011). The evolution of flight management systems, *IEEE Software* 28(1): 11–13.

Balakrishnan, G. & Reps, T. (2010). WYSINWYX: What you see is not what you eXecute, *ACM Transactions on Programming Languages and Systems* 32(6): 23:1–23:84.

Balci, O. (1990). Guidelines for Successful Simulation Studies, *Proceedings of the 1990 Winter Simulation Conference*, Department of Computer Science, Virginia Polytechnic Institute and State University, Blacksburg, Virginia 2061-0106, USA.

Bellay, B. & Gall, H. (1997). A comparison of four reverse engineering tools, *4th IEEE Working Conference on Reverse Engineering (WCRE'97)* pp. 2–11.

Bellay, B. & Gall, H. (1998). Reverse engineering to recover and describe a system's architecture, *Development and Evolution of Software Architectures for Product Families, Second International ESPRIT ARES Workshop*, Vol. 1429 of *Lecture Notes in Computer Science*, Springer-Verlag, pp. 115–122.

Bernat, G., Colin, A. & Petters, S. (2002). WCET Analysis of Probabilistic Hard Real-Time Systems, *Proceedings of the 23rd IEEE International Real-Time Systems Symposium (RTSS'02), Austin, TX, USA*.

Bernat, G., Colin, A. & Petters, S. (2003). pWCET: a Tool for Probabilistic Worst Case Execution Time Analysis of Real-Time Systems, *Technical Report YCS353*, University of York, Department of Computer Science, United Kingdom.

Bessey, A., Block, K., Chelfs, B., Chou, A., Fulton, B., Hallem, S., Henri-Gros, C., Kamsky, A., McPeak, S. & Engler, D. (2010). A few billion lines of code later: Using static analysis to find bugs in the real world, *Communications of the ACM* 53(2): 66–75.

Bohlin, M., Lu, Y., Kraft, J., Kreuger, P. & Nolte, T. (2009). Simulation-Based Timing Analysis of Complex Real-Time Systems, *Proceedings of the 15th IEEE International Conference on Embedded and Real-Time Computing Systems and Applications (RTCSA'09)*, pp. 321–328.

Boogerd, C. & Moonen, L. (2009). Evaluating the relation between coding standard violations and faults within and across software versions, *6th Working Conference on Mining Software Repositories (MSR'09)* pp. 41–50.

Bozga, M., Daws, C., Maler, O., Olivero, A., Tripakis, S. & Yovine, S. (1998). Kronos: A Model-Checking Tool for Real-Time Systems, *in* A. J. Hu & M. Y. Vardi (eds), *Proceedings of the 10th International Conference on Computer Aided Verification, Vancouver, Canada*, Vol. 1427, Springer-Verlag, pp. 546–550.

Broy, M. (2006). Challenges in automotive software engineering, *28th ACM/IEEE International Conference on Software Engineering (ICSE'06)* pp. 33–42.

Bruntink, M. (2008). Reengineering idiomatic exception handling in legacy C code, *12th IEEE European Conference on Software Maintenance and Reengineering (CSMR'08)* pp. 133–142.

Bruntink, M., van Deursen, A., D'Hondt, M. & Tourwe, T. (2007). Simple crosscutting concerns are not so simple: analysing variability in large-scale idioms-based implementations, *6th International Conference on Aspect-Oriented Software Development (AOSD'06)* pp. 199–211.

Bull, T. M., Younger, E. J., Bennett, K. H. & Luo, Z. (1995). Bylands: reverse engineering safety-critical systems, *International Conference on Software Maintenance (ICSM'95)* pp. 358–366.


Canfora, G., Cimitile, A. & De Carlini, U. (1993). A reverse engineering process for design level document production from Ada code, *Information and Software Technology* 35(1): 23–34.

Canfora, G., Di Penta, M. & Cerulo, L. (2011). Achievements and challenges in software reverse engineering, *Communications of the ACM* 54(4): 142–151.

Choi, Y. & Jang, H. (2010). Reverse engineering abstract components for model-based development and verification of embedded software, *12th IEEE International Symposium on High-Assurance Systems Engineering (HASE'10)* pp. 122–131.

Clarke, E. M. & Emerson, E. A. (1982). Design and synthesis of synchronization skeletons using branching-time temporal logic, *Logic of Programs, Workshop*, Springer-Verlag, London, UK, pp. 52–71.

Canfora, G. & Di Penta, M. (2007). New frontiers of reverse engineering, *Future of Software Engineering (FOSE'07)* pp. 326–341.

Cornelissen, B., Zaidman, A., van Deursen, A., Moonen, L. & Koschke, R. (2009). A systematic survey of program comprehension through dynamic analysis, *IEEE Transactions on Software Engineering* 35(5): 684–702.

Crnkovic, I., Sentilles, S., Vulgarakis, A. & Chaudron, M. R. V. (2011). A classification framework for software component models, *IEEE Transactions on Software Engineering* 37(5): 593–615.

Cusumano, M. A. (2011). Reflections on the Toyota debacle, *Communications of the ACM* 54(1): 33–35.

David, A. & Yi, W. (2000). Modelling and analysis of a commercial field bus protocol, *Proceedings of 12th Euromicro Conference on Real-Time Systems*, IEEE Computer Society Press, pp. 165–172.

Daws, C. & Yovine, S. (1995). Two examples of verification of multirate timed automata with Kronos, *Proceedings of the 16th IEEE Real-Time Systems Symposium (RTSS'95)*, IEEE Computer Society, Washington, DC, USA, p. 66.

Decotigny, D. & Puaut, I. (2002). ARTISST: An extensible and modular simulation tool for real-time systems, *5th IEEE International Symposium on Object-Oriented Real-Time Distributed Computing (ISORC'02)* pp. 365–372.

Ebert, C. & Jones, C. (2009). Embedded software: Facts, figures and future, *IEEE Computer* 42(4): 42–52.

Ebert, C. & Salecker, J. (2009). Embedded software: technologies and trends, *IEEE Software* 26(3): 14–18.

Eixelsberger, W., Kalan, M., Ogris, M., Beckman, H., Bellay, B. & Gall, H. (1998). Recovery of architectural structure: A case study, *Development and Evolution of Software Architectures for Product Families, Second International ESPRIT ARES Workshop*, Vol. 1429 of *Lecture Notes in Computer Science*, Springer-Verlag, pp. 89–96.

Emerson, E. A. & Halpern, J. Y. (1984). Sometimes and Not Never Revisited: on Branching Versus Linear Time, *Technical report*, University of Texas at Austin, Austin, TX, USA.

Fokin, A., Troshina, K. & Chernov, A. (2010). Reconstruction of class hierarchies for decompilation of C++ programs, *14th IEEE European Conference on Software Maintenance and Reengineering (CSMR'10)* pp. 240–243.

Gherbi, A. & Khendek, F. (2006). UML profiles for real-time systems and their applications, *Journal of Object Technology* 5(4). http://www.jot.fm/issues/issue\_2006\_05/article5.

Glück, P. R. & Holzmann, G. J. (2002). Using SPIN model checking for flight software verification, *IEEE Aerospace Conference (AERO'02)* pp. 1–105–1–113.

Graaf, B., Lormans, M. & Toetenel, H. (2003). Embedded software engineering: The state of the practice, *IEEE Software* 20(6): 61–69.

Hänninen, K., Mäki-Turja, J. & Nolin, M. (2006). Present and future requirements in developing industrial embedded real-time systems – interviews with designers in the vehicle domain, *13th Annual IEEE International Symposium and Workshop on Engineering of Computer Based Systems (ECBS'06)* pp. 139–147.

Havelund, K. (2008). *Runtime Verification of C Programs*, Vol. 5047 of *Lecture Notes in Computer Science*, Springer-Verlag, chapter Testing of Software and Communicating Systems (TestCom/FATES'08), pp. 7–22.

Hind, M. (2001). Pointer analysis: Haven't we solved this problem yet?, *ACM SIGPLAN/SIGSOFT Workshop on Program Analysis for Software Tools and Engineering (PASTE'01)* pp. 54–61.

Holzmann, G. (2003). *The SPIN Model Checker: Primer and Reference Manual*, Addison-Wesley.

Holzmann, G. J. (1997). The Model Checker SPIN, *IEEE Trans. Softw. Eng.* 23(5): 279–295.

Holzmann, G. J. & Smith, M. H. (1999). A practical method for verifying event-driven software, *Proceedings of the 21st international conference on Software engineering (ICSE'99)*, IEEE Computer Society Press, Los Alamitos, CA, USA, pp. 597–607.

Holzmann, G. J. & Smith, M. H. (2001). Software model checking: extracting verification models from source code, *Software Testing, Verification and Reliability* 11(2): 65–79.

Huselius, J. & Andersson, J. (2005). Model synthesis for real-time systems, *9th IEEE European Conference on Software Maintenance and Reengineering (CSMR 2005)*, pp. 52–60.

Huselius, J., Andersson, J., Hansson, H. & Punnekkat, S. (2006). Automatic generation and validation of models of legacy software, *12th IEEE International Conference on Embedded and Real-Time Computing Systems and Applications (RTCSA 2006)*, pp. 342–349.

Jensen, P. K. (1998). Automated Modeling of Real-Time Implementation, *Technical Report BRICS RS-98-51*, University of Aalborg.

Jensen, P. K. (2001). *Reliable Real-Time Applications. And How to Use Tests to Model and Understand*, PhD thesis, Aalborg University.

Kaner, C. (1997). Software liability. http://www.kaner.com/pdfs/theories.pdf.

Katoen, J. (1998). Concepts, algorithms and tools for model checking, lecture notes of the course Mechanised Validation of Parallel Systems, Friedrich-Alexander University at Erlangen-Nurnberg.

Kettu, T., Kruse, E., Larsson, M. & Mustapic, G. (2008). *Architecting Dependable Systems V*, Vol. 5135 of *Lecture Notes in Computer Science*, Springer-Verlag, chapter Using Architecture Analysis to Evolve Complex Industrial Systems, pp. 326–341.

Kienle, H. M., Kraft, J. & Nolte, T. (2010). System-specific static code analyses for complex embedded systems, *4th International Workshop on Software Quality and Maintainability (SQM 2010), satellite event of the 14th European Conference on Software Maintenance and Reengineering (CSMR 2010)*. http://holgerkienle.wikispaces.com/file/view/KKN-SQM-10.pdf.

Kienle, H. M., Kraft, J. & Nolte, T. (2011). System-specific static code analyses: A case study in the complex embedded systems domain, *Software Quality Journal*. Forthcoming, http://dx.doi.org/10.1007/s11219-011-9138-7.

Kienle, H. M. & Müller, H. A. (2010). The tools perspective on software reverse engineering: Requirements, construction and evaluation, *Advances in Computers* 79: 189–290.


Knor, R., Trausmuth, G. & Weidl, J. (1998). Reengineering C/C++ source code by transforming state machines, *Development and Evolution of Software Architectures for Product Families, Second International ESPRIT ARES Workshop*, Vol. 1429 of *Lecture Notes in Computer Science*, Springer-Verlag, pp. 97–105.

Koschke, R. (2009). Architecture reconstruction: Tutorial on reverse engineering to the architectural level, *in* A. De Lucia & F. Ferrucci (eds), *ISSSE 2006–2008*, Vol. 5413 of *Lecture Notes in Computer Science*, Springer-Verlag, pp. 140–173.

Kraft, J. (2009). RTSSim – a simulation framework for complex embedded systems, *Technical Report*, Mälardalen University. http://www.mrtc.mdh.se/index.php?choice=publications&id=1629.

Kraft, J. (2010). *Enabling Timing Analysis of Complex Embedded Systems*, PhD thesis no. 84, Mälardalen University, Sweden. http://mdh.diva-portal.org/smash/get/diva2:312516/FULLTEXT01.

Kraft, J., Kienle, H. M., Nolte, T., Crnkovic, I. & Hansson, H. (2011). Software maintenance research in the PROGRESS project for predictable embedded software systems, *15th IEEE European Conference on Software Maintenance and Reengineering (CSMR 2011)* pp. 335–338.

Kraft, J., Lu, Y., Norström, C. & Wall, A. (2008). A Metaheuristic Approach for Best Effort Timing Analysis targeting Complex Legacy Real-Time Systems, *Proceedings of the IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS'08)*.

Kraft, J., Wall, A. & Kienle, H. (2010). *1st International Conference on Runtime Verification (RV 2010)*, Vol. 6418 of *Lecture Notes in Computer Science*, Springer-Verlag, chapter Trace Recording for Embedded Systems: Lessons Learned from Five Industrial Projects, pp. 315–329.

Law, A. M. & Kelton, W. D. (1993). *Simulation, Modeling and Analysis*, ISBN: 0-07-116537-1, McGraw-Hill.

Lewis, B. & McConnell, D. J. (1996). Reengineering real-time embedded software onto a parallel processing platform, *3rd IEEE Working Conference on Reverse Engineering (WCRE'96)* pp. 11–19.

Liggesmeyer, P. & Trapp, M. (2009). Trends in embedded software engineering, *IEEE Software* 26(3): 19–25.

Lv, M., Guan, N., Zhang, Y., Deng, Q., Yu, G. & Zhang, J. (2009). A survey of WCET analysis of real-time operating systems, *2009 IEEE International Conference on Embedded Software and Systems* pp. 65–72.

Marburger, A. & Herzberg, D. (2001). E-CARES research project: Understanding complex legacy telecommunication systems, *5th IEEE European Conference on Software Maintenance and Reengineering (CSMR'01)* pp. 139–147.

Marburger, A. & Westfechtel, B. (2003). Tools for understanding the behavior of telecommunication systems, *25th International Conference on Software Engineering (ICSE'03)* pp. 430–441.

Marburger, A. & Westfechtel, B. (2010). Graph-based structural analysis for telecommunication systems, *Graph transformations and model-driven engineering*, Vol. 5765 of *Lecture Notes in Computer Science*, Springer-Verlag, pp. 363–392.

McDowell, C. E. & Helmbold, D. P. (1989). Debugging concurrent programs, *ACM Computing Surveys* 21(4): 593–622.

Müller, H. A. & Kienle, H. M. (2010). *Encyclopedia of Software Engineering*, Taylor & Francis, chapter Reverse Engineering, pp. 1016–1030. http://www.tandfonline.com/doi/abs/10.1081/E-ESE-120044308.

Müller, H., Jahnke, J., Smith, D., Storey, M., Tilley, S. & Wong, K. (2000). Reverse engineering: A roadmap, *Conference on The Future of Software Engineering* pp. 49–60.

Obbink, H., Clements, P. C. & van der Linden, F. (1998). Introduction, *Development and Evolution of Software Architectures for Product Families, Second International ESPRIT ARES Workshop*, Vol. 1429 of *Lecture Notes in Computer Science*, Springer-Verlag, pp. 1–3.

Ornburn, S. B. & Rugaber, S. (1992). Reverse engineering: resolving conflicts between expected and actual software designs, *8th IEEE International Conference on Software Maintenance (ICSM'92)* pp. 32–40.

Palsberg, J. & Wallace, M. (2002). Reverse engineering of real-time assembly code. http://www.cs.ucla.edu/~palsberg/draft/palsberg-wallace02.pdf.

Parkinson, P. J. (n.d.). The challenges and advances in COTS software for avionics systems. http://blogs.windriver.com/parkinson/files/IET\_COTSaviation\_PAUL\_PARKINSON\_paper.pdf.

Pnueli, A. (1977). The temporal logic of programs, *18th Annual IEEE Symposium on Foundations of Computer Science (FOCS'77)*, pp. 46–57.

Pollet, D., Ducasse, S., Poyet, L., Alloui, I., Cimpan, S. & Verjus, H. (2007). Towards a process-oriented software architecture reconstruction taxonomy, *11th IEEE European Conference on Software Maintenance and Reengineering (CSMR'07)* pp. 137–148.

Quante, J. & Begel, A. (2011). ICPC 2011 industrial challenge. http://icpc2011.cs.usask.ca/conf\_site/IndustrialTrack.html.

Riva, C. (2000). Reverse architecting: an industrial experience report, *7th IEEE Working Conference on Reverse Engineering (WCRE'00)* pp. 42–50.

Riva, C., Selonen, P., Systä, T. & Xu, J. (2009). A profile-based approach for maintaining software architecture: an industrial experience report, *Journal of Software Maintenance and Evolution: Research and Practice* 23(1): 3–20.

RTCA (1992). Software considerations in airborne systems and equipment certification, *Standard RTCA/DO-178B*, RTCA.

Russell, J. T. & Jacome, M. F. (2009). Program slicing across the hardware-software boundary for embedded systems, *International Journal of Embedded Systems* 4(1): 66–82.

Samii, S., Rafiliu, S., Eles, P. & Peng, Z. (2008). A Simulation Methodology for Worst-Case Response Time Estimation of Distributed Real-Time Systems, *Proceedings of Design, Automation, and Test in Europe (DATE'08)*, pp. 556–561.

Schlesinger, S., Crosbie, R. E., Gagne, R. E., Innis, G. S., Lalwani, C. S. & Loch, J. (1979). Terminology for Model Credibility, *Simulation* 32(3): 103–104.

Shahbaz, M. & Eschbach, R. (2010). Reverse engineering ECUs of automotive components, *First International Workshop on Model Inference In Testing (MIIT'10)* pp. 21–22.

Sivagurunathan, Y., Harman, M. & Danicic, S. (1997). Slicing, I/O and the implicit state, *3rd International Workshop on Automatic Debugging (AADEBUG'97)* pp. 59–67. http://www.ep.liu.se/ea/cis/1997/009/06/.

Stoermer, C., O'Brien, L. & Verhoef, C. (2003). Moving towards quality attribute driven software architecture reconstruction, *10th IEEE Working Conference on Reverse Engineering (WCRE'03)* pp. 46–56.


**2**

**GUIsurfer: A Reverse Engineering Framework for User Interface Software**

José Creissac Campos<sup>1</sup>, João Saraiva<sup>1</sup>, Carlos Silva<sup>1</sup> and João Carlos Silva<sup>2</sup>

<sup>1</sup>*Departamento de Informática, Universidade do Minho*

<sup>2</sup>*Escola Superior de Tecnologia, Instituto Politécnico do Cávado e do Ave*

*Portugal*

#### **1. Introduction**

In the context of developing tool support for the automated analysis of interactive systems implementations, this chapter aims to investigate the applicability of reverse engineering approaches to the derivation of user interface behavioural models. The ultimate goal is that these models might be used to reason about the quality of the system, both from a usability and an implementation perspective, as well as being used to help systems' maintenance, evolution and redesign.

#### **1.1 Motivation**

Developers of interactive systems are faced with a fast changing technological landscape, where a growing multitude of technologies (consider, for example, the case of web applications) can be used to develop user interfaces for a multitude of form factors, using a growing number of input/output techniques. Additionally, they have to take into consideration non-functional requirements such as the usability and the maintainability of the system. This means considering the quality of the system both from the user's (i.e. external) perspective, and from the implementation's (i.e. internal) perspective. A system that is poorly designed from a usability perspective will most probably fail to be accepted by its end users. A poorly implemented system will be hard to maintain and evolve, and might fail to fulfill all intended requirements. Furthermore, when subsystems are subcontracted, the problem arises of how to guarantee the quality of the implemented system during acceptance testing. The generation of user interface models from source code has the potential to mitigate these problems. The analysis of these models enables some degree of reasoning about the usability of the system, reducing the need to resort to costly user testing (cf. (Dix et al., 2003)), and can support acceptance testing processes. Moreover, the manipulation of the models supports the evolution, redesign and comparison of systems.

#### **1.2 Objectives**

Human-Computer interaction is an important and evolving area. Therefore, it is very important to reason about GUIs. In several situations (for example the mobile industry) it is the quality of the GUI that influences the adoption of certain software.

In order for a user interface to have good usability characteristics it must both be adequately designed and adequately implemented. Tools are currently available to developers that allow