**Reverse Engineering Platform Independent Models from Business Software Applications**

Rama Akkiraju1, Tilak Mitra2 and Usha Thulasiram2 *1IBM T. J. Watson Research Center 2IBM Global Business Services USA* 

#### **1. Introduction**

82 Reverse Engineering – Recent Advances and Applications


There are many reasons to reverse engineer a software application: to understand the design of the system so that it can be improved in future iterations; to communicate the design to others when prior documentation is lost, out of date, or never existed; to understand a competitor's product in order to replicate its design; to examine the details for possible patent infringements; or to derive a meta-model that can then be used to translate the business application onto other platforms. Whatever the reason, reverse engineering business applications is a tedious and complex technical activity. It is not about analyzing code alone. It requires analysis of several aspects of a business application: the platform on which the software runs, the underlying platform features that the software leverages, the interactions of the software system with external applications, and the libraries and components of the programming language and of the application development platforms that the business application uses. We argue that this context in which a business application runs is critical to analyzing and understanding it, whatever the end use of the analysis. Much of the prior work on reverse engineering in the software engineering field has focused on code analysis. The literature has paid little attention to understanding the context in which a business application runs from perspectives such as those mentioned above. Our work addresses this specific aspect of reverse engineering business applications.

Modern business applications are seldom developed from scratch. They are often built on higher-level building blocks such as programming language platforms, for example J2EE for the Java programming language and .NET for the C# programming language. In addition, most companies use even higher-level application development platforms offered by vendors, such as IBM's WebSphere and Rational products [18][19], SAP's NetWeaver [20], and Oracle's Enterprise 2.0 software development platforms for Java J2EE application development [21], as well as Microsoft's .NET platform for the C# programming language [22]. These platforms offer many built-in capabilities such as web application load balancing, resource pooling, multi-threading, and support for architectural patterns such as service-oriented architecture (SOA). All of these are part of the context in which a business application operates. Understanding this environment is crucial to reverse engineering any software, since the environment significantly influences how code gets written and managed. Reverse engineering models from business applications written on platforms that support higher-level programming idioms (such as the ones noted above) is a difficult problem. If the applications involve several legacy systems, reverse engineering becomes difficult because of the sheer heterogeneity of the systems. The nuances of each system may make reverse engineering difficult even if the code is built in the same programming language (e.g., Java) using the same standards (such as J2EE) on a given platform.

To understand automated reverse engineering, we must first understand model driven development/architecture [2] [3] and the transformation framework. Model driven development and code generation from models (also known as *forward engineering*) have been discussed in the literature. In a model driven development approach, we are given two meta-models, a source meta-model and a target meta-model, together with transformation rules that map the source meta-model onto the target meta-model. Any platform independent model (PIM) that adheres to the source meta-model can then be translated into a platform specific model (PSM) that adheres to the target meta-model. The resulting PSM can in turn be translated into various implementation artifacts on the target platform. This is called *forward engineering*. By reversing this approach, platform independent models can be extracted from platform specific models and implementation artifacts. Extraction of models from the existing artifacts of a business application is termed *reverse engineering*. Figure 1 shows the forward engineering transformation approach, and Figure 2 shows the reverse engineering transformation approach. The gears in the figures represent software transformations that automatically translate the artifacts on the left of each arrow into the artifacts on its right.

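Fig. 1. Model driven transformation approach in forward engineering. A model-to-model transformation turns the platform independent model into a platform specific model, and a model-to-code transformation turns that model into implementation artifacts (code, schema).

Fig. 2. Model driven transformation approach in reverse engineering. A code-to-model transformation turns implementation artifacts (code, schema) into a platform specific model, and a model-to-model transformation turns that model into a platform independent model.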
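The forward engineering chain described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the transformation framework used in this work: the `Element` class, the rule table `RULES`, the element kinds (`Service`, `Operation`, `Interface`, `Method`) and the naming conventions are all hypothetical. The sketch applies one rule per source element kind to obtain a platform specific model, and then renders that model as Java-flavoured text.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Element:
    """A model element: a kind taken from some meta-model, plus a name."""
    kind: str
    name: str
    children: List["Element"] = field(default_factory=list)


# A transformation rule maps one source element to one target element.
Rule = Callable[[Element], Element]

# Hypothetical rules from a service-level source meta-model to a
# Java-flavoured target meta-model (names chosen for illustration only).
RULES: Dict[str, Rule] = {
    "Service": lambda e: Element("Interface", e.name + "Service"),
    "Operation": lambda e: Element("Method", e.name[0].lower() + e.name[1:]),
}


def model_to_model(pim: Element) -> Element:
    """Translate a PIM element and its children into PSM elements."""
    psm = RULES[pim.kind](pim)  # KeyError if no rule covers this kind
    psm.children = [model_to_model(child) for child in pim.children]
    return psm


def model_to_code(psm: Element, indent: int = 0) -> str:
    """Render a PSM element as source text on the assumed target platform."""
    pad = "    " * indent
    if psm.kind == "Interface":
        body = "\n".join(model_to_code(c, indent + 1) for c in psm.children)
        return f"{pad}public interface {psm.name} {{\n{body}\n{pad}}}"
    if psm.kind == "Method":
        return f"{pad}void {psm.name}();"
    raise ValueError(f"no code template for kind {psm.kind!r}")


# Example PIM: one service with two operations.
pim = Element("Service", "Order", [
    Element("Operation", "SubmitOrder"),
    Element("Operation", "CancelOrder"),
])
print(model_to_code(model_to_model(pim)))
```

Reverse engineering would run the same kind of rules in the opposite direction: a code-to-model step that recovers `Interface` and `Method` elements, followed by a model-to-model step back to `Service` and `Operation`.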

Prior art [1] [5] [7] [10] [11] [12] and features in vendor tools such as IBM Rational Software Architect (RSA) offer transformation methods and tools for extracting models, though with several gaps. Most of the reverse engineering work has focused on extracting structural models (e.g., class models) from implementation artifacts [15] [16] [17]. For example, when a UML model is derived from Java code, reverse engineering techniques have concentrated on structural elements such as classes, their data members, and interfaces. Although this approach works to a degree, it does not provide a level of abstraction high enough to interpret the software application at a semantic level. These low-level design artifacts lack semantic context and are hard to reuse. For example, in a service-oriented architecture, modular reusable abstraction is defined at the level of services rather than classes. This distinction is important because abstraction at the level of services enables one to link the business functions offered by services with business objectives. The reusability of reverse-engineered models under the current state of the art is limited by the lack of proper linkages to higher-level business objectives.
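To make the limitation of purely structural extraction concrete, the following sketch recovers a class-level model from source code. It is an analogy under stated assumptions: the chapter discusses Java, whereas this example parses a small Python snippet with the standard `ast` module, and the `SOURCE` text and the function name `extract_class_model` are hypothetical.

```python
import ast
from typing import Dict, List

# Hypothetical implementation artifact standing in for application code.
SOURCE = '''
class OrderService:
    def submit_order(self, order):
        self.last_order = order

    def cancel_order(self, order_id):
        pass
'''


def extract_class_model(source: str) -> Dict[str, Dict[str, List[str]]]:
    """Return, for each top-level class, its method and attribute names."""
    model: Dict[str, Dict[str, List[str]]] = {}
    for node in ast.parse(source).body:
        if not isinstance(node, ast.ClassDef):
            continue
        # Methods are the functions defined directly in the class body.
        methods = [n.name for n in node.body if isinstance(n, ast.FunctionDef)]
        # Data members are names assigned through `self.<name> = ...`.
        attributes = sorted({
            target.attr
            for n in ast.walk(node)
            if isinstance(n, ast.Assign)
            for target in n.targets
            if isinstance(target, ast.Attribute)
            and isinstance(target.value, ast.Name)
            and target.value.id == "self"
        })
        model[node.name] = {"methods": methods, "attributes": attributes}
    return model


print(extract_class_model(SOURCE))
```

The extracted model lists classes, methods, and data members, but nothing in it says which business function `OrderService` supports or which business objective it serves. That missing link is the gap discussed above.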

In this chapter, we present a method for extracting a platform independent model at *appropriate* levels of abstraction from a business application. The main motivation for reverse engineering in our work is to port a business application developed on one software development platform to a different one. We do this by reverse engineering the design models (we refer to them as platform independent models) from an application developed on one software development platform, and then applying forward engineering to translate those platform independent models into platform specific models on the target platform. Reverse engineering plays an important role in this porting. While the focus of this book is on reverse engineering, we feel that it is important to give reverse engineering a context. Our work therefore presents reverse engineering mainly from the point of view of the need to port business applications from one platform to another. In the context of our work, a 'platform' refers to a J2EE application development platform such as those offered by IBM, SAP, and Oracle. We present a service-oriented approach to deriving platform independent models from platform specific implementations. We experimentally verify that by focusing on the service-level components of software design, one can simplify the model extraction problem significantly while still achieving up to 40%-50% model reusability.

The chapter is organized as follows. First, we present our motivation for reverse engineering. Then, we present our approach to reverse engineering followed by the results of our experiment in which we reverse engineer design models from the implementation artifacts of a business application developed and deployed on a specific software development platform.

#### **2. Our motivation for reverse engineering: Cross-platform porting of software solutions**

If a software solution is being designed for the first time, our objective is to formally model that solution and to generate as much of the implementation code as possible from the model, on as many software platforms as possible. This serves our motivation of enabling IT services companies to support software solution development on multiple platforms. Where a software solution already exists on a platform, our objective is to reuse as much of that solution as possible in making it available on multiple platforms. To investigate this cross-platform portability, we have selected two development platforms, namely IBM's WebSphere platform consisting of WebSphere Business Services Fabric [19] and SAP's NetWeaver Developer Studio [20].

One way to achieve cross-platform portability of software solutions is by reusing code. Much has been said about code reuse, but its promise is often hard to realize, because code built on one platform may or may not be easily translated to another. If the programming language requirements differ between platforms, or if the applications to be developed involve integrating with several custom legacy systems, code reuse is difficult to achieve because of the sheer heterogeneity. The nuances of each platform may make code reuse difficult even if the code is built in the same programming language (e.g., Java) using the same standards (such as J2EE) on the source platform as is expected on the target platform. There is a tacit acknowledgement among practitioners that model reuse is more practical than code reuse. Platform independent models (PIMs) of a given set of business solutions, whether developed manually or extracted through automated tools from existing solutions, can provide a valuable starting point for reuse. A platform independent model of a business application is a key asset for any company planning future enhancements to its business processes, because it gives the company a formal description of what exists. The PIM is also a key asset for IT consulting companies that intend to develop pre-built solutions. The following technical question is at the heart of our work: *What aspects of the models are most reusable for cross-platform portability?* While we may not be able to generalize the results from our effort on two platforms, we believe that our study still gives valuable insights and lessons that can be used for further exploration.

In the remaining portion of this section, we present our approach to cross-platform porting of software solutions.

[...] than classes. This distinction is important because abstraction at the level of services enables one to link the business functions offered by services with business objectives and performance indicators. Establishing and retaining linkages between model elements and their respective business objectives can play a significant role in model reuse. This linkage can serve as the starting point in one's search for reusable models. A service exposes its interface signature, message exchanges, and any associated metadata, and is often more coarse-grained than a typical class in an object-oriented paradigm. This notion of working with services rather than classes enables us to think of a business application as a composition of services. We believe that this higher-level abstraction is useful when deciding which model elements need to be transformed onto the target platforms and how to leverage existing assets in a client environment. It eliminates from our consideration set the lower-level classes that are part of the detailed design. For code generation purposes, we leverage transformations that can transform a high-level design into a low-level design and code. For reverse engineering purposes, we focus only on deriving higher-level service element designs in addition to the class models. This provides the semantic context required to interpret the derived models.

2. *We define the vocabulary to express the user experience modeling elements using the 'service' level abstractions*. Several best practice models have been suggested for user experience modeling, but no specific profile is readily available for expressing platform independent models. In this work, we have created a profile that defines the language for expressing user experience modeling elements. It includes stereotypes for information elements and layout elements. Information elements include screen, input form, and action elements that invoke services on the server side (called service actions) and those that invoke services locally on the client (non-service actions). Layout elements include text, table, and chart elements.

Fig. 3. Platform independent modeling elements: Our point-of-view. The figure lists UML 2.0 constructs: structural models (class, object, package, component diagrams), behavioral models (use-case, activity, state machine diagrams), and interaction models (sequence, interaction, communication diagrams). It also lists a Service Profile (Process, Functional Component, Service Component, Technical Component, Business Entity, Service Specification, Operation, Message) and a User Experience Profile with information elements (Screen, Input Form, Service Action, Non-Service Action) and layout elements (Text View, Table View, Chart View).
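The service-level and user experience vocabulary used in this approach can be sketched as plain data types. This is a minimal sketch under stated assumptions rather than the UML profiles themselves: the class and field names (`Service`, `Operation`, `Message`, `Action`, `Screen`, `business_objectives`, and so on) and the sample objective string are hypothetical, chosen to mirror the elements named in the text.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


@dataclass
class Message:
    name: str


@dataclass
class Operation:
    """Part of a service's interface signature, with its message exchange."""
    name: str
    request: Message
    response: Message


@dataclass
class Service:
    name: str
    operations: List[Operation] = field(default_factory=list)
    # Linkage from the model element to the objectives it supports.
    business_objectives: List[str] = field(default_factory=list)


class LayoutKind(Enum):
    TEXT = "text"
    TABLE = "table"
    CHART = "chart"


@dataclass
class LayoutElement:
    kind: LayoutKind
    name: str


@dataclass
class Action:
    """An information element; `invokes` is None for a non-service action."""
    name: str
    invokes: Optional[Operation] = None

    @property
    def is_service_action(self) -> bool:
        return self.invokes is not None


@dataclass
class Screen:
    name: str
    input_forms: List[str] = field(default_factory=list)
    actions: List[Action] = field(default_factory=list)
    layout: List[LayoutElement] = field(default_factory=list)


def services_for_objective(services: List[Service], objective: str) -> List[str]:
    """Use the objective linkage as the starting point of a reuse search."""
    return [s.name for s in services if objective in s.business_objectives]


# Example: one service, and one screen with a service and a non-service action.
submit = Operation("submitOrder", Message("OrderRequest"), Message("OrderReceipt"))
orders = Service("OrderService", [submit], ["reduce order cycle time"])
screen = Screen(
    "OrderEntry",
    input_forms=["OrderForm"],
    actions=[Action("Submit", invokes=submit), Action("ClearForm")],
    layout=[LayoutElement(LayoutKind.TABLE, "OrderLines")],
)
print(services_for_objective([orders], "reduce order cycle time"))
print([a.name for a in screen.actions if a.is_service_action])
```

Keeping the objectives on the service element, rather than on individual classes, is what lets a search for reusable models start from a business objective.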
