**2. The platform for research collaborative computing**

Service‐oriented computing is based on software services, which are platform‐independent, autonomous computational elements. Services can be specified, published, discovered, and composed with other services using standard protocols. Such a composition of services can be treated as a widely distributed software system. Many languages for software service composition have been developed [5]. The goal of such languages is to provide a formal way to specify the connections and the coordination logic between services.

To support the design, development, and execution of distributed applications in the Internet environment, we have developed an end‐user development framework called the platform for research collaborative computing (PRCC). PRCC is an emerging interdisciplinary field that embraces physical sciences such as chemistry, physics, biology, environmental sciences, hydrometeorology, and engineering, and even the arts and humanities. All these fields demand powerful tools for mathematical modeling and for supporting collaborative computing research. These tools should implement the idea of a virtual organization, including the ability to combine distributed workflows, sequences of data processing functions, and so on. The platform for research collaborative computing answers this demand. PRCC has the potential to benefit research in all disciplines at all stages of research. A well‐constructed service‐oriented computing (SOC) environment can empower a research environment with a flexible infrastructure and processing environment by provisioning independent, reusable, automated simulation processes (as services) and providing a robust foundation for leveraging these services.

Conceptually, PRCC is a 24/7‐available online intelligent multidisciplinary gateway for researchers that supports the following main user activities: login, new project creation, workflow creation, provision of input data such as the computational task description and constraints, specification of additional parameters, workflow execution, and collection of data for further analysis.

*User authorization* is performed at two levels: for virtual workplace access (login and password) and for grid/cloud resource access (grid certificate).

*Application creation*: Each user can create several projects using services stored in the repository. Each application consists of a set of files containing information about the computing workflow, the solved tasks, execution results, and so on.

*Solved task description* can be given either in the problem‐oriented languages of the respective services or with the graphic editor.

*Constructing a computational route* consists of choosing the required computing services and connecting them in the required execution order. The workflow editor checks the compatibility of the numerical procedures to be connected.
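The compatibility check the workflow editor performs can be sketched as follows. This is a minimal illustration, not the platform's actual implementation; the service names and data-type labels are assumptions made for the example.

```python
# Sketch of a workflow editor's compatibility check: a connection between
# two services is valid only if the producer emits a data type the
# consumer accepts. Service names and type labels are illustrative.

class Service:
    def __init__(self, name, input_types, output_types):
        self.name = name
        self.input_types = set(input_types)
        self.output_types = set(output_types)

def compatible(producer, consumer):
    """True if the producer emits at least one type the consumer accepts."""
    return bool(producer.output_types & consumer.input_types)

mesher = Service("mesh_generator", {"geometry"}, {"mesh"})
solver = Service("fem_solver", {"mesh"}, {"field_solution"})
plotter = Service("plotter", {"field_solution"}, {"plot"})

route = [mesher, solver, plotter]
for prev, nxt in zip(route, route[1:]):
    assert compatible(prev, nxt), f"{prev.name} -> {nxt.name} incompatible"
```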

Parameters for different computational tasks are provided by means of the respective Web‐interface elements or set by default (except for control parameters such as the desired time‐response value, border frequencies for frequency analysis, and so on). The user may also be required to specify the type and parameters of the output information (arguments of output characteristics, the scale type used for plot construction, and others).

*Launch for execution* initiates a procedure that generates the application description in the internal format and transfers it to the task execution system. The Web and grid service orchestrator is responsible for automatically executing the route composed of remote service invocations. Grid/cloud services invoked by the orchestrator during execution are responsible for preparing the input data for a grid/cloud task, launching it, inquiring about the execution state, unloading the grid/cloud task results, and transferring them to the orchestrator.
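The orchestrator's basic control loop can be sketched as below. The `invoke` stub stands in for a real remote Web/grid service call; all names here are hypothetical and only illustrate the pattern of chained remote invocations.

```python
# Minimal sketch of an orchestrator driving a route of remote service
# invocations, feeding each service's result to the next step.

def invoke(service_name, payload):
    # In the real platform this would be a remote call to a grid/cloud
    # service; here we merely tag the payload to make the chain visible.
    return {"service": service_name, "result": f"processed:{payload}"}

def execute_route(route, input_data):
    """Invoke each service in order, passing results down the route."""
    data = input_data
    for step in route:
        response = invoke(step, data)
        data = response["result"]
    return data

final = execute_route(["prepare_input", "run_solver", "collect_results"],
                      "task-42")
```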

*Execution results* consist of a set of files containing information on the computations performed (according to the parameters set by the user), including plots and histograms, logs of the errors that stopped separate branches of the route, and ancillary data on the grid/cloud resources used and on grid/cloud task execution. Based on the analysis of the received results, a user can decide to repeat the computational workflow execution with changed workflow fragments, input data, or parameters of the computing procedures.

To manage service‐oriented applications, more details are needed about the services, their providers, and their customers. There are two roles in the development process: service provider and application builder. This separation of concerns lets application architects concentrate on the business logic (in this case, research), while technical details are left to service providers. A comprehensive repository of various services makes it possible to use services for the personal or institutional requirements of scientific users by incorporating existing services into a widely distributed system (**Figure 1**).

Services can be clustered into two main groups: application supporting services (with subgroups: data processing services, and modeling and simulation services) and environment supporting (generic) services (with subgroups: cloud hosting for the provision of computational, network, and software resources; application/service ecosystems and delivery framework; security; a workflow engine for calculation purposes; and digital science services).

As far as the authors know, there are no similar user‐oriented platforms supporting experiments in mathematics and the applied sciences. PRCC unveils a new methodology for planning and modeling mathematical experiments. It can improve the future competitiveness of science by strengthening its scientific and technological base in the area of experimentation and data processing, making public service infrastructures and simulation processes smarter, i.e., more intelligent, more efficient, more adaptive, and more sustainable.

#### **2.1. Possible content of services' repository**

72 Recent Progress in Parallel and Distributed Computing


Providing the ability to store ever‐increasing amounts of data, making those data available for sharing, and giving scientists and engineers efficient means of data processing are pressing problems today. In the PRCC, this problem is solved by using the service repository described here. From the beginning, the repository includes application supporting services (AS) for the typical scheme of a computational modeling experiment considered above.

**Figure 1.** General structure of PRCC.

Web services can contain program code implementing concrete tasks of mathematical modeling and data processing, and can also present the results of calculations in grid/cloud e‐infrastructures. They provide procedures for solving mathematical model equations depending on their type (differential, nonlinear algebraic, and linear) and on the selected science and engineering analysis. Software services are the main building blocks for the following functionality: data preprocessing and results postprocessing; mathematical modeling; DC, AC, TR, STA, and FOUR analyses; sensitivity analysis; optimization; statistical analysis and yield maximization; tolerance assignment; data mining; and so on. A more detailed description of the typical scheme of a computational modeling experiment, which has an invariant character across many fields of science and technology, is given in [3, 10]. The offered list of calculation types covers a considerable part of the possible needs in the computational solution of applied scientific research tasks in many fields of science and technology.

Services are registered in the UDDI (Universal Description, Discovery, and Integration) network service, which facilitates access to them from different clients. The needed functionality is exposed via the Web service interface. Each Web service is capable of launching computations, starting and canceling jobs, monitoring their status, retrieving the results, and so on.
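The publish-and-discover pattern that a UDDI-style registry enables can be sketched as follows. The in-memory dictionary stands in for the actual registry, and all service names, endpoints, and capability labels are assumptions for illustration.

```python
# Sketch of service publication and discovery against a UDDI-style
# registry, modeled here as a plain dictionary.

registry = {}

def publish(name, endpoint, capabilities):
    """Register a service under its name with an endpoint and capabilities."""
    registry[name] = {"endpoint": endpoint, "capabilities": set(capabilities)}

def discover(required_capability):
    """Return endpoints of all services advertising the capability."""
    return [entry["endpoint"] for entry in registry.values()
            if required_capability in entry["capabilities"]]

publish("ode_solver", "https://example.org/ode", {"solve", "monitor", "cancel"})
publish("plot_service", "https://example.org/plot", {"plot"})

solver_endpoints = discover("solve")
```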

Besides modeling tasks, there are other types of computational experiments in which distributed Web service technologies for scientific data analysis can be used. User scenarios can include curve fitting and approximation procedures for estimating the relationships among variables; classification techniques for categorizing data into various folders; clustering techniques for grouping a set of objects so that objects in the same group (cluster) are more similar to each other than to those in other groups; pattern recognition utilities; image processing; and filtering and optimization techniques.
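The simplest of the listed procedures, fitting a straight line by least squares, can be sketched in pure Python as below. A real curve-fitting service would wrap richer numerical libraries; this only illustrates the kind of function such a service exposes.

```python
# Least-squares fit of y = a*x + b, the simplest curve-fitting procedure
# a data-analysis service might offer.

def linear_fit(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx             # slope
    b = mean_y - a * mean_x   # intercept
    return a, b

# Points lying exactly on y = 2x + 1 recover slope 2 and intercept 1.
a, b = linear_fit([0, 1, 2, 3], [1, 3, 5, 7])
```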

The above computational Web services for data processing are used in different branches of science and technology during data collection, data management, data analytics, and data visualization wherever there are very large data sets: Earth observation data from satellites; data in meteorology, oceanography, and hydrology; experimental data in high‐energy physics; observational data in astrophysics; seismograms; earthquake monitoring data; and so on.

Services may be offered by different enterprises and communicate over the PRCC, thereby providing a distributed computing infrastructure for both intra‐ and cross‐enterprise application integration and collaboration. For semantic service discovery in the repository, a set of ontologies was developed, including a *resource ontology* (hardware and software grid and cloud resources used for workflow execution), a *data ontology* (for annotating large data files and databases), and a *workflow ontology* (for annotating past workflows and enabling their future reuse). The ontologies will be separated into two levels: generic ontologies and domain‐specific ontologies. Services will be annotated in terms of their functional aspects such as IOPE (inputs, outputs, preconditions, and effects), internal states (an activity could be executed in a loop and will keep track of its internal state), data transformation (e.g., unit or format conversion between input and output), and internal processes (which can describe in detail how to interact with a service, e.g., a service that takes partial sets of data on each call and performs some operation on the full set after the last call).
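How an IOPE-style annotation can support semantic matching is sketched below. The annotation fields mirror the functional aspects listed above; the concrete service name and labels are hypothetical.

```python
# Sketch of a service annotation in IOPE terms (inputs, outputs,
# preconditions, effects) and a naive semantic match against it.
# All field values are illustrative.

annotation = {
    "service": "fourier_analysis",
    "inputs": ["time_series"],
    "outputs": ["spectrum"],
    "preconditions": ["uniform_sampling"],
    "effects": ["spectrum_stored_in_workspace"],
}

def matches(ann, needed_output, available_inputs):
    """A service matches if it produces the needed output and all of
    its declared inputs are available."""
    return (needed_output in ann["outputs"]
            and all(i in available_inputs for i in ann["inputs"]))

ok = matches(annotation, "spectrum", {"time_series", "metadata"})
```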

#### **2.2. Management of Web services**


The service‐oriented paradigm implies automated composition and orchestration of software services using workflows. Each workflow defines how tasks should be orchestrated, which components are involved, and in what sequence they execute. The workflow also includes the details of synchronization and data flows. Workflow management may be based on the standard Web‐service orchestration description language WS‐BPEL 2.0 (Business Process Execution Language). The initial XML‐based description of the abstract workflow containing the task description parameters (prepared by the user via the editor) is transformed into a WS‐BPEL 2.0 description. Then, the orchestration engine invokes the Web services, passing this task description to them for execution.
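As an illustration, a minimal WS‐BPEL 2.0 skeleton for a one‐step route might look as follows. This is a sketch only: the `partnerLinks` and `variables` declarations are omitted for brevity, and all partner link, variable, and operation names are hypothetical.

```xml
<process name="SimpleRoute"
         targetNamespace="http://example.org/prcc"
         xmlns="http://docs.oasis-open.org/wsbpel/2.0/process/executable">
  <sequence>
    <!-- accept the task description from the client and start an instance -->
    <receive partnerLink="client" operation="startRoute"
             variable="taskDescription" createInstance="yes"/>
    <!-- invoke one remote computing service with the task description -->
    <invoke partnerLink="solverService" operation="solve"
            inputVariable="taskDescription" outputVariable="solution"/>
    <!-- return the result to the client -->
    <reply partnerLink="client" operation="startRoute" variable="solution"/>
  </sequence>
</process>
```

A real route would contain one `invoke` per workflow step, with `assign` activities moving data between variables.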

The workflow management engine provides seamless and transparent execution of concrete workflows generated by the composition service. This engine leverages existing solutions to allow execution of user‐defined workflows on the right combination of resources and services available through clusters, grids, clouds, or Web services. Furthermore, the project plans to develop new scheduling strategies for workflow execution that take into account multicriteria expressions defined by the user as a set of preferences and requirements. In this way, workflow execution could be directed, for instance, to minimize execution time, to reduce total cost, or to balance a combination of both.
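A multicriteria scheduling decision of this kind can be sketched as a weighted score over candidate resources, with the weights expressing the user's preferences. The resource names, estimates, and weights below are illustrative assumptions, not measured values.

```python
# Sketch of multicriteria resource selection: lower weighted score wins.
# Weights express the user's preference between speed and cost.

def score(candidate, w_time, w_cost):
    return w_time * candidate["est_time"] + w_cost * candidate["est_cost"]

candidates = [
    {"name": "cluster_A", "est_time": 120, "est_cost": 50},
    {"name": "cloud_B",   "est_time": 300, "est_cost": 10},
]

# A user who mostly wants speed picks the fast cluster ...
fastest = min(candidates, key=lambda c: score(c, w_time=0.9, w_cost=0.1))
# ... while a user who mostly wants low cost picks the cheap cloud.
cheapest = min(candidates, key=lambda c: score(c, w_time=0.1, w_cost=0.9))
```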

The configuration and coordination of services in service‐based applications and the composition of services are equally important in modern service systems [6]. Services interact with each other via messages. Messaging can be accomplished using a "request‐response" template, in which at a given time only one specific service is invoked by one user (a "one‐to‐one" connection, or synchronous model); using a "publish/subscribe" template, in which many services can respond to one particular event ("one‐to‐many" communication, or asynchronous model); or using intelligent agents that determine the coordination of services, because each agent has at its disposal some knowledge of the business process and can share this knowledge with other agents. Such a system can combine the qualities of service‐oriented systems (SOS), such as interoperability and openness, with multi‐agent system (MAS) properties such as flexibility and autonomy.
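The contrast between the first two templates can be sketched as follows: a direct call for one-to-one request-response, and a broker that fans one event out to many subscribers for publish/subscribe. Names are illustrative.

```python
# Sketch of the two messaging templates: synchronous request-response
# (one-to-one) and publish/subscribe (one-to-many).

class Broker:
    def __init__(self):
        self.subscribers = {}

    def subscribe(self, topic, handler):
        self.subscribers.setdefault(topic, []).append(handler)

    def publish(self, topic, message):
        # Every handler subscribed to the topic reacts to the one event.
        for handler in self.subscribers.get(topic, []):
            handler(message)

# Request-response: one caller invokes one service and waits for the answer.
def solver_service(request):
    return f"solution for {request}"

answer = solver_service("task-1")

# Publish/subscribe: one event, many services respond.
broker = Broker()
received = []
broker.subscribe("results.ready", lambda m: received.append(("logger", m)))
broker.subscribe("results.ready", lambda m: received.append(("plotter", m)))
broker.publish("results.ready", "task-1")
```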
