#### **2.1 Visualization tools**

In contrast to open-source libraries for creating charts, there are also configurable drag-and-drop tools available for data visualization. In this context, we explore Tableau and Qlik Sense, the current market leaders in data visualization and data presentation. These tools make it easier to extract and convey key patterns and insights, which is important in both the Mapping and Model Visualization relations.

**Tableau** is one of the first tools we consider when we talk about commercial data visualization tools. It connects to multiple data sources, offers many switchable chart formats, and provides sophisticated mapping capabilities, making it easy to turn simple Excel data into colorful, highly interactive dashboards [8]. It has a step-by-step configurable interface, from creating charts in sheets, to filtering and combining multiple charts into a dashboard, to overall storytelling.

**Qlik Sense** is a commercial data visualization and analytics tool that enables the user to import and aggregate data from diverse sources and use the software's visualization features to convert raw data into meaningful information. It has an in-memory data storage engine which helps in dynamic visualization building [9]. Qlik Sense also follows a step-by-step procedure similar to Tableau, where each sheet (which is also a dashboard) may contain multiple charts; the sheets are then used to create a story by adding a snapshot of the charts, or the complete sheet, to the storyline.

#### **2.2 Data mining tools**

Another approach is to use data mining tools to assist with the Data Transformation, Data Mining, and Model Building relations. Various such tools are widely used by organizations for data transformation, data mining, and model building, including KNIME, RapidMiner, KEEL, Orange, and WEKA. This subsection discusses only KNIME and RapidMiner, as they are currently the most widely used data mining tools on the market.

**KNIME** [11] is an open-source data analytics tool that is written in Java and is based on the Eclipse platform. KNIME provides a user-friendly workspace and is based on the idea of graphical workflows to design the data analytics pipeline. It provides hundreds of nodes that incorporate data I/O, data cleaning, data manipulation, Machine Learning (ML) methods, scripting, and visualizations that can easily be used to create a workflow using a simple drag-and-drop approach [18].

**RapidMiner** [12] is an open-source tool that integrates a number of packages, including text mining, ML, predictive analysis, DM, and business analytics [18]. Its desktop application, RapidMiner Studio, provides a GUI with which the user can perform DM and predictive tasks by creating workflows and visualizing the output in an interactive representation [19]. RapidMiner supports scripting as well as workflows and is constantly being updated.

#### **2.3 Summary of the state of the art**

While all the tools mentioned above have their pros and cons, none of them covers the data analytics pipeline completely. The business intelligence tools lean towards interactive visualization and presentation of data, while the data mining tools focus on transforming data and applying machine learning models to it. ViDAS fills this void by combining the features of these tools and accommodating the complete visual data analytics pipeline.

#### **3. User-context and requirements**

Identification of the user-context is the first step in a project based on human-centered design. It refers to understanding the users and identifying the intended way the project will be used by them. It can be sufficient to identify only the stakeholders, but in most cases identifying the purpose and scope of the project also helps recognize the environment it will be used in. Similarly, in the ViDAS development, the first step consisted of doing background research on the stakeholders, understanding their needs, and documenting the requirements. It was found that the stakeholders have a simulation model which generates data based on individual simulation runs. This data cannot be easily understood by just looking at its tabular form, which is why advanced analytics is needed to extract crucial information from the raw data and visualize it in order to gather knowledge. Once the context of ViDAS was understood, the next step was to gather the requirements for the development of the tool.

After identifying and understanding the context of ViDAS in terms of human-centered design, the next step is to gather and specify the requirements which will be the basis for the development and evaluation phases. The requirement gathering process is not as simple as discussing the needs of the stakeholders and documenting them. Instead, it includes making the stakeholders realize which requirements are needed according to the scope and context of the project [20]. The requirements of a project are further divided into three main categories: Business Requirements, User Requirements, and System Requirements [21]. In this section, we focus on these three categories of ViDAS requirements and discuss them in detail.

#### **3.1 Business requirements**

Business Requirements are the high-level requirements that answer generic questions to define the overall scope of the project [22]. They also include the stakeholders' objectives and the needs of the target users for whom the system is to be developed. The basic requirements were gathered from the initial communication with the stakeholders and stated that the users of ViDAS will be software engineers who possess some basic knowledge of data analytics and visualization. The users will receive data from their clients and provide them with useful insights derived from the raw, unprocessed data. For ViDAS, the business requirements specified that the user should be able to upload raw data into the tool, pre-process and transform the data, apply machine learning models, and visualize the data to extract useful knowledge. In addition, the user should be able to get a general overview of the complete data set in the form of a visualization. The business requirements for ViDAS are summarized in **Table 1**.


#### **Table 1.**

*Summary of ViDAS business requirements.*

#### **3.2 User requirements**

User Requirements are the specifications of how the user wants to complete certain tasks, and they build on the business requirements of the project. Gathering them includes designing the layout of the system and the sitemap, and developing prototypes while keeping all the user goals in mind. User requirements capture how the system should respond to user input. Once the business requirements were gathered, a series of sessions was arranged with the stakeholders, consisting of group discussions, workshops, and questionnaires, to gather the user requirements and finalize the design of ViDAS. The main idea behind the workshops was to observe the stakeholders while they used a similar tool and to document their approach as well as the steps they took in a data analysis task. With the help of these workshops, important user requirements were identified and gathered.

Two workshops were conducted, and the purpose of both was to gather feedback on how the stakeholders want to perform the data analysis and visualization steps. The first workshop focused on Tableau, a business intelligence tool that allows its users to perform data analytics and create interactive visualizations to find patterns in the data and extract useful information. The second workshop used Qlik Sense and focused more on data analytics, including applying machine learning models to the data and then creating interactive visualizations. The task lists designed for the two workshops were fairly similar, yet each focused on its respective purpose and objectives. Crucial information was gathered during these workshops, and important user requirements were pointed out during the discussions. After the workshops, the stakeholders were asked to fill in questionnaires, summarized in **Table 2**, giving brief feedback regarding the user interaction and user experience they expect.

With the help of the workshops and the questionnaires, the user requirements of ViDAS were collected. They centered on a drag-and-drop implementation of the data analytics and visualization processes. The custom analysis features of Qlik Sense and Tableau did not cover the basic requirements, so custom analysis was prioritized to give ViDAS an advantage over these tools. The custom analysis should be created using a workflow pipeline, and the resulting data should be exported to the visualization tab. The workflows in the custom analysis should offer templates containing advanced machine learning models, as well as the ability to save and load custom workflows. Furthermore, the user should be able to write and execute custom Python scripts. The visualizations should be interactive and created from either charts or fields. In the case of fields, ViDAS automatically detects the dropped fields and creates the best-suited chart based on them. The user requirements for ViDAS are summarized below in **Table 3**.
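The field-based chart creation described above can be sketched as a simple heuristic: inspect the types of the dropped fields and choose a chart accordingly. The rules below are illustrative assumptions, not the actual ViDAS detection logic.

```python
import pandas as pd

def suggest_chart(df: pd.DataFrame, fields: list[str]) -> str:
    """Illustrative heuristic: pick a chart type from the dtypes of the dropped fields."""
    numeric = [f for f in fields if pd.api.types.is_numeric_dtype(df[f])]
    categorical = [f for f in fields if f not in numeric]
    if len(fields) == 1:
        # One numeric field: show its distribution; one categorical field: count it.
        return "histogram" if numeric else "bar"
    if len(numeric) == 2 and not categorical:
        return "scatter"
    if len(numeric) == 1 and len(categorical) == 1:
        return "bar"
    return "table"  # fall back to a tabular view for anything more complex

# Toy simulation data with one categorical and two numeric columns.
df = pd.DataFrame({"run": ["a", "b", "a"],
                   "time": [1.2, 3.4, 2.1],
                   "load": [10, 20, 15]})
print(suggest_chart(df, ["time", "load"]))  # scatter
```

A real implementation would also consider field cardinality and temporal types, but even this small dtype-based rule set covers the common single- and two-field cases.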


#### **Table 2.**

*An overview of the Qlik Sense and Tableau evaluation (1 star - unforgettably bad, 2 stars - below average, 3 stars - average, 4 stars - above average, and 5 stars - unforgettably good).*


#### **Table 3.**

*Summary of ViDAS user requirements.*

#### **3.3 System requirements**

System Requirements are the low-level requirements that act as the basic building blocks on which the system will be developed. They cover all the technical details of the project, including the technology used, the versioning, the compatibility, the database, and whether the system will be hosted on a server. Once the user requirements were finalized, the next step was to define the system requirements based on them. The discussions after the workshops gave a broad idea of the technology to be used in the ViDAS development, thus shaping the system requirements as seen in **Figure 2**. The tool had to be integrated into an already built website developed with React.js as the front-end and Java Spring as the back-end. Due to this constraint, ViDAS was required to be developed using React.js as the front-end, while there was no restriction regarding the back-end implementation. After detailed research, Python was chosen as the back-end language because it is highly dynamic for performing data analytics on big data and packages a number of machine learning libraries. Flask was chosen as the web framework for Python to connect and communicate with the front-end.

**Figure 2.** *ViDAS technology stack.*

For the front-end module, basic HTML was used to define the structure of ViDAS, and CSS was used to style the HTML components for a complete user experience. React.js was used as the JavaScript framework to implement the front-end functionality. To cater to the drag-and-drop requirement, MxGraph was used and customized according to the user requirements of ViDAS. The customized MxGraph further supported the user interface and increased the intuitiveness of the tool while improving the user experience. Plotly was used to create interactive visualizations; these were created in the back-end module and transmitted to the front-end for display to the user.

The front-end was developed to run in the user's browser, whereas the back-end was implemented to run on the server. The communication between client and server was done using Axios, a promise-based HTTP API. With Axios, data was transmitted between the front-end and the back-end in JSON format. Redux, a state management library, was used to manage the state and store the important data used by multiple pages on the front-end. With Redux, data no longer had to be passed between the front-end pages; instead, it was kept in a centralized state container.

The back-end comprises a number of Python libraries to handle the big data and the data analytics functionality. Pandas was used to store the data in tabular form, which helped in performing pre-processing steps before applying the machine learning models. Scikit-learn, a machine learning library, was used to provide the machine learning models and various data analysis methods. The system requirements for ViDAS are summarized below in **Table 4**.
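To illustrate this back-end flow, the sketch below loads tabular simulation data with Pandas, applies a simple pre-processing step, and fits a Scikit-learn model. The column names and the choice of a linear model are illustrative assumptions, not the actual ViDAS analysis.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler

# Illustrative simulation output: each row represents one simulation run.
df = pd.DataFrame({"load": [10, 20, 30, 40],
                   "threads": [1, 2, 4, 8],
                   "runtime": [5.0, 7.5, 11.0, 16.5]})

# Pre-processing step: drop incomplete rows and scale the feature columns.
features = df[["load", "threads"]].dropna()
X = StandardScaler().fit_transform(features)
y = df.loc[features.index, "runtime"]

# Model-building step: fit a regression model and predict on the same runs.
model = LinearRegression().fit(X, y)
preds = model.predict(X)
print(len(preds))  # 4
```

In ViDAS this kind of pipeline would be assembled from the drag-and-drop workflow nodes, with each node mapping to a Pandas or Scikit-learn operation.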

#### **3.4 Requirement outcomes**

This subsection briefly summarizes the outcomes of this section and the business, user, and system requirements that were gathered. The business requirements were gathered during the initial communication with the stakeholders, and as a result, the scope of ViDAS and its users were identified.


#### **Table 4.**

*Summary of ViDAS system requirements.*

To gather the user requirements, two workshops were designed that comprised task sheets. The stakeholders were guided through the tasks and their actions were observed. A number of new requirements were identified during this process. After the workshops, the stakeholders were asked to fill in the open-ended and closed-ended questionnaires that gave shape to the user requirements of ViDAS. Once the user requirements were finalized, a number of post-workshop discussions were arranged in which the technology stack of the ViDAS development was discussed, giving shape to the system requirements of ViDAS.

All the requirements were gathered over a series of meetings, workshops, and discussions, and are summarized above in **Tables 1, 3**, and **4**. With the requirements of ViDAS finalized, the next step in the human-centered design process was to produce the design solutions and implement them.

#### **4. Producing design solution**

The design creation process can be conducted in different ways depending on the scenario, from adapting previous design inspirations to creating something entirely new. Regardless of the source, all design ideas progress through iterative development in the human-centered design approach, and mock-ups and simulations are essential to support this iterative cycle. Various design techniques are available, such as brainstorming, parallel design, storyboarding, paper-based prototyping, and computer-based prototyping. While not all of these techniques need to be used in every product development, the process should at least produce a series of UI (User Interface) screens and a partial database that allow the user to interact with, visualize, and comment on the future design [23]. These early simulations are easy to create and help deliver a largely fault-free product at the end. Involving experts, stakeholders, and user representatives in the design development cycle helps identify faults and correct the design before implementation, rather than undergoing the costly process of re-implementation once the design is finalized.

After the requirements were finalized, as discussed in section 3, the next step was to create the design of ViDAS. Paper prototyping was used as a starting point to create the conceptual design of the tool. The stakeholders were involved in the design process so that their feedback could be incorporated while creating the design. After the paper prototype feedback, the Balsamiq wireframing tool was used to transfer the conceptual design into interactive mock-ups [24] to better communicate the design with the stakeholders. These mock-ups covered the design of each page of the tool.

### *Implementing Visual Analytics Pipelines with Simulation Data DOI: http://dx.doi.org/10.5772/intechopen.96152*

The mock-ups also included the interactivity within each page (buttons, hyperlinks). Once the design was created and approved by the stakeholders, the development of ViDAS was started.

This section is divided into two subsections. The first one discusses the paper prototype of ViDAS. The next subsection addresses the software prototype and the implementation of ViDAS. These subsections will also address different tools that were used during this iterative process.

#### **4.1 Paper prototyping**

Paper prototyping is a widely used approach in the human-centered design process. It is a throwaway prototyping technique used to create the initial conceptual design of a tool or application. Paper prototypes involve creating rough, even hand-sketched, drawings or models of a design. The functionality is simulated by a member of the design team playing the computer and responding to the user's inputs by swapping bits of paper or writing an output. Creating paper prototypes is simple; nevertheless, they can provide convenient stakeholder feedback to aid the design [25].

Based on the requirements, paper prototypes of the different elements were drawn with pen and paper. These paper prototypes cover the data processing, chart creation, and analytics workflows. All the other necessary elements, such as menus, icons, buttons, labels, and dialog sequences, were also drawn. **Figure 3** shows the ViDAS paper prototype; each view has a description of what can be done and what happens when one interacts with individual elements.

The testing of the paper interface was video-taped as the elements were moved and changed. This videotape of the paper prototype was shared with the stakeholders via email to get feedback about the initial design. Paper prototyping was a handy technique for creating the initial ViDAS design and for getting the stakeholders' thoughts about the tool's overall shape. However, during the post-evaluation discussion of the paper prototype, it was found that the paper prototyping approach does not uncover every usability problem. The paper prototype of the initial design was therefore transformed into interactive mockups to address every aspect and give the design a more realistic feel. This also made it easier to communicate the design to the stakeholders.

**Figure 3.** *ViDAS paper prototype.*

The mock-ups of ViDAS were created with Balsamiq [24], an industry-standard, lightweight wireframing tool used to create design mock-ups and to show the interactivity among the different pages and elements of a design. These mock-ups comprise each page of the tool, along with interactions that redirect the user when buttons or hyperlinks are clicked.

**Figure 4** shows different design mock-ups of ViDAS. The Data tab shows a tabular view of the uploaded file. Next to it, the Overview tab is for visually inspecting the overall data. In contrast, the Data Analysis tab shows the chart creation process that focuses on data fields. The Custom Analysis tab shows the workflow of creating analytics using network graphs. In this phase of the design, several changes were carried out, which are summarized in **Table 5** and discussed later on.
