*1.2.2 Visual analytics pipeline*

To apply visual analytics in both research and industrial applications, a well-defined visual analytics pipeline should be followed, since it provides an effective abstraction for designing and implementing visual analytics systems [16]. The most common visual analytics pipeline is shown in **Figure 1**. This conventional pipeline serves as an abstract outline of the visual analytics process, comprising four major procedures and the relationships between them. This subsection examines each procedure and its role in the pipeline.

#### **Figure 1.** *Visual analytics pipeline by Keim et al. [17].*

**Data** involves all the steps required to prepare the data set for visualization and analysis. These steps mainly consist of preparing and cleaning the data, which is often noisy and contains missing values. Collectively known as preprocessing, they include data cleaning, data integration, data transformation, data reduction, and data discretization.
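
Three of these preprocessing steps can be sketched in a few lines of Python. The example below is only an illustration under stated assumptions: the function names, the mean-imputation strategy, the min-max rescaling, and the equal-width binning are choices made here for brevity, not methods prescribed by the pipeline, and the sample readings are made up.

```python
from statistics import mean


def clean(values):
    """Data cleaning: replace missing readings (None) with the mean of the observed ones.

    Mean imputation is an assumption made for this sketch.
    """
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def min_max_scale(values):
    """Data transformation: rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no spread; map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def discretize(values, n_bins):
    """Data discretization: assign each value in [0, 1] to one of n_bins equal-width bins."""
    # min(...) keeps the maximum value 1.0 inside the last bin.
    return [min(int(v * n_bins), n_bins - 1) for v in values]


# Hypothetical sensor readings with two missing values.
raw = [10.0, None, 50.0, 20.0, None, 40.0]
cleaned = clean(raw)
scaled = min_max_scale(cleaned)
binned = discretize(scaled, n_bins=4)
```

In this sketch each step consumes the output of the previous one, which mirrors how preprocessing steps are typically chained before the data reaches the visualization or model stage.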

**Visualization** refers to the visual representation of data, where the focus is on producing images that communicate the relationships between different attributes. This is achieved through a systematic mapping that establishes how data values are represented visually: it determines how, and to what extent, a property of a graphic mark, such as size or color, changes to reflect changes in the data value. Effective visualizations help users analyze the data and make complex data sets more accessible, understandable, and usable.
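
The idea of a systematic mapping from data values to mark properties can be made concrete with a small sketch. The helpers `linear_scale` and `color_scale` below are hypothetical names introduced for illustration, and the linear interpolation, the clamping, and the chosen domain and output ranges are assumptions rather than part of any particular visualization library.

```python
def linear_scale(domain, output_range):
    """Return a function mapping a data value in `domain` to a visual value in `output_range`."""
    d0, d1 = domain
    r0, r1 = output_range

    def scale(value):
        t = (value - d0) / (d1 - d0)   # relative position inside the data domain
        t = max(0.0, min(1.0, t))      # clamp values that fall outside the domain
        return r0 + t * (r1 - r0)

    return scale


def color_scale(domain, start_rgb, end_rgb):
    """Return a function mapping a data value to an RGB color between two end colors."""
    channels = [linear_scale(domain, (s, e)) for s, e in zip(start_rgb, end_rgb)]

    def scale(value):
        return tuple(round(channel(value)) for channel in channels)

    return scale


# Assumed encoding: values from 0 to 100 drive both the radius and the color of a mark.
size = linear_scale(domain=(0, 100), output_range=(2, 22))
color = color_scale((0, 100), start_rgb=(255, 255, 255), end_rgb=(0, 0, 255))

marks = [{"value": v, "radius": size(v), "rgb": color(v)} for v in (0, 50, 100)]
```

Here a value of 50 lies halfway through the domain, so its mark receives a radius halfway between 2 and 22. Changing the output range changes how strongly the mark reacts to a change in the data, which is the "to what extent" part of the mapping.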

**Model** refers to the data analysis or machine learning methods applied to extract information from the data, which can later be visualized to gain knowledge. These methods may combine steps from EDA with machine learning algorithms such as statistical models or clustering models. Once these methods have been applied, the resulting data must be visualized to extract knowledge that helps in understanding the data.
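
As one possible instance of a clustering model, the following sketch implements a basic k-means procedure in plain Python. It is a simplified illustration, not a recommended implementation: the function name `k_means`, the deterministic initialization from the first `k` points, the fixed iteration count, and the sample points are all assumptions made here.

```python
from math import dist


def k_means(points, k, iterations=10):
    """Basic k-means: alternate between assigning points and updating centroids."""
    # Assumption for this sketch: take the first k points as initial centroids
    # so that the result is deterministic.
    centroids = [points[i] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centroids


# Hypothetical two-dimensional data with two visually separate groups.
points = [(1.0, 1.0), (9.0, 9.0), (1.0, 2.0), (2.0, 1.0), (8.0, 9.0), (9.0, 8.0)]
labels, centroids = k_means(points, k=2)
```

The returned `labels` are the kind of derived data that the visualization stage can then encode, for example by coloring each point according to its cluster.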

**Knowledge** refers to the conclusion that either accepts or rejects a hypothesis. It is not a procedure in itself but the end result of the visual analytics pipeline. Knowledge is extracted from the data by applying all of the above procedures, interpreting the final visualization, and gaining insight into the data.
