**3. Our approaches**

*Applications of Pattern Recognition*

not detected, or not available. Suppose one of the interesting patterns is that a gene "xxx" is "detected" in experiments on tissue "yyy" of an organism at a particular developmental stage. The pattern is inconsistent with the dataset if other experiments on tissue "yyy" of the same organism at the same developmental stage show that gene "xxx" is "not detected". Uncertainty about the presence of gene "xxx" also arises where the dataset holds no information about the presence of the gene in an experiment on tissue "yyy" at the same developmental stage. Such missing information can be denoted by "unavailable" or an empty space, among others. Inconsistent data relating to gene expression in tissues at different developmental stages are reported in [17, 19]. Finally, a radiologist's chest X-ray report can be used to detect aortic unfolding, which is mostly associated with systemic hypertension. However, there are instances of aortic unfolding that are not associated with systemic hypertension, and others for which it is not known whether they are associated with systemic hypertension. These instances are inconsistent with a pattern involving systemic hypertension as a cause of aortic unfolding.

**2.1 Visual analysis of inconsistencies in patterns of dataset**

Inconsistent data associated with patterns in a large dataset can be difficult to visualise, because they are not explicitly marked as inconsistent in the dataset. For example, missing data can appear as "unavailable", "forthcoming", "-", "not existing", or even empty spaces. Contradictions, on the other hand, differ from one dataset to another, depending on the semantic definition of the data in the dataset. There are dedicated applications such as CUBIST [19], ConTra [20], and the R package VIM [21] that enable the visualisation of the amount or pattern of contradiction and missingness in a noisy dataset. ConTra depicts inconsistent data whose pattern involves mutually exclusive contradictions. Nwagwu explains in [20] how ConTra detects the contradictory attribute values of the gene "TSPAN6" in the tissue "Pancreas" and visualises them in a pie chart. ConTra applies colour coding to charts to enable the visualisation of inconsistencies in a large dataset. ConTra also enables the visualisation of the pattern of distribution of contradictions across the dataset. It is further discussed in Section 3.11.
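The two kinds of inconsistency described in this section, missingness and mutually exclusive contradiction, can be illustrated with a minimal Python sketch. It is not the implementation of ConTra, CUBIST, or VIM. The function names `normalise` and `summarise`, the record layout (gene, tissue, stage, status), and the sample records are assumptions made purely for illustration; only the missing-value tokens are taken from the text.

```python
from collections import Counter, defaultdict

# Tokens treated as missing values; these are the examples listed in the text.
MISSING_TOKENS = {"", "-", "unavailable", "forthcoming", "not existing"}


def normalise(value):
    """Return None for a missing-value token, else the trimmed, lower-cased value."""
    if value is None:
        return None
    cleaned = value.strip().lower()
    return None if cleaned in MISSING_TOKENS else cleaned


def summarise(records):
    """Group records by (gene, tissue, stage) and label each group.

    A group is "contradictory" if it holds more than one distinct observed
    value, "uncertain" if it has no contradiction but at least one missing
    value, and "consistent" otherwise.
    """
    groups = defaultdict(list)
    for gene, tissue, stage, status in records:
        groups[(gene, tissue, stage)].append(normalise(status))

    summary = {}
    for key, statuses in groups.items():
        observed = {s for s in statuses if s is not None}
        if len(observed) > 1:
            label = "contradictory"
        elif None in statuses:
            label = "uncertain"
        else:
            label = "consistent"
        # Per-value counts, with missing values pooled under "missing".
        counts = Counter("missing" if s is None else s for s in statuses)
        summary[key] = {"label": label, "counts": counts}
    return summary


# Hypothetical records, invented for illustration (not real expression data).
RECORDS = [
    ("xxx", "yyy", "stage 1", "detected"),
    ("xxx", "yyy", "stage 1", "not detected"),
    ("xxx", "yyy", "stage 1", "unavailable"),
    ("aaa", "yyy", "stage 1", "detected"),
    ("aaa", "yyy", "stage 1", ""),
    ("bbb", "yyy", "stage 1", "Detected"),
    ("bbb", "yyy", "stage 1", "detected "),
]

for key, info in summarise(RECORDS).items():
    print(key, info["label"], dict(info["counts"]))
```

The per-group counts are the quantities that a colour-coded pie or bar chart would display; the plotting step is omitted here.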

The R package VIM is an analytical tool that focuses on the visual presentation and analysis of missingness. It plots aggregates of missingness per variable in bar plots, and also shows missing data in matrix plots, histograms, spline plots, parallel coordinate plots, and maps [21]. For example, it uses a bar plot to show the number and distribution of missing values for a sub-sample of the EU-SILC data from Statistics. Although VIM offers a comprehensive collection of visualisation methods for exploring missing data, using them requires extensive training in R. Moreover, VIM does not support the analysis of inconsistencies other than missingness, such as contradictions in a dataset.

Other tools that enable the visualisation of inconsistencies are explained in [19, 22, 23]. A graphical tool proposed in [22] highlights inconsistent instances in a network, such as direct comparisons that strongly drive other treatment effect estimates and hot spots of network inconsistency. It also proposes a clustering approach that automatically groups comparisons to highlight hot spots. CUBIST [19] is an example of an application that applies colour coding and fault tolerance to traditional visualisation tools such as pie or bar charts to enable easy visual analysis of inconsistencies. Even so, these applications are not holistic in exploring inconsistencies in patterns, and most of them are designed for a particular domain of data analysis.


Two approaches for visualising inconsistencies in patterns are presented in this section: visualising inconsistencies in objects with many attribute values, and visual comparison of an investigated dataset with a case control dataset. These approaches, and the associated tools developed by the authors, are discussed below.
