Preface

Nowadays, industry makes increasing use of large amounts of data, driven by the development of new sensors and more complex systems that require increased reliability, availability, and safety. The Internet of Things (IoT) is one example of a source that creates data in large amounts and varieties.

Principal component analysis (PCA) is a statistical procedure increasingly being applied to analyze large datasets. It is mainly used to reduce the dimensionality of a dataset through a linear transformation of its coordinate system. The new, smaller set of information extracted via PCA is a set of summary indices called "principal components." These components are ordered by variance: the first principal component is the direction along which the data show the greatest variance.

The main objective of PCA is to transform a dataset into a smaller space spanned by the eigenvectors of the covariance matrix of the original dataset. The eigenvectors are ranked by the variability they capture; these ranked directions are the principal components. In other words, the method transforms the original dataset into a new p-dimensional set of Cartesian coordinates by projecting the data onto the principal component vectors, whose directions are given by the columns of the matrix P, the retained eigenvectors ordered by decreasing eigenvalue. PCA can also be related to canonical correlation analysis (CCA): CCA obtains coordinate systems that optimally describe the cross-covariance between two datasets considered together, while PCA defines a new orthogonal coordinate system that optimally describes the variance in a single dataset.
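The transformation described above can be sketched in a few lines of Python. This is a minimal illustration, not code from the book: the data are synthetic, and the variable names (`P`, `scores`, `k`) are chosen here for clarity. It computes the covariance matrix of a centered dataset, ranks its eigenvectors by eigenvalue, and projects the data onto the retained components.

```python
import numpy as np

# Synthetic dataset for illustration: 100 samples, 3 features with
# deliberately unequal variances so the component ranking is visible.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3)) * np.array([3.0, 1.0, 0.2])

Xc = X - X.mean(axis=0)               # center each feature
C = np.cov(Xc, rowvar=False)          # p x p covariance matrix
eigvals, eigvecs = np.linalg.eigh(C)  # eigh: for symmetric matrices, ascending

order = np.argsort(eigvals)[::-1]     # rank by variance, largest first
eigvals, P = eigvals[order], eigvecs[:, order]

k = 2                                 # number of retained components
scores = Xc @ P[:, :k]                # projection onto the principal components
print(eigvals)                        # variance captured by each component
```

Because the columns of `P` are orthonormal eigenvectors of the covariance matrix, the projected coordinates (`scores`) are uncorrelated, which is exactly the "new orthogonal coordinate system" property mentioned above.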

PCA is also related to several other algorithms, albeit with some differences, for example, factor analysis, non-negative matrix factorization, correspondence analysis, and K-means clustering. PCA does have some shortcomings, and generalizations of the technique have been developed to overcome these limitations. These include sparse PCA, robust PCA, and nonlinear PCA.

This book provides a comprehensive overview of PCA and its analytical principles. It examines the use of PCA in a variety of fields, including technology, engineering, finance, risk analysis, marketing, economics, and more. It presents practical case studies highlighting the use of PCA in several types of industries to solve problems both small and large using simple and complex algorithms.

> **Fausto Pedro García Márquez** Ingenium Research Group, University of Castilla-La Mancha, Ciudad Real, Spain

**Chapter 1**

**The Foundation for Open Component Analysis: A System of Systems Hyper Framework Model**

*Ana Perišić and Branko Perišić*

**Abstract**

The interoperability and integration of heterogeneous systems, with a high degree of autonomy and time-dependent dynamic configuration over a multilevel and multidimensional feature space, raise the problem of configuration complexity. Due to the emergent nature of a large collection of locally interacting components, the properties and the behavior of the collection may not be fully understood or predicted even when full knowledge of its constituents is available. Simplification is currently addressed either through dimensionality reduction methods, such as Principal Component Analysis (PCA), or through overall ontology management in the Physics of Open Systems (POS) paradigm. The question is: is it possible to cope with the complexity by integrating dimension reduction steps with basic POS concepts on the Large Data Objects (LDOs) holding the structure and behavior of the complex system? The intended mission of this chapter is to formulate a starting System of Systems (SoS) based configurable hyper framework model that may be dynamically improved to better suit the static structure and dynamic behavior of complex SoS configurations. That is the reason why the reflexive integration of POS and different dimensional reduction methods, through an interoperability framework, has been proposed as the main contribution of this research chapter.

**Keywords:** collections complexity, framework modeling, large data objects, principal component analysis, physics of open systems, heterogeneous systems interoperability, system of systems analysis

**1. Introduction**

Globally accepted definitions of digitalization and digital transformation still do not exist, although the terms have been in the field for quite a long time. The Gartner Glossary defines digitalization as *the use of digital technologies to change a business model and provide new revenue and value-producing opportunities; it is the process of moving to a digital business* [1]. This definition accents the higher-granularity mission concerning global system aspects. For particular enterprise systems that have decided to move their business into digital form, there is the challenging activity of developing a completely new set of processes and procedures in compliance
