**4. Conclusions**

Fig. 14. Cooman's plot for classes 2 and 4.

Fig. 15. Cooman's plot for classes 3 and 4.

This study shows the importance of PCA in traceability studies, which can be carried out on different kinds of matrices. Since most products result from the transformation of some raw material, traceability has components deriving both from the geographical fingerprint transferred to the raw material and from the production techniques developed in a specific context.

Moreover, PCA is a very useful tool for dealing with some supervised problems, thanks to its ability to describe objects without altering their native structure. However, it must be noted that, especially in forensics, results originating from a multivariate statistical analysis need to be presented and considered in a court of law with great care. For these kinds of results, the probabilistic approach differs from the one generally adopted for analytical results. In univariate analytical chemistry, the result of a measurement is an estimate of its true value, with its uncertainty stated at a given level of confidence. The use of multivariate statistical analysis in a court of law, on the other hand, implies a comparison between an unknown sample and a data set of known samples belonging to a certain number of classes. There always remains the real possibility that the unknown sample belongs to yet another class, different from those of the known samples. In case 1, for example, the unknown sample might have been produced in a refinery that had not been included in the data matrix used for the comparison; in case 3, the piece of packing tape might not have belonged to any of the rolls analyzed. (Case 2 is different, because sample C was specifically required to be classified as class A or B.)

In these cases, an initial approach to the analytical problem using PCA is fundamental, because it allows the characteristics of the unknown sample to be compared with those of samples whose origin is known. Depending on the results obtained at this step, a potential similarity between the unknown sample and samples from some specific classes may be excluded, or the class showing the best similarity with the unknown sample may be identified.

Results derived from PCA present a true picture of the situation, without any data manipulation or forcing of the system, and as such can form the basis for further deduction and for the application of any other multivariate statistical analysis. A second step might be the application of a discriminant analysis or class-modeling tool and an attempt to classify the sample into one of the classes included in the data matrix. A good result is achieved when the PCA results agree with those of the supervised analysis. However, in a court of law these results would become compelling only alongside other strong evidence from the investigation because, as already stated, the sample has been compared with samples belonging to a limited set of classes (not all existing ones) and the data matrix might not adequately represent the variability within each class.
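To make this two-step workflow concrete, the sketch below illustrates the unsupervised first step on purely hypothetical data: PCA is fitted on samples of known origin, the questioned sample is projected into the same score space, and its position is compared with the known classes. All names and data here are illustrative assumptions, not taken from the cases discussed above.

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical data: measured profiles of samples of known origin,
# grouped into three classes, plus one sample of unknown origin.
rng = np.random.default_rng(0)
known = rng.normal(size=(60, 20))        # 60 known samples, 20 variables
labels = np.repeat(["A", "B", "C"], 20)  # three known classes
unknown = rng.normal(size=(1, 20))       # the questioned sample

# Step 1: fit PCA on the known samples only, then project the unknown
# sample into the same score space -- no class information is used.
pca = PCA(n_components=2).fit(known)
scores = pca.transform(known)
u_score = pca.transform(unknown)

# Compare the unknown sample with each class in the score space,
# e.g. by its distance to the class centroids.
for c in ["A", "B", "C"]:
    centroid = scores[labels == c].mean(axis=0)
    print(c, np.linalg.norm(u_score[0] - centroid))
# A large distance to every centroid is consistent with the unknown
# sample belonging to none of the known classes.
```

A supervised method (discriminant analysis or class modeling) would only be applied afterwards, and, as noted above, it can never rule out membership of a class absent from the data matrix.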

**4**

**Subset Basis Approximation of Kernel Principal Component Analysis**

Yoshikazu Washizawa
*The University of Electro-Communications*
*Japan*

**1. Introduction**

Principal component analysis (PCA) has been extended in various ways because of its simple definition. In particular, non-linear generalizations of PCA have been proposed and used in various areas. Non-linear generalizations of PCA, such as principal curves (Hastie & Stuetzle, 1989) and principal manifolds (Gorban et al., 2008), have more intuitive explanations and formulations than other non-linear dimensionality reduction techniques such as ISOMAP (Tenenbaum et al., 2000) and locally-linear embedding (LLE) (Roweis & Saul, 2000).

Kernel PCA (KPCA) is one of the non-linear generalizations of PCA, obtained by means of the kernel trick (Schölkopf et al., 1998). The kernel trick non-linearly maps input samples to a higher-dimensional space, the so-called feature space $F$. The mapping is denoted by $\Phi$; letting $\mathbf{x}$ be a $d$-dimensional input vector,

$$\Phi : \mathbb{R}^d \to F, \quad \mathbf{x} \mapsto \Phi(\mathbf{x}). \qquad (1)$$

A linear operation in the feature space is then a non-linear operation in the input space. The dimension of the feature space $F$ is usually much larger than the input dimension $d$, and may even be infinite. To avoid explicit calculation in the feature space, a positive definite kernel function $k(\cdot, \cdot)$ satisfying

$$k(\mathbf{x}_1, \mathbf{x}_2) = \langle \Phi(\mathbf{x}_1), \Phi(\mathbf{x}_2) \rangle \quad \forall\, \mathbf{x}_1, \mathbf{x}_2 \in \mathbb{R}^d \qquad (2)$$

is used, where $\langle \cdot, \cdot \rangle$ denotes the inner product.

By using the kernel function, inner products in $F$ are replaced by evaluations of the kernel function $k : \mathbb{R}^d \times \mathbb{R}^d \to \mathbb{R}$. Through this replacement, the problem in $F$ is reduced to a problem in $\mathbb{R}^n$, where $n$ is the number of samples, since the space spanned by the mapped samples is at most an $n$-dimensional subspace. For example, the primal problem of support vector machines (SVMs) in $F$ is reduced to the Wolfe dual problem in $\mathbb{R}^n$ (Vapnik, 1998).
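As a concrete illustration of this reduction, the following minimal NumPy sketch computes KPCA scores using only the $n \times n$ Gram matrix, never forming $\Phi(\mathbf{x})$ explicitly. The Gaussian (RBF) kernel and the value of `gamma` are illustrative choices, not prescribed by this chapter.

```python
import numpy as np

def rbf_kernel(X1, X2, gamma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of X1 and X2."""
    sq = (np.sum(X1**2, axis=1)[:, None]
          + np.sum(X2**2, axis=1)[None, :]
          - 2.0 * X1 @ X2.T)
    return np.exp(-gamma * sq)

def kernel_pca(X, n_components=2, gamma=1.0):
    """Scores of n samples on the leading principal components in F.

    Everything is computed from the n x n Gram matrix K, so the
    (possibly infinite-dimensional) feature space never appears.
    """
    n = X.shape[0]
    K = rbf_kernel(X, X, gamma)          # K[i, j] = <Phi(x_i), Phi(x_j)>
    # Centre the mapped samples in F: K <- HKH with H = I - (1/n) 11^T
    H = np.eye(n) - np.ones((n, n)) / n
    Kc = H @ K @ H
    # Eigendecomposition of the centred Gram matrix (eigh: ascending)
    eigvals, eigvecs = np.linalg.eigh(Kc)
    idx = np.argsort(eigvals)[::-1][:n_components]
    alphas, lambdas = eigvecs[:, idx], eigvals[idx]
    # Scale so that each principal axis has unit norm in F
    alphas = alphas / np.sqrt(lambdas)
    # Scores of the training samples on the non-linear components
    return Kc @ alphas

X = np.random.RandomState(0).randn(100, 5)   # 100 samples, d = 5
scores = kernel_pca(X, n_components=2, gamma=0.5)
print(scores.shape)                          # (100, 2)
```

The eigenproblem is of size $n$ regardless of the dimension of $F$, which is exactly the reduction to $\mathbb{R}^n$ described above; it also shows why large $n$ becomes the computational bottleneck.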

In real problems, $n$ is sometimes too large to solve the problem in $\mathbb{R}^n$. In the case of SVMs, the optimization problem is reduced to a convex quadratic program of size $n$. Even when $n$ is very large, SVMs admit efficient computational techniques such as chunking or sequential minimal optimization (SMO) (Platt, 1999), since SVMs have sparse solutions to the Wolfe dual problem. Once the optimal solution is obtained, only a limited number of learning samples, the so-called support vectors, have to be stored in order to evaluate new input vectors.
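The reason only the support vectors need to be kept follows directly from the form of the dual solution. In the standard formulation (a general fact about SVMs, not specific to this chapter), the decision function is

$$f(\mathbf{x}) = \operatorname{sign}\!\Big( \sum_{i \in \mathcal{S}} \alpha_i y_i\, k(\mathbf{x}_i, \mathbf{x}) + b \Big), \qquad \mathcal{S} = \{\, i : \alpha_i > 0 \,\},$$

where the $\alpha_i$ are the dual variables and $y_i$ the class labels. Samples with $\alpha_i = 0$ drop out of the sum, so only the support vectors in $\mathcal{S}$ enter the evaluation of a new input.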

