2. Notations

x_i = (x_{i1}, x_{i2}, …, x_{iD})^T represents the ith data point in the original D-dimensional space. x without a subscript represents an arbitrary data point.

X = (x_1, x_2, …, x_M) represents the collection of M data points.

N(i) represents the set of K nearest neighbors of the ith data point.

y_i = (y_{i1}, y_{i2}, …, y_{id})^T represents the d-dimensional representation of the ith data point after dimensionality reduction. Similarly, y without a subscript represents an arbitrary data point.

Y = (y_1, y_2, …, y_M) denotes the data collection in the low-dimensional space.

C = XX^T is the covariance matrix of the centered data, where (1/M) ∑_{i=1}^{M} x_i = 0. Centering is usually achieved by mean subtraction: x = x − μ, with μ = (1/M) ∑_{i=1}^{M} x_i.

W = {w_ij} is the weight matrix modeling the graph over pairwise face images. Its entries are specified by different metrics, e.g. w_ij = 1 if the jth data point is one of the K nearest neighbors of the ith data point, and w_ij = 0 otherwise.

D = {d_ij} is the distance matrix measuring the pairwise distances among the data points, using the Euclidean or another distance metric.
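As a minimal illustration of the distance matrix D and the binary K-nearest-neighbor weight matrix W, the following sketch uses NumPy; the toy data and the value of K are assumptions for illustration, not from the chapter:

```python
import numpy as np

# Hypothetical toy data (assumption, not from the chapter):
# M = 6 points in 3 dimensions, one point per column of X.
rng = np.random.default_rng(0)
M, K = 6, 2
X = rng.normal(size=(3, M))

# Distance matrix D = {d_ij}: pairwise Euclidean distances.
dist = np.linalg.norm(X[:, :, None] - X[:, None, :], axis=0)

# Weight matrix W = {w_ij}: w_ij = 1 if the jth point is among the
# K nearest neighbors of the ith point (self excluded), 0 otherwise.
d_no_self = dist.copy()
np.fill_diagonal(d_no_self, np.inf)   # exclude self-neighbors
W = np.zeros((M, M))
W[np.arange(M)[:, None], np.argsort(d_no_self, axis=1)[:, :K]] = 1.0
```

Note that W built this way is generally not symmetric, since K-nearest-neighbor relations are not mutual; some algorithms symmetrize it afterwards.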

Λ = diag(λ_1, λ_2, …, λ_D) is a diagonal matrix whose elements are the D eigenvalues (ranked decreasingly, λ_1 ≥ λ_2 ≥ … ≥ λ_D ≥ 0) obtained by decomposing a D × D matrix.

V = (v_1, v_2, …, v_d, …, v_{D−d+1}, …, v_D) is the projection matrix, which consists of the eigenvectors corresponding to the ranked eigenvalues. (v_1, v_2, …, v_d) are the top d eigenvectors, and (v_{D−d+1}, …, v_D) are the bottom d eigenvectors.

I denotes the identity matrix.
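The centering, covariance, and eigendecomposition notations above can be sketched as follows (a NumPy sketch; the toy data sizes and target dimension are assumptions for illustration):

```python
import numpy as np

# Hypothetical toy data (assumption): M = 100 points in D = 5
# dimensions, one point per column of X; reduce to d = 2 dimensions.
rng = np.random.default_rng(0)
M, D_dim, d = 100, 5, 2
X = rng.normal(size=(D_dim, M))

# Mean subtraction so that (1/M) * sum_i x_i = 0.
mu = X.mean(axis=1, keepdims=True)
X = X - mu

# Covariance matrix of the centered data: C = X X^T (D x D, symmetric).
C = X @ X.T

# Eigendecomposition: Lam = diag(lambda_1, ..., lambda_D) ranked
# decreasingly, V holds the corresponding eigenvectors as columns.
eigvals, eigvecs = np.linalg.eigh(C)   # eigh returns ascending order
order = np.argsort(eigvals)[::-1]      # re-rank decreasingly
Lam = np.diag(eigvals[order])
V = eigvecs[:, order]

# Projecting onto the top d eigenvectors gives the low-dimensional
# representation Y = (y_1, ..., y_M), each y_i of dimension d.
Y = V[:, :d].T @ X
```

Projecting onto the top or the bottom eigenvectors depends on the algorithm: some methods (e.g. PCA-like ones) keep the largest eigenvalues, while others keep the smallest.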

In contrast, template-based methods depend heavily on the training data: if poses similar to the query pose are absent from the training set, the estimated result will be biased. Regression-based methods often require complicated regression models, for example a high-order polynomial; however, such complicated nonlinear functions can cause overfitting, which results in poor generalization. Deformable models require the localization of dense facial features, such as landmarks of facial components, which are strongly influenced by the head pose. Manifold learning-based methods are somewhat limited by problems such as identity and noise sensitivity; however, simple measures can efficiently improve their performance [15]. More importantly, manifold learning-based methods show promising generalization, and the head pose can be easily modeled and better visualized with low-dimensional features.

According to the above analysis, the main focus of this chapter is manifold learning-based head pose estimation. The main notations used in this chapter are listed and introduced in Section 2. In Section 3, classical manifold learning algorithms are elaborated. In Section 4, adaptations and extensions of manifold learning algorithms that are more suitable for head pose estimation are discussed. Section 5 summarizes the work, and some available resources of manifold learning are given.


More details will be given in the following sections.



116 Manifolds - Current Research Areas

