3. Reformulations of LEM, LLE, HLLE and LTSA using local tangent coordinates

### 3.1. Reformulation of Laplacian eigenmaps

The LEM method was introduced by Belkin and Niyogi [4]. Its geometrical motivation can be summarized as follows. Assume that we are searching for a smooth one-dimensional embedding f : M → ℝ from the manifold to the real line so that data points close together on the manifold are also mapped close together on the line. Consider two adjacent points x, z ∈ M, which are mapped to f(x) and f(z), respectively. We can obtain that

$$\left| f(z) - f(\mathbf{x}) \right| \leq \|\nabla\_{\mathcal{M}} f(\mathbf{x})\| \|\mathbf{z}-\mathbf{x}\| + O(\left\|\mathbf{z}-\mathbf{x}\right\|^2) \tag{2}$$

where ∇_M f is the gradient vector field on the manifold. Thus, to first order, ‖∇_M f‖ provides an estimate of how far apart f maps nearby points. When we look for a map that best preserves locality on average, a natural choice is to find the f that minimizes [4]:

$$\Phi\_{\text{lap}}(f) = \int\_{\mathcal{M}} \|\nabla\_{\mathcal{M}} f\|^2 = \int\_{\mathcal{M}} \Delta\_{\mathcal{M}}(f)\, f \tag{3}$$

where the integral is taken with respect to the standard measure over the manifold. Thus, the function f that minimizes Φ_lap(f) has to be an eigenfunction of the Laplace-Beltrami operator Δ_M, which is a key geometric object associated with a Riemannian manifold [14].

Suppose that the tangent coordinate of x ∈ N(x) is given by u. Then, the rule g(u) = f(x) = f∘ψ(u) defines a function g : U → ℝ, where U is the neighborhood of u ∈ ℝ^d. With the help of local tangent coordinates, we can reduce the computation of the gradient vector ∇_M f(x) on the manifold to the computation of the ordinary gradient vector on Euclidean space:

$$\nabla\_{\text{tan}} f(\mathbf{x}) = \nabla g(\boldsymbol{u}) = \left(\frac{\partial g(\boldsymbol{u})}{\partial u^1}, \dots, \frac{\partial g(\boldsymbol{u})}{\partial u^d}\right)^T \tag{4}$$

where u = (u^1, …, u^d) ∈ ℝ^d, and we keep the subscript tan in the notation to make clear that it depends on the coordinate system in T_x(M). Although the tangent gradient vector differs across local coordinate systems, the norm ‖∇_tan f(x)‖ is uniquely defined, so that Eq. (3) can be approximated by estimating the following functional:

$$\tilde{\Phi}\_{\text{lap}}(f) = \int\_{\mathcal{M}} \left\| \nabla\_{\text{tan}} f(\mathbf{x}) \right\|^2 d\mathbf{x} \tag{5}$$

where dx stands for the probability measure on M.

In order to compute the local object ‖∇_tan f(x)‖², we first use a first-order Taylor series expansion of the smooth function f : M → ℝ to approximate {f(x_{i_j})}_{j=1}^k, and together with Eq. (4), we have:

$$\begin{aligned} f(\mathbf{x}\_{i\_j}) &= f(\mathbf{x}\_{i}) + \left(\nabla\_{\text{tan}} f(\mathbf{x}\_{i})\right)^{T} (\mathbf{x}\_{i\_j} - \mathbf{x}\_{i}) + O(\left\|\mathbf{x}\_{i\_j} - \mathbf{x}\_{i}\right\|^{2}) \\ &= g(\boldsymbol{u}\_{j}^{i}) = g(0) + \left(\nabla\_{\text{tan}} f(\mathbf{x}\_{i})\right)^{T} \boldsymbol{u}\_{j}^{i} + O(\left\|\boldsymbol{u}\_{j}^{i}\right\|^{2}) \end{aligned} \tag{6}$$

Over U^i, we define the operator α^i = [g(0), ∇g(0)] = [g(0), ∇_tan f(x_i)] that approximates the function g(u_j^i) by its projection on the basis U_j^i = {1, u_{j,1}^i, …, u_{j,d}^i}:

$$f(\mathbf{x}\_{i\_j}) = g(\boldsymbol{u}\_{j}^{i}) = (\boldsymbol{\alpha}^{i})^{T} \boldsymbol{U}\_{j}^{i} \tag{7}$$

The least-squares estimation of the operator α^i can be computed by:


136 Manifolds - Current Research Areas


$$\underset{\boldsymbol{\alpha}^i}{\text{argmin}} \sum\_{j=1}^k \left( f(\mathbf{x}\_{i\_j}) - (\boldsymbol{\alpha}^i)^T \boldsymbol{U}\_j^i \right)^2 \tag{8}$$

It is easy to show that the least-squares solution of the above objective function is α^i = (U^i)† f^i, where f^i = [f(x_{i_1}), …, f(x_{i_k})] ∈ ℝ^k, U^i = [U_1^i, U_2^i, …, U_k^i] ∈ ℝ^{k×(1+d)}, and (U^i)† denotes the pseudoinverse of U^i. If we define a local gradient operator G^i ∈ ℝ^{d×k}, constructed from the last d rows of (U^i)†, we have ∇_tan f(x_i) = G^i f^i. Furthermore, the local object ‖∇_tan f(x_i)‖² can be computed as:

$$\|\nabla\_{\text{tan}} f(\mathbf{x}\_i)\|^2 = \nabla\_{\text{tan}} f(\mathbf{x}\_i)^T \nabla\_{\text{tan}} f(\mathbf{x}\_i) = (f^i)^T (G^i)^T G^i f^i \tag{9}$$

An unresolved problem in our reformulation is how to connect the local object ‖∇_tan f(x)‖² with the global functional Φ̃_lap(f) in Eq. (5) and its discrete approximation. In Section 4, we will discuss this issue in detail.
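As a concrete illustration of Eqs. (7)-(9), the local gradient operator G^i can be built from the pseudoinverse of the local basis matrix. The following is a minimal NumPy sketch; the function and variable names are our own, and the tangent coordinates are assumed to be given (e.g., computed as in Eq. (1)).

```python
import numpy as np

def local_gradient_operator(U_tan):
    """Local gradient operator G^i from Eqs. (7)-(9).

    U_tan : (k, d) array whose j-th row is the tangent coordinate u_j^i
            of the j-th neighbor of x_i (hypothetical input layout).
    Returns G of shape (d, k) such that grad_tan f(x_i) ~= G @ f_i.
    """
    k, d = U_tan.shape
    # Basis matrix U^i: each row is U_j^i = [1, u_j^i] as in Eq. (7).
    U = np.hstack([np.ones((k, 1)), U_tan])      # shape (k, 1+d)
    U_pinv = np.linalg.pinv(U)                   # shape (1+d, k)
    return U_pinv[1:, :]                         # last d rows give G^i

# Sanity check: for f linear in the tangent coordinates, f(u) = a^T u + b,
# the estimated tangent gradient should recover a exactly (no curvature term).
rng = np.random.default_rng(0)
U_tan = rng.normal(size=(10, 2))                 # k = 10 neighbors, d = 2
a, b = np.array([1.5, -2.0]), 0.7
f_i = U_tan @ a + b                              # f^i = [f(x_{i_1}), ..., f(x_{i_k})]
G = local_gradient_operator(U_tan)
grad = G @ f_i                                   # estimate of nabla_tan f(x_i)
local_obj = f_i @ G.T @ G @ f_i                  # Eq. (9): ||nabla_tan f(x_i)||^2
```

For exactly linear data the overdetermined least-squares fit is exact, so `grad` coincides with the true gradient `a` and the local object equals ‖a‖².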

### 3.2. Reformulation of locally linear embedding

The LLE method was introduced by Roweis and Saul [5]. It is based on simple geometric intuitions, which can be described as follows. Globally, the data points are sampled from a nonlinear manifold, while locally each data point and its neighbors reside on or close to a linear patch of the manifold. Thus, under suitable conditions, the local geometric properties of the neighborhood of each data point in the high-dimensional space can be described by linear coefficients that reconstruct the data point from its neighbors. LLE computes the low-dimensional embedding that is optimized to preserve the local configurations of the data. In each locally linear patch, the reconstruction error in the original LLE can be written as:

$$\hat{\varepsilon}^i = \left\| \mathbf{x}\_i - \sum\_{j=1}^k w\_{i\_j} \mathbf{x}\_{i\_j} \right\|^2 \tag{10}$$

where {w_{i_j}}_{j=1}^k are the reconstruction weights, which encode the geometric information of the high-dimensional inputs and are constrained to satisfy ∑_j w_{i_j} = 1.

Since the geometric structure of the local patch can be approximated by its projection on the tangent space T_{x_i}(M), we utilize the local tangent coordinates to estimate the local objects over the manifold in our reformulation framework. We can write the reconstruction error of each local tangent coordinate as:

$$\varepsilon^i = \Big\|\boldsymbol{u}\_i - \sum\_{j=1}^k w\_{i\_j} \boldsymbol{u}\_j^i\Big\|^2 = \Big\|\sum\_j w\_{i\_j} (\boldsymbol{u}\_i - \boldsymbol{u}\_j^i)\Big\|^2 = \sum\_{j,k} w\_{i\_j} w\_{i\_k} G\_{jk}^i \tag{11}$$

where we have employed the fact that the weights sum to one, and G^i is the local Gram matrix:

$$G\_{jk}^{i} = \langle (\boldsymbol{u}\_{i} - \boldsymbol{u}\_{j}^{i}), (\boldsymbol{u}\_{i} - \boldsymbol{u}\_{k}^{i}) \rangle \tag{12}$$

The optimal weights can be obtained analytically by minimizing the above reconstruction error. We solve the linear system of equations

$$\sum\_{k} G\_{jk}^{i} w\_{i\_k} = 1 \tag{13}$$

and then normalize the solution so that ∑_k w_{i_k} = 1. Consider the problem of mapping the data points from the manifold to a line such that each data point on the line can be represented as a linear combination of its neighbors. Let f(x_{i_1}), …, f(x_{i_k}) denote the mappings of u_1^i, …, u_k^i, respectively. Motivated by the spirit of LLE, the neighborhood of f(x_i) should share the same geometric information as the neighborhood of u_i, so we can define the following local object:

$$|\sigma\_f(\mathbf{x}\_i)|^2 = \Big|f(\mathbf{x}\_i) - \sum\_{j=1}^k w\_{i\_j} f(\mathbf{x}\_{i\_j})\Big|^2 = (f^i)^T (W^i)^T W^i f^i \tag{14}$$

where W^i = [1, −w_i] ∈ ℝ^{1×(k+1)} and f^i = [f(x_i), f(x_{i_1}), …, f(x_{i_k})]. The optimal mapping f can be obtained by minimizing the following global functional:

$$\mathcal{E}(f) = \int\_{\mathcal{M}} |\sigma\_f(\mathbf{x})|^2 d\mathbf{x} \tag{15}$$

where dx stands for the probability measure on the manifold.
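The weight computation in Eqs. (11)-(13) can be sketched in a few lines of NumPy. The names below are our own, and the small ridge term added to the Gram matrix is a standard practical fix for the case k > d (where the Gram matrix is singular), not part of the derivation above.

```python
import numpy as np

def lle_weights(u_i, U_nbrs, reg=1e-9):
    """Reconstruction weights from the local Gram matrix, Eqs. (11)-(13).

    u_i    : (d,) tangent coordinate of the center point x_i.
    U_nbrs : (k, d) array whose rows are the neighbor coordinates u_j^i.
    """
    diff = u_i - U_nbrs                       # rows: u_i - u_j^i
    G = diff @ diff.T                         # local Gram matrix, Eq. (12)
    # Ridge regularization (assumption): keeps the solve well-posed when k > d.
    G = G + reg * np.trace(G) * np.eye(len(U_nbrs))
    w = np.linalg.solve(G, np.ones(len(U_nbrs)))   # linear system of Eq. (13)
    return w / w.sum()                        # normalize so sum_j w_{i_j} = 1

# Example: a point lying at the mean of its neighbors sits in their affine
# hull, so the optimal affine weights reconstruct it (almost) exactly.
rng = np.random.default_rng(1)
U_nbrs = rng.normal(size=(6, 2))
u_i = U_nbrs.mean(axis=0)
w = lle_weights(u_i, U_nbrs)
```

The normalization step mirrors the text: Eq. (13) is solved first and the constraint ∑_k w_{i_k} = 1 is imposed afterwards by rescaling.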

### 3.3. Reformulation of Hessian eigenmaps

The HLLE method was introduced by Donoho and Grimes [6]. In contrast to LLE, which obtains a linear embedding by minimizing the ℓ² error in Eq. (10), HLLE achieves the embedding by minimizing a Hessian functional on the manifold where the data points reside. HLLE supposes that, if the manifold is locally isometric to an open connected subset of ℝ^d, the low-dimensional coordinates can be obtained from the (d+1)-dimensional null space of the functional ℋ(f), which measures the average curviness of f over the manifold. We can measure the functional ℋ(f) by averaging the Frobenius norm of the Hessians over the manifold M as [6]:

$$\mathcal{H}(f) = \int\_{\mathcal{M}} \|H\_f^{\text{tan}}(\mathbf{x})\|\_F^2 \, d\mathbf{x} \tag{16}$$

where H_f^tan stands for the Hessian of f in tangent coordinates. In order to estimate the local Hessian matrix, we first perform a second-order Taylor expansion, at a fixed x_i, of the smooth function f : M → ℝ (assumed C² near x_i) evaluated at the neighbors {f(x_{i_j})}_{j=1}^k:

#### A Fusion Scheme of Local Manifold Learning Methods http://dx.doi.org/10.5772/66303 139

$$\begin{aligned} f(\mathbf{x}\_{i\_j}) &\approx f(\mathbf{x}\_i) + (\nabla f)^T (\mathbf{x}\_{i\_j} - \mathbf{x}\_i) + \frac{1}{2} (\mathbf{x}\_{i\_j} - \mathbf{x}\_i)^T H\_f^i (\mathbf{x}\_{i\_j} - \mathbf{x}\_i) \\ &= g(\boldsymbol{u}\_j^i) = g(0) + (\nabla g)^T \boldsymbol{u}\_j^i + \frac{1}{2} (\boldsymbol{u}\_j^i)^T H\_f^i \boldsymbol{u}\_j^i + O(\|\boldsymbol{u}\_j^i\|^3) \end{aligned} \tag{17}$$

Here, ∇f = ∇g is the gradient defined in Eq. (4), and H_f^i is the local Hessian matrix defined as:


$$(H\_f^i)\_{p,q}(\mathbf{x}) = \frac{\partial}{\partial u\_p} \frac{\partial}{\partial u\_q} g(\boldsymbol{u}) \tag{18}$$

where g : U → ℝ uses the local tangent coordinates and satisfies the rule g(u) = f(x) = f∘ψ(u). In the second identity of Eq. (17), we have exploited the fact that u_i^i = ⟨V^i, x_i − x_i⟩ = 0 [recall the computation of local tangent coordinates in Eq. (1)].

Over U^i, we define the operator β^i that approximates the function g(u_j^i) by its projection on the basis U_j^i = {1, u_{j,1}^i, …, u_{j,d}^i, (u_{j,1}^i)², …, (u_{j,d}^i)², u_{j,1}^i·u_{j,2}^i, …, u_{j,d−1}^i·u_{j,d}^i}, and we have:

$$f(\mathbf{x}\_{i\_j}) = g(\boldsymbol{u}\_j^i) = (\boldsymbol{\beta}^i)^T \boldsymbol{U}\_j^i \tag{19}$$

Let β^i = [g(0), ∇g, h^i] ∈ ℝ^{1+d+d(d+1)/2}; then h^i ∈ ℝ^{d(d+1)/2} is the vector form of the local Hessian matrix H_f^i over the neighborhood N(x_i). The least-squares estimation of the operator β^i can be obtained by:

$$\underset{\boldsymbol{\beta}^i}{\text{argmin}} \sum\_{j=1}^k \left( f(\mathbf{x}\_{i\_j}) - (\boldsymbol{\beta}^i)^T \boldsymbol{U}\_j^i \right)^2 \tag{20}$$

The least-squares solution is β^i = (U^i)† f^i, where f^i = [f(x_{i_1}), …, f(x_{i_k})] ∈ ℝ^k, U^i = [U_1^i, U_2^i, …, U_k^i] ∈ ℝ^{k×(1+d+d(d+1)/2)}, and (U^i)† signifies the pseudoinverse of U^i. Notice that h^i, the vector form of the local Hessian matrix H_f^i, corresponds to the last d(d+1)/2 components of β^i. We can therefore construct the local Hessian operator H^i ∈ ℝ^{(d(d+1)/2)×k} from the last d(d+1)/2 rows of (U^i)†, which gives h^i = H^i f^i. Thus, the local object ‖H_f^tan(x_i)‖_F² can be estimated as:

$$\|H\_f^{\text{tan}}(\mathbf{x}\_i)\|\_F^2 = (h^i)^T h^i = (f^i)^T (H^i)^T H^i f^i \tag{21}$$
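The construction of the local Hessian operator in Eqs. (19)-(21) can be sketched as follows. The names are our own, and the monomial ordering of the quadratic basis is one arbitrary but fixed choice.

```python
import numpy as np

def local_hessian_operator(U_tan):
    """Local Hessian operator H^i of Eqs. (19)-(21).

    U_tan : (k, d) array of neighbor tangent coordinates u_j^i.
    Returns H of shape (d(d+1)/2, k) so that h^i = H @ f_i.
    """
    k, d = U_tan.shape
    # Quadratic basis U_j^i = [1, linear terms, squares and cross terms].
    cols = [np.ones(k)] + [U_tan[:, p] for p in range(d)]
    for p in range(d):
        for q in range(p, d):
            cols.append(U_tan[:, p] * U_tan[:, q])
    U = np.column_stack(cols)                 # shape (k, 1 + d + d(d+1)/2)
    U_pinv = np.linalg.pinv(U)
    return U_pinv[1 + d:, :]                  # last d(d+1)/2 rows give H^i

# Sanity check: f(u) = u_1^2 is a pure quadratic, so for d = 2 (basis
# order u_1^2, u_1 u_2, u_2^2) the recovered coefficient vector h^i
# should be [1, 0, 0], and the local object of Eq. (21) equals h^T h.
rng = np.random.default_rng(2)
U_tan = rng.normal(size=(12, 2))              # k = 12 neighbors, d = 2
f_i = U_tan[:, 0] ** 2
H = local_hessian_operator(U_tan)
h = H @ f_i
local_obj = f_i @ H.T @ H @ f_i
```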

### 3.4. Reformulation of local tangent space alignment

The LTSA method was introduced by Zhang and Zha [7]. It is based on geometric intuitions similar to those of LLE: if the data set is sampled from a smooth manifold, the neighbors of each data point remain nearby and similarly co-located in the low-dimensional space. LLE constructs the low-dimensional data so that the local linear relations of the original data are preserved, while LTSA constructs a locally linear patch to approximate the tangent space at each point. The coordinates provided by the tangent space give a low-dimensional representation of the patch. From Eq. (6), we can obtain:

$$f(\mathbf{x}\_{i\_j}) = f(\mathbf{x}\_i) + \left(\nabla\_{\text{tan}} f(\mathbf{x}\_i)\right)^T \boldsymbol{u}\_j^i + O(\|\boldsymbol{u}\_j^i\|^2) \tag{22}$$

From the above equation, we see that there is a relation between the global coordinate f(x_{i_j}) in the low-dimensional feature space and the local coordinate u_j^i, which represents the local geometry. The LTSA algorithm requires that the global coordinates f(x_{i_j}) respect the local geometry determined by the u_j^i:

$$f(\mathbf{x}\_{i\_j}) \approx f(\mathbf{x}\_{i}) + L\_{i} \boldsymbol{u}\_{j}^{i}, \tag{23}$$

where f(x_i) is the mean of the f(x_{i_j}), j = 1, …, k. Inspired by LTSA, the affine transformation L_i should align the local coordinates with the global coordinates, and we can define the following local object:

$$|\kappa\_f(\mathbf{x}\_i)|^2 = |(f^i)^T - \frac{1}{k}(f^i)^T e e^T - L\_i \mathbf{U}^i|^2,\tag{24}$$

where f^i = [f(x_{i_1}), …, f(x_{i_k})]^T, U^i = [u_1^i, u_2^i, …, u_k^i], and e is a k-dimensional column vector of all ones. Naturally, we seek the optimal mapping f and a local affine transformation L_i that minimize the following global functional:

$$\mathcal{K}(f) = \int\_{\mathcal{M}} |\kappa\_f(\mathbf{x})|^2 d\mathbf{x} \tag{25}$$

Obviously, the optimal affine transformation L_i that minimizes the local reconstruction error for a fixed f^i is given by:

$$L\_i = (f^i)^T \left( I - \frac{1}{k} e e^T \right) (U^i)^\dagger \tag{26}$$

and therefore,

$$|\kappa\_f(\mathbf{x}\_i)|^2 = \Big|(f^i)^T \Big(I - \frac{1}{k} e e^T\Big) \big(I - (U^i)^\dagger U^i\big)\Big|^2, \tag{27}$$

Let W^i = (I − (U^i)†U^i)^T (I − (1/k)ee^T)^T; then the local object |κ_f(x_i)|² can be estimated as:

$$|\kappa\_f(\mathbf{x}\_i)|^2 = \Big|(f^i)^T \Big(I - \frac{1}{k} e e^T\Big)\big(I - (U^i)^\dagger U^i\big)\Big|^2 = (f^i)^T (W^i)^T W^i f^i \tag{28}$$
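The local alignment matrix W^i of Eq. (28) can be assembled directly from the centered tangent coordinates. Below is a minimal NumPy sketch under our own naming; the sanity check uses the fact that a function affine in the centered tangent coordinates is aligned exactly by the optimal L_i, so its local object vanishes.

```python
import numpy as np

def ltsa_local_matrix(U_tan):
    """Local alignment matrix W^i of Eqs. (27)-(28).

    U_tan : (d, k) array whose columns are the centered tangent
            coordinates [u_1^i, ..., u_k^i].
    Returns W of shape (k, k) with |kappa_f(x_i)|^2 = f_i^T W^T W f_i.
    """
    d, k = U_tan.shape
    C = np.eye(k) - np.ones((k, k)) / k            # centering: I - (1/k) e e^T
    P = np.eye(k) - np.linalg.pinv(U_tan) @ U_tan  # I - (U^i)^dagger U^i
    return (C @ P).T                               # W^i as defined above Eq. (28)

# Sanity check: f affine in the centered tangent coordinates is
# reproduced exactly, so the local object is (numerically) zero.
rng = np.random.default_rng(3)
raw = rng.normal(size=(2, 8))                      # d = 2, k = 8
U_tan = raw - raw.mean(axis=1, keepdims=True)      # center the coordinates
W = ltsa_local_matrix(U_tan)
f_i = U_tan.T @ np.array([2.0, -1.0]) + 0.5
local_obj = f_i @ W.T @ W @ f_i                    # Eq. (28), ~= 0
```

Centering the tangent coordinates (their mean is subtracted, as in the construction of Eq. (1)) is what makes the affine case exact here.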
