**1. Introduction**


566 Advances in Wavelet Theory and Their Applications in Engineering, Physics and Technology


3D graphics applications make use of polygonal 3D meshes for representing objects' shapes. The recent introduction of high-performance laser scanners and fast microcomputer systems gave rise to high-definition graphics applications. In such applications, objects with complex textures are represented using dense 3D meshes consisting of hundreds of thousands of vertices. Due to their enormous data size, such highly-detailed 3D meshes are rather intricate to store, costly to transmit via bandwidth-limited transmission media, and hard to display on end-user terminals with diverse display capabilities. Scalable compression, wherein the source representation can be adapted to the users' requests, the available bandwidth and the computational capabilities, is thus of paramount importance in order to make efficient use of the available resources to process, store and transmit high-resolution meshes.

State-of-the-art scalable mesh compression systems can be divided into two main categories. A first category includes codecs that directly compress the irregular-topology meshes in the spatial domain. In such codecs, the connectivity information is encoded losslessly, while mesh simplification methods such as vertex coalescing (Rossignac & Borrel, 1993), edge decimation (Soucy & Laurendeau, 1996) and edge collapsing (Ronfard & Rossignac, 1996) are employed to encode geometry. These mesh simplification methods progressively remove those mesh vertices which yield the smallest distortion. In order to enable the reconstruction of the original mesh at various levels of detail (LODs), the discarded vertices are encoded in the compressed bit-stream. Mesh compression systems belonging to this category include Progressive Meshes (Li & Kuo, 1998), (Pajarola & Rossignac, 2000) and Topological Surgery (Taubin et al., 1998). These techniques generally exhibit two major drawbacks: first, due to the highly irregular topology of the input mesh, a large source rate is needed for lossless encoding of connectivity; second, encoding the removed vertices in the compressed bit-stream is quite costly for high-resolution meshes. Therefore, such schemes are not useful for complex meshes containing a large number of vertices. An alternative that solves the problem of the large source rates needed to encode the connectivity information is remeshing, which can be used to convert the original irregular mesh into a mesh consisting of regular elements, such as B-spline (Eck & Hoppe, 1996) or subdivision-connectivity patches (Eck et al., 1995). The regular

Optimized Scalable Wavelet-Based Codec Designs for Semi-Regular 3D Meshes 569


mesh lends itself better to compression, and hence, compared to the irregular mesh, a much lower rate is needed to losslessly encode its connectivity information. Furthermore, multiresolution techniques alleviate the second problem of having to encode all the original vertices, because only detail information has to be encoded in order to create multiple LODs (or multiple resolution levels). Remeshing and subdivision-based multiresolution analysis (Lounsbery et al., 1997) are the two major components of the second category of codecs, which use space-frequency dilation methods such as wavelet transforms to decorrelate the input mesh data (Khodakovsky et al., 2000), (Denis et al., 2010b). The generated wavelet coefficients are compressed using tree-based bit-plane coding methods (Shapiro, 1993), (Munteanu et al., 1999b) to achieve high compression efficiency. Multiresolution mesh compression techniques provide substantial compression gains compared to competing schemes, and in this chapter we will confine our discussion to these techniques only.

In the recent past, several multiresolution scalable mesh compression schemes have been proposed. The majority of these schemes use coding techniques which were originally developed for image compression. However, in general, image and mesh data exhibit different statistical characteristics: images consist of pixels (with intensities), while mesh data involve geometry, i.e., the positions of vertices in 3D space. Thus, one must be cautious when extrapolating image compression techniques towards mesh geometry encoding.

In this book chapter, we propose a constructive design methodology for multiresolution-scalable mesh compression systems. The input mesh is assumed to possess subdivision connectivity (Lounsbery et al., 1997), i.e., the connectivity in the mesh is built through subdivision<sup>1</sup>. A 3D mesh with subdivision connectivity is also referred to as a semi-regular mesh. With respect to *design*, we address two major aspects of scalable wavelet-based mesh compression systems, namely the optimality of embedded quantization in scalable mesh coding and the type of coefficient dependencies that can assure the best compression performance. In this context, thorough analyses investigating the aforementioned aspects are carried out to establish the most appropriate design choices. Later on, the derived design choices are integrated as components of the scalable mesh coding system to achieve state-of-the-art compression performance.

The remainder of the book chapter is organized as follows: in Section 2, a brief overview of multiresolution analysis of the mesh geometry is given. Section 3 presents a model-based theoretical investigation of optimal embedded quantization in wavelet-based mesh coding. An information theoretic analysis of the statistical dependencies among wavelet coefficients and the conclusions regarding the best exploitable statistical dependency are detailed in Section 4. Section 5 gives an overview of the state-of-the-art mesh compression systems.

## **2. Multiresolution analysis of semi-regular meshes**

A 3D mesh $M = \{\mathbf{c}, \mathbf{p}\}$ is generally represented as a set of two components, a vertex list $\mathbf{c}$ and a polygon list $\mathbf{p}$. $\mathbf{c}$ is a matrix whose $i$th row $c_i$ contains the $x$, $y$ and $z$ position of the $i$th vertex, i.e., $c_i = \left(c_{i,x}, c_{i,y}, c_{i,z}\right)$. $\mathbf{p}$ is a list of polygons made up of edges, where each edge is a line connecting two vertices. In computer graphics, 3D meshes are constructed

<sup>1</sup> In general, an initial remeshing step (Eck et al., 1995) is required to convert the original irregular mesh into a mesh with the required connectivity.

using different polygonal shapes, e.g., triangles, rectangles, etc. However, in this chapter, we will confine our discussion to triangular meshes only.
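The $M = \{\mathbf{c}, \mathbf{p}\}$ representation just described can be sketched as follows. This is a toy tetrahedron; the variable names `vertices` and `triangles` and the helper `edges` are illustrative, not taken from the chapter:

```python
# c: an N x 3 list of vertex positions; p: a list of triangles, each a
# triple of vertex indices (each index pair of a triangle forms an edge).
vertices = [
    (0.0, 0.0, 0.0),   # c_1
    (1.0, 0.0, 0.0),   # c_2
    (0.0, 1.0, 0.0),   # c_3
    (0.0, 0.0, 1.0),   # c_4
]
triangles = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]  # a tetrahedron

def edges(tris):
    """Collect the undirected edge set implied by the triangle list."""
    e = set()
    for a, b, c in tris:
        e |= {tuple(sorted(p)) for p in ((a, b), (b, c), (c, a))}
    return e

print(len(edges(triangles)))  # a tetrahedron has 6 edges
```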

In the following, a brief theoretical overview of semi-regular multiresolution analysis is presented. Later on, two practical transforms, namely the lifting-based wavelet transform and the spatially adapted wavelet transform are detailed.

## **2.1 Theory**


#### **2.1.1 Subdivision surfaces**

Subdivision is a process of iteratively refining a control polyhedron $M^0$ into finer polyhedra such that the refined polyhedra $M^1, M^2, M^3, \ldots$ converge to a limit surface $M$. In general, subdivision schemes consist of *splitting* and *averaging* steps. In the *splitting* step, each triangular face is split into four sub-triangles by adding new vertices. This way, an intermediate polyhedron $M^j$ is created for any level $j$. The *averaging* step is used to determine the position of each vertex in $M^j$ from its local neighborhood of vertices in $M^j$, $j = 1, 2, \ldots, J$.
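The *splitting* step can be sketched as a connectivity-only refinement: each triangle is replaced by four sub-triangles, with one new vertex per edge (shared between neighboring faces). The function and variable names below are ours, not the chapter's:

```python
def subdivide_connectivity(num_vertices, triangles):
    """One splitting step: every triangle is split 1-to-4 by inserting a
    new vertex on each edge. Returns the new vertex count and face list."""
    midpoint = {}                 # undirected edge -> new vertex index
    next_index = num_vertices

    def edge_vertex(a, b):
        nonlocal next_index
        key = (min(a, b), max(a, b))
        if key not in midpoint:   # shared edges get a single new vertex
            midpoint[key] = next_index
            next_index += 1
        return midpoint[key]

    refined = []
    for a, b, c in triangles:
        ab, bc, ca = edge_vertex(a, b), edge_vertex(b, c), edge_vertex(c, a)
        refined += [(a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)]
    return next_index, refined

n, tris = subdivide_connectivity(3, [(0, 1, 2)])
print(n, len(tris))  # 6 4  (one triangle -> 3 edge vertices, 4 faces)
```

Note that this only creates the connectivity of the intermediate polyhedron; the positions of the new vertices are set by the scheme-specific *averaging* step.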

Fig. 1. Butterfly subdivision stencil.

$\mathbf{P}^j \in \mathbb{R}^{N_{j+1} \times N_j}$ and $\mathbf{Q}^j \in \mathbb{R}^{N_{j+1} \times N_{j+1}}$ (where $N_j$ denotes the number of vertices of $M^j$) are the splitting and averaging matrices at level $j$. The subdivision process, expressed in matrix form, can be written as:

$$\mathbf{c}^{j+1} = \mathbf{Q}^j \cdot \mathbf{P}^j \cdot \mathbf{c}^j \; , \; j = 0, 1, 2, \dots, J-1 \; .$$

A commonly used subdivision scheme is Butterfly subdivision (Dyn et al., 1990). The subdivision stencil for Butterfly is shown in Fig. 1, where the position of a newly introduced vertex $p$ is computed as $c_p = \sum_{i=1}^{8} a_i \cdot c_i$, whereby the $a_i$'s denote the Butterfly weights (Dyn et al., 1990). Loop (Loop et al., 2009) and Catmull-Clark (Catmull & Clark, 1978) are among the other commonly used subdivision schemes for 3D meshes.
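As an illustration of $c_p = \sum_{i=1}^{8} a_i \cdot c_i$ over the eight-vertex stencil of Fig. 1, a sketch using the commonly quoted zero-tension Butterfly weights (an assumption on our part; the chapter defers the exact $a_i$ to Dyn et al., 1990):

```python
# Zero-tension Butterfly weights: 1/2 for the edge endpoints c1, c2;
# 1/8 for the two opposite vertices c3, c4; -1/16 for the wings c5..c8.
# They sum to 1, so the prediction is an affine combination.
BUTTERFLY_WEIGHTS = (0.5, 0.5, 0.125, 0.125,
                     -0.0625, -0.0625, -0.0625, -0.0625)

def butterfly_predict(stencil):
    """c_p = sum_i a_i * c_i over the 8-vertex stencil of 3D points."""
    assert len(stencil) == 8
    return tuple(
        sum(a * c[k] for a, c in zip(BUTTERFLY_WEIGHTS, stencil))
        for k in range(3)
    )

# Affine invariance: a constant stencil is reproduced exactly.
print(butterfly_predict([(1.0, 2.0, 3.0)] * 8))  # (1.0, 2.0, 3.0)
```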

#### **2.1.2 Multiresolution analysis**

Lounsbery (Lounsbery et al., 1997) first introduced multiresolution analysis for arbitrary-topology semi-regular surfaces using subdivision. He proved that refinable bases exist when a coarse mesh $M^0$ is refined through subdivision, i.e.,

$$\boldsymbol{\Phi}^{j}\left(\mathbf{x}\right) = \boldsymbol{\Phi}^{j+1}\left(\mathbf{x}\right) \cdot \mathbf{P}^{j} \text{ for } \mathbf{x} \in M^{0} \text{ and } \; 0 \le j < J \; . \tag{1}$$


$\boldsymbol{\Phi}^j(\mathbf{x})$ in the above equation denotes the row vector of scaling functions $\phi_i^j$. Given these refinable scaling functions, scalar-valued function spaces associated with the coarsest geometry $M^0$ are defined as (Lounsbery et al., 1997):

$$V^j\left(M^0\right) \coloneqq \text{Span}\left(\Phi^j\left(\mathbf{x}\right)\right), \text{ for } 0 \le j < J \text{ .}\tag{2}$$

Eq (1) implies that these spaces are indeed nested, i.e.,

$$V^0(M^0) \subset V^1(M^0) \subset V^2(M^0) \subset \dots \; . \tag{3}$$

The wavelet space $W^j(M^0)$ is defined as the orthogonal complement of $V^j(M^0)$ in $V^{j+1}(M^0)$. Hence, $W^j(M^0)$ and $V^j(M^0)$ together can represent any scalar-valued piecewise function in the space $V^{j+1}(M^0)$. If $\boldsymbol{\Psi}^j(\mathbf{x})$ is a row vector containing refinable basis functions of $W^j(M^0)$, the following holds (Lounsbery et al., 1997):

$$\boldsymbol{\Psi}^{j}(\mathbf{x}) = \boldsymbol{\Phi}^{j+1}(\mathbf{x}) \cdot \mathbf{Q}^{j}, \text{ for } \mathbf{x} \in M^{0} \text{ and } 0 \le j < J \; . \tag{4}$$

Combining (1) with (4) yields

$$\left(\boldsymbol{\Phi}^{j}(\mathbf{x}), \boldsymbol{\Psi}^{j}(\mathbf{x})\right) = \boldsymbol{\Phi}^{j+1}(\mathbf{x}) \cdot \left(\mathbf{P}^{j} \,|\, \mathbf{Q}^{j}\right), \text{ or } \left(\boldsymbol{\Phi}^{j}(\mathbf{x}), \boldsymbol{\Psi}^{j}(\mathbf{x})\right) \cdot \left(\mathbf{P}^{j} \,|\, \mathbf{Q}^{j}\right)^{-1} = \boldsymbol{\Phi}^{j+1}(\mathbf{x}) \; . \tag{5}$$

A set of scaling functions $\boldsymbol{\Phi}^{j+1}(\mathbf{x})$ can then be used to decompose a surface $S^{j+1}$ in $V^{j+1}(M^0)$, i.e.,

$$S^{j+1} = \sum_{i} c_i^{j+1} \phi_i^{j+1} = \boldsymbol{\Phi}^{j+1}(\mathbf{x}) \cdot \mathbf{c}^{j+1} \,, \tag{6}$$

where $c_i^{j+1}$ is the $i$th vertex in $M^{j+1}$. Since the analysis filters are uniquely determined by the relationship

$$\left(\mathbf{P}^{j} \,|\, \mathbf{Q}^{j}\right)^{-1} = \left(\frac{\mathbf{A}^{j}}{\mathbf{B}^{j}}\right), \tag{7}$$

by combining Eq (5) and Eq (6) and making the above substitution for $\left(\mathbf{P}^{j} \,|\, \mathbf{Q}^{j}\right)^{-1}$, we obtain:

$$S^{j+1} = \boldsymbol{\Phi}^j(\mathbf{x}) \cdot \mathbf{A}^j \cdot \mathbf{c}^{j+1} + \boldsymbol{\Psi}^j(\mathbf{x}) \cdot \mathbf{B}^j \cdot \mathbf{c}^{j+1} \,. \tag{8}$$

From Eq (8) one derives the forward wavelet transform, given by:

$$\mathbf{c}^{j} = \mathbf{A}^{j} \cdot \mathbf{c}^{j+1}, \quad \mathbf{d}^{j} = \mathbf{B}^{j} \cdot \mathbf{c}^{j+1}, \quad \forall j: 0 \le j < J \; . \tag{9}$$


Fig. 2. Pictorial representation of the forward wavelet decomposition (Lounsbery et al., 1997).

where $\mathbf{d}^j$ is a matrix containing the wavelet coefficients of the $j$th level of the transform. In general, after the transform, a fair amount of correlation still exists between the $x$, $y$ and $z$ wavelet components. A local frame representation (Khodakovsky et al., 2000) of the wavelet coefficients is therefore often used to make the components much more independent. After the local frame transformation, each wavelet coefficient consists of a *normal* and two *tangential* components. $\mathbf{A}^j$ and $\mathbf{B}^j$ are matrices representing the low-pass and high-pass filters, respectively, also referred to as the analysis filter pair.

A similar reasoning as for Eq (9) can be used to formulate the inverse wavelet transform, expressed by:

$$\mathbf{c}^{j+1} = \mathbf{P}^j \cdot \mathbf{c}^j + \mathbf{Q}^j \cdot \mathbf{d}^j \text{ , for } \forall j \colon 0 \le j < J \text{ .} \tag{10}$$

Hence, the $\mathbf{P}^j$ and $\mathbf{Q}^j$ jointly form the synthesis part of the decomposition for the lossless reconstruction of the input semi-regular mesh $M^J$. Note that the computation of the $\mathbf{A}^j$ and $\mathbf{B}^j$ involves the inversion of a large matrix, which makes the forward transform more complex than the inverse transform.
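To make the relations of Eqs (7), (9) and (10) concrete, here is a toy one-dimensional round trip. The matrices below are illustrative choices (the simplest possible splitting: two coarse vertices, one new vertex predicted as the edge midpoint), not the chapter's Butterfly filters:

```python
# Synthesis pair: (P | Q) maps (c^j, d^j) to c^{j+1} and is invertible.
P = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]   # coarse -> fine (midpoint rule)
Q = [[0.0], [0.0], [1.0]]                  # detail -> fine
# Analysis pair: the rows of (P | Q)^{-1}, split as in Eq (7).
A = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]     # low-pass: keeps c_1, c_2
B = [[-0.5, -0.5, 1.0]]                    # high-pass: midpoint residual

def matvec(m, v):
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]

c_fine = [2.0, 4.0, 3.5]                       # c^{j+1}
c_coarse = matvec(A, c_fine)                   # Eq (9): c^j = A^j c^{j+1}
d = matvec(B, c_fine)                          # Eq (9): d^j = B^j c^{j+1}
rec = [p + q for p, q in                       # Eq (10): P^j c^j + Q^j d^j
       zip(matvec(P, c_coarse), matvec(Q, d))]
print(c_coarse, d, rec)  # [2.0, 4.0] [0.5] [2.0, 4.0, 3.5]
```

The reconstruction is exact because $(\mathbf{A};\mathbf{B})$ was taken as the inverse of $(\mathbf{P}\,|\,\mathbf{Q})$; note also that the single detail coefficient is exactly the midpoint prediction error, which anticipates the lifting view of Section 2.2.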

#### **2.2 Lifting-based wavelet transform**

As explained earlier, the filter-bank implementation of multiresolution analysis is quite complex in the sense that the computation of the analysis filters involves the computationally intensive inversion of large subdivision matrices. In this context, the lifting-based wavelet implementation (Schröder & Sweldens, 1995) provides a low-complexity construction of multiresolution methods. In lifting-based multiresolution analysis, each scaling function $\phi_i^j$ of the $j$th level exists so that $\{\phi_i^j \mid i \in M^j\}$ is a Riesz basis of $V^j(M^0)$ (Schröder & Sweldens, 1995). The refinement relation for the scaling functions is then:

$$\phi_i^j = \sum_l p_{i,l}^j \cdot \phi_l^{j+1} \,, \tag{11}$$

where $l$ runs over the index set defining the linear combination of scaling functions, and the $p_{i,l}^j$ form the entries of a matrix similar to $\mathbf{P}^j$. A similar refinement relation as Eq (11) is also defined for the wavelet functions, i.e., each wavelet function $\psi_k^j$ exists so that $\{\psi_k^j \mid k \in K^j\}$ is a Riesz basis of $W^j(M^0)$:

$$\psi_k^j = \sum_l q_{k,l}^j \cdot \phi_l^{j+1} \,. \tag{12}$$

Optimized Scalable Wavelet-Based Codec Designs for Semi-Regular 3D Meshes 573

prediction can be achieved if the prediction process is adapted to the local mesh geometry. Efficient prediction results in smaller energy of wavelet coefficients and hence an improved compression performance of the mesh coding system. To reverse the prediction operation, the decoder needs to know the weights used by the encoder for the prediction of each vertex *<sup>p</sup> c* . Since additional rate (compared to classical Butterfly) needs to be spent for coding the prediction weights, the total compression efficiency in the geometry adaptive case is a compromise between the bitrate saved due to the efficient prediction and the extra bitrate

*<sup>p</sup>* <sup>1</sup> *<sup>c</sup> <sup>c</sup>*

<sup>7</sup> *c* <sup>8</sup> *c* <sup>4</sup> *c*

In the following, a finite set of prediction filters is proposed in the context of spatiallyadaptive wavelet transforms (SAWT) (Denis et al., 2010a). The idea is to use one filter out of this set which best suits the geometry around the vertex to be predicted and which results in the smallest prediction error. A careful application of such an adaptive approach will provide an average rate gain if the reduction in the bitrate due to better prediction

In a first step, the input semi-regular mesh is segmented into regions as follows. Let *Brs* , denote the bounding box of the input semi-regular mesh, where *r x B BB* , , *y z* and *s sss xyz* , , represent the coordinates of the top-left corner and the size vector, respectively. Considering the bounding box as the root cell, each cell on a certain tree level is recursively split into eight equally sized sub-cells to create the next level of the octree. This recursive splitting continues until the number of vertices in the highest-level cells are

For each region *k* , the wavelet analysis is performed by selecting one of the six candidates

Loop

Hybrid of and

Note that the above set of filters is defined using a mixture of Butterfly, Loop and midpoint

dominates the extra bitrate needed to signal the filter type to the decoder.

 

11 1 1 12 34 5678 <sup>2</sup> <sup>8</sup> <sup>16</sup> <sup>111</sup> 2 12 34 5678 <sup>248</sup>

*f cc cc cccc f cc cc cccc*

edge anti edge

<sup>1</sup> 6 1234 <sup>4</sup> 4 5

*f cccc f f*

 

 

<sup>1</sup> 4 12 <sup>2</sup> <sup>1</sup> 5 34 <sup>2</sup>

*f cc f cc*

<sup>3</sup> <sup>1</sup> 3 12 34 8 8

*f cc cc*

<sup>5</sup> *<sup>c</sup>* <sup>3</sup> *<sup>c</sup>* <sup>6</sup> *<sup>c</sup>*

2 *c*

. This way, the semi-regular mesh is divided into

Butterfly

Modified Butterfly

needed for signaling the prediction weights.

Fig. 3. Butterfly footprint on an edge.

smaller than a user-defined threshold

filters given below:

subdivision schemes.

regions of approximately the same size – see Fig. 4.

*<sup>j</sup> K* and *<sup>j</sup> M* are disjoint sets and they jointly form the scaling function index set of the next higher level, i.e., *j jj* <sup>1</sup> *M M K* . The lifting-based forward decomposition is expressed by the following relations (Schröder & Sweldens, 1995):

$$\begin{aligned} \forall i \in M^j: \quad &c\_i^j = c\_i^{j+1} \Rightarrow \text{subsample} \\ \forall k \in K^j: \quad &d\_k^j = c\_k^{j+1} - \sum\_{i \in M^j} a\_i \cdot c\_i^j \Rightarrow \text{prediction} \\ \forall k \in K^j: \begin{cases} c\_1^j = c\_1^j + \vec{a}\_1 \cdot d\_k^{-j} \\ c\_2^j = c\_2^j + \vec{a}\_2 \cdot d\_k^{-j} \end{cases} \Rightarrow \text{update} \end{aligned} \tag{13}$$

In the forward transform, the first step is to produce a lower-resolution mesh *<sup>j</sup> M* starting from a higher-resolution version *<sup>j</sup>*<sup>1</sup> *<sup>M</sup>* . The wavelet coefficient *<sup>j</sup> <sup>k</sup> d* is the *prediction error* when a high-resolution vertex *<sup>j</sup>*<sup>1</sup> *<sup>k</sup> c* is predicted based on its low-resolution neighborhood in *<sup>j</sup> M* . After the prediction, an *update* step is used to modify the low resolution mesh *<sup>j</sup> M* . The *update* step is carried out on a pair 1 2 {,} *c c* of low-resolution vertices joined by a parent edge (Schröder & Sweldens, 1995) using the update weights 1 2 {,} *a a* . In general, the prediction and update weights only depend on the connectivity with respect to the vertex to be predicted. However, specific multiresolution analyses for which the weights depend on the specific resolution level and the underlying geometry can be also constructed (more details are given in Section 2.3).

The inverse transform can be formulated by following the forward-transform steps in the reverse order, i.e.:

$$\begin{aligned} \forall k \in K^j: \quad &\begin{cases} c_1^j = c_1^j - \tilde{a}_1 \cdot d_k^j \\ c_2^j = c_2^j - \tilde{a}_2 \cdot d_k^j \end{cases} &&\Rightarrow \text{inverse update} \\ \forall k \in K^j: \quad &c_k^{j+1} = d_k^j + \sum_{i \in M^j} a_i \cdot c_i^j &&\Rightarrow \text{inverse predict} \\ \forall i \in M^j: \quad &c_i^{j+1} = c_i^j &&\Rightarrow \text{inverse subsample} \end{aligned} \tag{14}$$
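The mechanics of Eqs. (13)–(14) can be sketched on a one-dimensional periodic signal, where the even samples play the role of the coarse set $M^j$ and the odd samples of $K^j$. The midpoint prediction weights $a_i = 1/2$ and update weights $\tilde{a}_1 = \tilde{a}_2 = 1/4$ used below are illustrative choices for the 1-D analogue, not the Butterfly weights of the mesh transform:

```python
import numpy as np

def forward_lift(c):
    """One lifting level (Eq. 13) on a periodic 1-D signal of even length:
    subsample, predict each odd sample from its two coarse neighbours,
    then update the two parents of each detail coefficient."""
    c = np.asarray(c, dtype=float)
    n = len(c) // 2
    coarse = c[0::2].copy()                      # subsample: c_i^j = c_i^{j+1}
    d = np.empty(n)
    for k in range(n):                           # predict: d_k = c_{2k+1} - sum_i a_i c_i
        d[k] = c[2 * k + 1] - 0.5 * (coarse[k] + coarse[(k + 1) % n])
    for k in range(n):                           # update the parent pair of each d_k
        coarse[k] += 0.25 * d[k]
        coarse[(k + 1) % n] += 0.25 * d[k]
    return coarse, d

def inverse_lift(coarse, d):
    """Eq. (14): undo the update, then the prediction, then the subsampling."""
    coarse = np.asarray(coarse, dtype=float).copy()
    n = len(d)
    for k in range(n):                           # inverse update
        coarse[k] -= 0.25 * d[k]
        coarse[(k + 1) % n] -= 0.25 * d[k]
    c = np.empty(2 * n)
    c[0::2] = coarse                             # inverse subsample
    for k in range(n):                           # inverse predict
        c[2 * k + 1] = d[k] + 0.5 * (coarse[k] + coarse[(k + 1) % n])
    return c
```

Because the inverse simply applies the same three steps in reverse order with opposite signs, reconstruction is exact for any choice of prediction and update weights, which is the key property exploited by lifting.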

#### **2.3 Spatially Adaptive Wavelet Transform (SAWT)**

As mentioned earlier, lifting-based transforms generally employ fixed prediction weights, independent of the spatial position and geometry around the vertex to be predicted. A simple observation reveals that a better prediction can result from adapting the prediction to the underlying geometry of the mesh. This argument is explained with a simple example: Fig. 3, referring to the position variable of the vertices, shows a scenario where the vertex to be predicted $c_p$ lies on the straight line joining the vertex pair $c_1, c_2$, while the remaining six coarser vertices $\{c_i\}_{i=3}^{8}$ lie on two different planes. In this situation, a prediction function for $c_p$ involving all eight coarser vertices will not be optimal, and a better prediction could result from using $c_1$ and $c_2$ only. This is logical, since $c_p$ lies on the edge formed by the vertex pair $c_1, c_2$ and is geometrically more correlated to these two vertices. Thus, an efficient prediction can be achieved if the prediction process is adapted to the local mesh geometry. Efficient prediction results in smaller energy of the wavelet coefficients and hence an improved compression performance of the mesh coding system. To reverse the prediction operation, the decoder needs to know the weights used by the encoder for the prediction of each vertex $c_p$. Since additional rate (compared to classical Butterfly) needs to be spent for coding the prediction weights, the total compression efficiency in the geometry-adaptive case is a compromise between the bitrate saved due to the efficient prediction and the extra bitrate needed for signaling the prediction weights.

Fig. 3. Butterfly footprint on an edge.



In the following, a finite set of prediction filters is proposed in the context of spatially-adaptive wavelet transforms (SAWT) (Denis et al., 2010a). The idea is to use the one filter out of this set which best suits the geometry around the vertex to be predicted and which results in the smallest prediction error. A careful application of such an adaptive approach will provide an average rate gain if the reduction in bitrate due to better prediction dominates the extra bitrate needed to signal the filter type to the decoder.

In a first step, the input semi-regular mesh is segmented into regions as follows. Let $B(r, s)$ denote the bounding box of the input semi-regular mesh, where $r = (B_x, B_y, B_z)$ and $s = (s_x, s_y, s_z)$ represent the coordinates of the top-left corner and the size vector, respectively. Considering the bounding box as the root cell, each cell on a certain tree level is recursively split into eight equally sized sub-cells to create the next level of the octree. This recursive splitting continues until the number of vertices in the highest-level cells is smaller than a user-defined threshold $\tau$. This way, the semi-regular mesh is divided into regions of approximately the same size – see Fig. 4.
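A minimal sketch of this octree segmentation is given below; the function name and the leaf-handling details are our own, and the bounding box $B(r, s)$ is computed from the vertex data rather than supplied:

```python
import numpy as np

def octree_partition(vertices, tau):
    """Octree segmentation sketch: recursively split the bounding box into
    eight equal sub-cells until a cell holds fewer than `tau` vertices.
    Returns a list of vertex-index arrays, one per non-empty leaf region."""
    verts = np.asarray(vertices, dtype=float)
    regions = []

    def split(idx, r, s):
        # stop when the cell is small enough (or geometrically degenerate)
        if len(idx) < tau or s.max() < 1e-12:
            if len(idx):
                regions.append(idx)
            return
        half = s / 2.0
        centre = r + half
        pts = verts[idx]
        for bits in range(8):                       # the 8 sub-cells
            offset = np.array([bits & 1, (bits >> 1) & 1, (bits >> 2) & 1])
            mask = np.ones(len(idx), dtype=bool)
            for axis in range(3):
                if offset[axis]:
                    mask &= pts[:, axis] >= centre[axis]
                else:
                    mask &= pts[:, axis] < centre[axis]
            split(idx[mask], r + half * offset, half)

    r0 = verts.min(axis=0)                          # bounding-box corner r
    s0 = verts.max(axis=0) - r0                     # size vector s
    split(np.arange(len(verts)), r0, s0)
    return regions
```

Each vertex falls in exactly one sub-cell at every level (strict/non-strict comparisons against the cell centre), so the leaves form a disjoint cover of the vertex set, as required for the per-region filter selection that follows.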

For each region $k$, the wavelet analysis is performed by selecting one of the six candidate filters given below:

$$\begin{aligned} f_1 &= \frac{1}{2}(c_1 + c_2) + \frac{1}{8}(c_3 + c_4) - \frac{1}{16}(c_5 + c_6 + c_7 + c_8) &&\Rightarrow \text{(Butterfly)}\\ f_2 &= \frac{1}{2}(c_1 + c_2) + \frac{1}{4}(c_3 + c_4) - \frac{1}{8}(c_5 + c_6 + c_7 + c_8) &&\Rightarrow \text{(modified Butterfly)}\\ f_3 &= \frac{3}{8}(c_1 + c_2) + \frac{1}{8}(c_3 + c_4) &&\Rightarrow \text{(Loop)}\\ f_4 &= \frac{1}{2}(c_1 + c_2) &&\Rightarrow \text{(edge)}\\ f_5 &= \frac{1}{2}(c_3 + c_4) &&\Rightarrow \text{(anti-edge)}\\ f_6 &= \frac{1}{4}(c_1 + c_2 + c_3 + c_4) &&\Rightarrow \text{(hybrid of } f_4 \text{ and } f_5) \end{aligned}$$

Note that the above set of filters is defined using a mixture of Butterfly, Loop and midpoint subdivision schemes.
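Assuming the eight stencil vertices are ordered as in Fig. 3 ($c_1, c_2$ the edge pair, $c_3, c_4$ the opposite pair, $c_5$–$c_8$ the wing vertices), the six candidate predictors can be written as:

```python
import numpy as np

# Candidate prediction filters of the SAWT; c is an (8, 3) array holding the
# Butterfly stencil vertices c_1 ... c_8 (ordering per Fig. 3 is assumed).
FILTERS = {
    1: lambda c: (c[0]+c[1])/2 + (c[2]+c[3])/8 - (c[4]+c[5]+c[6]+c[7])/16,  # Butterfly
    2: lambda c: (c[0]+c[1])/2 + (c[2]+c[3])/4 - (c[4]+c[5]+c[6]+c[7])/8,   # modified Butterfly
    3: lambda c: 3*(c[0]+c[1])/8 + (c[2]+c[3])/8,                           # Loop
    4: lambda c: (c[0]+c[1])/2,                                             # edge
    5: lambda c: (c[2]+c[3])/2,                                             # anti-edge
    6: lambda c: (c[0]+c[1]+c[2]+c[3])/4,                                   # hybrid of f4, f5
}

def predict(c, l):
    """Prediction of c_p from its eight coarser neighbours with filter f_l."""
    return FILTERS[l](np.asarray(c, dtype=float))
```

A quick sanity check: the weights of every filter sum to one, so a locally constant (flat) neighbourhood is predicted exactly, and $f_4$ reproduces the edge-midpoint scenario of Fig. 3.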

Optimized Scalable Wavelet-Based Codec Designs for Semi-Regular 3D Meshes 575


Fig. 4. Mesh partitioning for $\tau = 400$. The red and green patches indicate different regions for which different prediction filters will be selected.

Similar to (Chang & Girod, 2006), a filter candidate for a particular region $k$ in $M^J$ is chosen in an optimal distortion-rate (D-R) manner. More specifically, a predictor for each region $k$ is selected such that the following Lagrangian cost function is minimized:

$$A_{M^J,k} = \underset{l\in\{1,2,\ldots,6\}}{\arg\min} \left\{ E_k \left[ \left( c_p - \tilde{c}_{p,f_l} \right)^2 \right] + \lambda \cdot R_k \left( f_l \right) \right\} \tag{15}$$

where $R_k(f_l)$ denotes the rate necessary for encoding the filter index $l$ used for prediction in region $k$.
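The selection rule of Eq. (15) amounts to a per-region argmin over a handful of candidates. A sketch, with hypothetical container types for the per-filter predictions and signaling rates, might look as follows:

```python
import numpy as np

def select_filter(actual, predictions, rate_bits, lam):
    """Pick the filter index minimising the Lagrangian cost of Eq. (15):
    E[(c_p - c~_{p,f_l})^2] + lambda * R_k(f_l).

    actual      : (V, 3) true vertex positions of region k
    predictions : dict {l: (V, 3) predicted positions with filter f_l}
    rate_bits   : dict {l: bits needed to signal filter index l}
    lam         : Lagrangian multiplier trading distortion against rate"""
    best_l, best_cost = None, np.inf
    for l, pred in predictions.items():
        mse = np.mean(np.sum((actual - pred) ** 2, axis=1))  # E_k[.]
        cost = mse + lam * rate_bits[l]
        if cost < best_cost:
            best_l, best_cost = l, cost
    return best_l
```

Note how the multiplier steers the compromise discussed above: a small `lam` favours the best-predicting filter regardless of signaling cost, while a large `lam` favours cheaply signaled filters.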

### **3. Scalable quantization of wavelet coefficients**

In scalable mesh compression, the wavelet coefficients in the subbands are quantized using a generic family of embedded deadzone scalar quantizers (EDSQ) (Taubman & Marcellin, 2001), in which every wavelet coefficient $X$ is quantized to:

$$q_{\xi,n} = \begin{cases} \operatorname{sign}(X) \cdot \left\lfloor \dfrac{|X|}{\Delta_n} + \xi_n \right\rfloor & \text{if } \dfrac{|X|}{\Delta_n} + \xi_n > 0 \\ 0 & \text{otherwise} \end{cases} \tag{16}$$

where $n$ denotes the quantization level, and $\xi_n$ and $\Delta_n$ denote the deadzone control parameter and the step size for any $n \geq 0$, respectively, with $\Delta_n = 2^n \Delta_0$ and $\xi_n = 2^{-n} \xi_0$, where $\Delta_0$ and $\xi_0$ are the parameters for the highest-rate quantizer ($n = 0$). Note that $\xi_0 = 0$ corresponds to the well-known SAQ (Shapiro, 1993), in which the deadzone size is twice the step size $\Delta_n$ for any $n$.
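A direct transcription of Eq. (16) with this parameter scaling might look as follows; the reconstruction rule in `edsq_dequantize` (mid-point of the decoded cell) is one common choice, not prescribed by the chapter, and the default values of $\Delta_0$ and $\xi_0$ are illustrative:

```python
import numpy as np

def edsq_quantize(X, n, delta0=1.0, xi0=0.25):
    """Embedded deadzone scalar quantizer of Eq. (16) at level n,
    with Delta_n = 2**n * delta0 and xi_n = 2**(-n) * xi0."""
    delta_n = (2.0 ** n) * delta0
    xi_n = (2.0 ** -n) * xi0
    t = np.abs(X) / delta_n + xi_n
    return np.where(t > 0, np.sign(X) * np.floor(t), 0.0)

def edsq_dequantize(q, n, delta0=1.0, xi0=0.25):
    """Mid-point reconstruction inside the decoded cell (one common choice)."""
    delta_n = (2.0 ** n) * delta0
    xi_n = (2.0 ** -n) * xi0
    return np.where(q == 0, 0.0,
                    np.sign(q) * (np.abs(q) - xi_n + 0.5) * delta_n)
```

With these conventions the deadzone covers $|X| < (1 - \xi_n)\Delta_n$ on each side, i.e. it has total width $2\delta_n\Delta_n$ with $\delta_n = 1 - \xi_n$, which is the quantity that reappears in the D-R derivations below.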

#### **3.1 Wavelet coefficient histogram**

In general, the observed histogram $H_k^j$ of the $k$th, $k \in \{x, y, z\}$, coordinate component of the $j$th wavelet subband is symmetric around its center of mass, which is often zero or very close to zero. Moreover, the histogram is peaky around the mean, and the frequency of occurrence decays as the magnitude of the coefficient's component increases. Fig. 5 depicts the observed histograms of the $\mathbf{d}^{J-3}$ subband of *Rabbit* (*non-normal* mesh) and *Dino* (*normal* mesh), obtained using the classical Butterfly transform. It is observed experimentally that, in general, $\sigma^2(H_k^{j-1}) > \sigma^2(H_k^j)$ for $1 \leq j \leq J$.

In the literature, the observed histogram of any component of a wavelet subband is generally modeled using a zero-mean generalized Gaussian (GG) distribution (Mallat, 1989), expressed by:

$$\forall x \in \mathbb{R} \quad f_{GG}(x, \sigma, \alpha) = \frac{\alpha\,\nu}{2\,\Gamma(1/\alpha)}\, e^{-(\nu |x|)^{\alpha}},\tag{17}$$

where $\alpha \in (0, 2]$ is the shape control parameter, and $\nu > 0$ is the scaling factor, given by $\nu = \sigma^{-1}\sqrt{\Gamma(3/\alpha)/\Gamma(1/\alpha)}$, where $\Gamma$ is the Gamma function. Note that, for $\alpha = 1$, Eq. (17) transforms into a zero-mean Laplacian probability density function (PDF) given by:

$$\forall x \in \mathbb{R} \quad f_L(x, \sigma) = \frac{1}{\sigma\sqrt{2}}\, e^{-\frac{\sqrt{2}}{\sigma}|x|} = \frac{\lambda}{2}\, e^{-\lambda |x|}, \text{ where } \lambda = \frac{\sqrt{2}}{\sigma},\tag{18}$$

and for $\alpha = 2$, Eq. (17) corresponds to a zero-mean Gaussian PDF.
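For reference, the two densities can be evaluated directly; the reduction of Eq. (17) to Eq. (18) at $\alpha = 1$, and to the Gaussian at $\alpha = 2$, serve as consistency checks:

```python
import math

def f_gg(x, sigma, alpha):
    """Zero-mean generalized Gaussian PDF of Eq. (17), with the scaling
    factor nu = (1/sigma) * sqrt(Gamma(3/alpha) / Gamma(1/alpha))."""
    nu = math.sqrt(math.gamma(3.0 / alpha) / math.gamma(1.0 / alpha)) / sigma
    return (alpha * nu / (2.0 * math.gamma(1.0 / alpha))
            * math.exp(-(nu * abs(x)) ** alpha))

def f_laplace(x, sigma):
    """Zero-mean Laplacian PDF of Eq. (18), with lambda = sqrt(2)/sigma."""
    lam = math.sqrt(2.0) / sigma
    return lam / 2.0 * math.exp(-lam * abs(x))
```

At $\alpha = 1$, $\nu = \sqrt{\Gamma(3)/\Gamma(1)}/\sigma = \sqrt{2}/\sigma = \lambda$, so the two functions coincide term by term, as claimed.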

Although GG distributions closely approximate the observed histogram of wavelet coefficients, only approximate rate and distortion expressions for a uniformly quantized GG random variable are known (Fraysse et al., 2008). The extension of these expressions to embedded quantization is not evident, as the rate and distortion functions for such distributions are not easily tractable and can only be computed numerically. Moreover, computing these quantities gets very cumbersome due to the slow numerical integration of expressions involving a GG probability function, especially for $\alpha < 1$.

#### **3.2 Proposed Laplacian mixture model**

In order to avoid the aforementioned drawbacks of GG distributions, we propose a simple Laplacian mixture (LM) model which not only gives an easy closed-form derivation of the distortion and rate quantities but also better approximates the observed histogram of wavelet coefficients in the majority of cases. The proposed LM is a linear combination of two Laplacian PDFs, i.e.,

$$\forall x \in \mathbb{R} \quad f_{LM}(x) = \beta \cdot f_L(x, \sigma_1) + (1 - \beta) \cdot f_L(x, \sigma_2)\,. \tag{19}$$

Note that $f_{LM}(x)$ indeed defines a probability function, as $\int_{-\infty}^{\infty} f_{LM}(x)\,dx = 1$.

The LM model is fitted over the observed data using the expectation-maximization (EM) algorithm (Dempster et al., 1977) in order to determine the parameters $\sigma_1$, $\sigma_2$ and $\beta$. The E-step in the EM process calculates two responsibility factors

$$r_1(i) = \frac{\beta \cdot f_L\left(x_i, \sigma_1\right)}{\beta \cdot f_L\left(x_i, \sigma_1\right) + \left(1 - \beta\right) \cdot f_L\left(x_i, \sigma_2\right)}, \quad r_2(i) = \frac{\left(1 - \beta\right) \cdot f_L\left(x_i, \sigma_2\right)}{\beta \cdot f_L\left(x_i, \sigma_1\right) + \left(1 - \beta\right) \cdot f_L\left(x_i, \sigma_2\right)},$$


of each observation $x_i$, $i = 1, \ldots, N$, and the M-step updates the parameters to be estimated, as:

$$\sigma_m = \sqrt{2}\,\frac{\sum_{i=1}^{N} r_m(i) \cdot \left| x_i \right|}{\sum_{i=1}^{N} r_m(i)}, \quad m = 1, 2, \quad \text{and} \quad \beta = \frac{1}{N}\sum_{i=1}^{N} r_1(i)\,.$$

The E- and M-steps are executed in tandem until the algorithm achieves the minimum *Kullback-Leibler* (KL) distance between the observed and model histograms. A better convergence rate is achieved by the initialization condition $\sigma_1^2 = 0.5\,\sigma_E^2$, $\sigma_2^2 = 2\,\sigma_E^2$ and $\beta = 0.9$, where $\sigma_E^2$ is the estimated data variance. Histogram fitting for GG distributions is done using the brute-force method, where the parameters $\sigma$ and $\alpha$ are exhaustively searched for a minimum KL distance.
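A compact sketch of this EM loop, using the stated initialization but a fixed iteration count in place of a KL-based stopping rule, is:

```python
import numpy as np

def fit_lm(x, iters=200):
    """EM fitting of the two-component Laplacian mixture of Eq. (19).
    Returns (sigma1, sigma2, beta). Initialization per the text:
    sigma1^2 = 0.5*var, sigma2^2 = 2*var, beta = 0.9."""
    x = np.asarray(x, dtype=float)

    def f_l(x, s):                      # Laplacian PDF of Eq. (18)
        lam = np.sqrt(2.0) / s
        return lam / 2.0 * np.exp(-lam * np.abs(x))

    var = x.var()
    s1, s2, beta = np.sqrt(0.5 * var), np.sqrt(2.0 * var), 0.9
    for _ in range(iters):
        p1 = beta * f_l(x, s1)          # E-step: responsibilities r_1, r_2
        p2 = (1.0 - beta) * f_l(x, s2)
        r1 = p1 / (p1 + p2)
        r2 = 1.0 - r1
        # M-step: sigma_m = sqrt(2) * weighted mean of |x|, beta = mean(r_1)
        s1 = np.sqrt(2.0) * np.sum(r1 * np.abs(x)) / np.sum(r1)
        s2 = np.sqrt(2.0) * np.sum(r2 * np.abs(x)) / np.sum(r2)
        beta = r1.mean()
    return s1, s2, beta
```

The M-step formula follows from the Laplacian maximum-likelihood estimate $\sigma = \sqrt{2}\,E|x|$, weighted by the responsibilities.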

#### **3.3 Distortion-Rate (D-R) function**

Closed-form expressions for the output distortion $D_L$ and the output rate $R_L$ of a Laplacian source quantized using an $n$-level EDSQ are derived in the Appendix. In this section, we derive the D-R function for our proposed LM model. Since the distortion is a linear function of the source PDF, the output distortion $D_{LM}$ of the LM PDF for any quantization level $n$ can be written as:

$$D_{LM}\left(Q_{\delta_n,\Delta_n}\right) = \beta \cdot D_L\left(Q_{\delta_n,\Delta_n}, \sigma_1\right) + \left(1 - \beta\right) \cdot D_L\left(Q_{\delta_n,\Delta_n}, \sigma_2\right), \text{ with } \delta_n = 1 - \xi_n. \tag{20}$$

This does not hold for the output rate $R_{LM}$, since the entropy involves the non-linear $\log(\cdot)$ function. Instead, $R_{LM}$ can be computed as an infinite sum:

$$P_0 = 2\int_0^{\delta_n \Delta_n} f_{LM}(x)\,dx\,, \quad P_k = \int_{(k-1+\delta_n)\Delta_n}^{(k+\delta_n)\Delta_n} f_{LM}(x)\,dx\,, \quad k = 1, 2, 3, \ldots, \ \text{and}\ \ R_{LM}\left(Q_{\delta_n,\Delta_n}\right) = -\sum_{k=-\infty}^{\infty} P_k \log_2 P_k$$

where $P_k$ denotes the probability mass of the $k$th quantization cell ($k = 0$ corresponds to the deadzone cell). Since the LM model is symmetric around its mean, $P_k = P_{-k}$. Note that the probability mass function (PMF) can be computed exactly due to the possibility of analytical integration of $f_{LM}(x)$. For the GG distribution, however, only numerical integration is possible.
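Because each Laplacian component integrates in closed form ($\int_a^b \frac{\lambda}{2} e^{-\lambda x}\,dx = \frac{1}{2}(e^{-\lambda a} - e^{-\lambda b})$ for $0 \le a < b$), the cell masses and the rate of the LM source can be computed without numerical quadrature. The sketch below truncates the infinite sum at a large `kmax`, which is our own implementation detail:

```python
import numpy as np

def lm_cell_masses(sigma1, sigma2, beta, delta_n, step, kmax=200):
    """Analytic PMF of the EDSQ-quantized Laplacian mixture of Eq. (19)."""
    lam1, lam2 = np.sqrt(2.0) / sigma1, np.sqrt(2.0) / sigma2

    def tail(lam, a):                   # one-sided Laplacian mass P(X > a), a >= 0
        return 0.5 * np.exp(-lam * a)

    def mass(a, b):                     # LM probability of [a, b], 0 <= a < b
        return (beta * (tail(lam1, a) - tail(lam1, b))
                + (1.0 - beta) * (tail(lam2, a) - tail(lam2, b)))

    p0 = 2.0 * mass(0.0, delta_n * step)                      # deadzone cell
    pk = np.array([mass((k - 1 + delta_n) * step, (k + delta_n) * step)
                   for k in range(1, kmax + 1)])
    return p0, pk

def lm_rate(sigma1, sigma2, beta, delta_n, step):
    """Entropy R_LM = -sum_k P_k log2 P_k, exploiting P_k = P_{-k}."""
    p0, pk = lm_cell_masses(sigma1, sigma2, beta, delta_n, step)
    probs = np.concatenate(([p0], pk, pk))    # cells k = 0, +/-1, +/-2, ...
    probs = probs[probs > 0]
    return float(-np.sum(probs * np.log2(probs)))
```

Since the Laplacian tails decay exponentially, the truncation error at `kmax = 200` is negligible for any practical step size, and the masses sum to one as a check on the closed-form integrals.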

#### **3.4 Model validation**

This section demonstrates that the proposed LM model is able to approximate the observed histogram and the observed D-R function of 3D wavelet coefficients more accurately compared to the commonly utilized GG distributions. For comparison purposes, results for the single Laplacian case (a degenerate form of $f_{LM}$ with a single component) are also reported.


Fig. 5. Probability function fitting over the observed histogram (Exp) for the $\mathbf{d}^{J-3}$ *normal component* of Rabbit (left) and Dino (right). SL is used as the abbreviation of the single Laplacian PDF.

Fig. 6. Modeled and observed D-R functions for the histograms of Fig. 5. Rate is taken as bits per spatial coordinate component.

Fig. 5 illustrates that the proposed mixture model provides a better-fitting probability function for the observed histogram compared to the Laplacian and GG distributions. This is especially true for the mid-range positive and negative coefficient values – see Fig. 5. For the Rabbit mesh, the LM gives only a slightly better fit than the other two models. However, for Dino, the LM can clearly model the fast decay of the observed histogram more accurately than the GG. The Laplacian PDF in this case gives only a very coarse approximation of the observed histogram.



Table 1. *KL* (%*ME*, the modeling error as defined in Eq. (21)) for the *normal* (*Nor*) and the two *tangential* components (*Tan 1*, *Tan 2*), averaged over the five subbands. *U-BF* (unlifted Butterfly), *L-BF* (lifted Butterfly).

Fig. 6 plots the observed and model D-R curves for the same subband as the one used in Fig. 5. For Rabbit, the LM D-R almost completely overlaps the observed D-R curve. In both cases, the D-R function of the proposed LM model follows the experimental D-R curve more closely than the other two models.

In Table 1, the average *KL* divergence results for the Laplacian, GG and LM models for two non-normal (Venus, Rabbit) and three normal (Dino, Skull, Skrewdriver) meshes are shown. Each of the three coordinate components is considered separately. For each trial of Table 1, the average is taken over the five highest-resolution subbands. For the large majority of cases, the LM model gives a better fitting of the observed histogram than the competing GG model. Note that the Laplacian model always gives the worst fitting results. Also, the LM model gives equally good fitting for both the *normal* (*Nor*) and *tangential* (*Tan 1* and *Tan 2*) components. Superior histogram-fitting results of our proposed model are also observed for the SAWT of Section 2.3. These results are not reported here due to lack of space.

In Table 1, the percentage modeling error *ME*(%) relative to the *KL* divergence is shown in parentheses next to each table entry. The *ME*(%) is defined in order to gauge the D-R accuracy of the proposed mixture model with respect to the other two models. *ME*(%) is defined as:

$$ME(\%) = \frac{\int_R \left| D_M\left(R\right) - D_E\left(R\right) \right| dR}{\int_R \max\left\{ D_M\left(R\right), D_E\left(R\right) \right\} dR} \times 100\,, \tag{21}$$

where $D_M(R)$ and $D_E(R)$ denote the model and the experimentally observed D-R functions, respectively.
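In practice both D-R curves are available only at sampled rate points, so Eq. (21) can be evaluated with a simple trapezoidal rule; the function below is a sketch under that assumption, with the sample grids as hypothetical inputs:

```python
import numpy as np

def modeling_error(rates, d_model, d_exp):
    """ME(%) of Eq. (21): relative area between the model D-R curve D_M and
    the experimental curve D_E, both sampled on the common rate grid `rates`,
    evaluated by trapezoidal integration."""
    r = np.asarray(rates, dtype=float)
    dm = np.asarray(d_model, dtype=float)
    de = np.asarray(d_exp, dtype=float)

    def trapz(y):                       # plain trapezoidal rule over grid r
        return float(np.sum((y[1:] + y[:-1]) * np.diff(r)) / 2.0)

    return 100.0 * trapz(np.abs(dm - de)) / trapz(np.maximum(dm, de))
```

Normalizing by the pointwise maximum of the two curves bounds the metric to $[0, 100]$, with 0% meaning the model D-R curve overlaps the observed one exactly.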


Table 2. *KL* (%*ME*) for three resolution subbands averaged over the three coordinate components*.* 
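As a concrete illustration, the modeling error of Eq. (21) can be computed from two D-R curves sampled at the same rate points. The sketch below assumes the max-normalized form of the metric; the function name and the sample distortion values are illustrative, not taken from the reported experiments.

```python
import numpy as np

def modeling_error_percent(d_observed, d_model):
    """ME(%): largest absolute deviation between the observed and the
    model D-R curves, normalized by the largest observed distortion
    and expressed as a percentage."""
    d_observed = np.asarray(d_observed, dtype=float)
    d_model = np.asarray(d_model, dtype=float)
    return 100.0 * np.max(np.abs(d_observed - d_model)) / np.max(d_observed)

# Distortions sampled at four rate points (illustrative values)
d_obs = [8.0, 4.0, 2.0, 1.0]
d_mod = [7.6, 4.1, 2.1, 1.0]
me = modeling_error_percent(d_obs, d_mod)  # 100 * 0.4 / 8.0 = 5.0
```

A small *ME* thus indicates that the model D-R function tracks the experimental curve closely over the whole rate range, not merely at a single operating point.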

From Table 1, it is evident that on average the proposed LM model also performs better than the GG and Laplacian models in the *ME* sense. Better *ME* results are also obtained for the SAWT (not reported here). Hence, the proposed LM model, together with the derived D-R function, is a better choice than the competing models for modeling both the histogram and the D-R curve of mesh wavelet coefficients. Note, however, that the best histogram fit in the *KL* sense does not always yield the lowest *ME*.

Table 2 reports the model-validation results for different resolution subbands. For each trial, the average is taken across the three spatial coordinate components. It is observed that the GG model performs slightly better for the low-resolution subbands of some meshes. The observed histograms in such cases are more Gaussian-like, i.e., they have a round top. In general, the LM model has difficulty approximating such round-top histograms due to the peaky nature of each of its Laplacian components; the GG model fits such histograms well, as it reduces to a Gaussian distribution for a shape parameter equal to 2. Nevertheless, the results show that, on average, the LM model outperforms the Laplacian and GG models in the *KL* as well as the *ME* sense.

## **3.5 Optimal embedded quantization**

In this section, conclusions are drawn regarding the optimal EDSQ to be used in scalable wavelet-based coding of meshes. Let *z* denote the ratio between the deadzone size for *n* = 0 (see Eq. (16)) and the step size for *n* ≠ 0 of a general EDSQ. The total average signal-to-noise ratio (*SNR*) difference, used to measure the performance gap between different embedded quantizers, is defined as:

$$\Delta \overline{SNR} = \frac{1}{N} \sum_{R} \left( SNR_{z}(R) - SNR_{z=1}(R) \right),$$

which is computed over a rate range of *N* rate points, where $SNR(R)$ denotes the discrete *SNR*-rate function. The $SNR = 10\log_{10}(\sigma^{2}/D)$ is computed in dB, where *D* is the total distortion in the transform domain. The difference in *SNR* is computed relative to the uniform embedded quantizer (UEQ), i.e., *z* = 1. $\Delta \overline{SNR}$ for five embedded deadzone quantizers is plotted in Fig. 7 over a wide range of standard deviation ratios $\sigma_2/\sigma_1$. In Fig. 7, the commonly observed proportion of 0.9 is considered, as mentioned in Section 3.2.

We determined experimentally that the UEQ is optimal at low standard deviation ratios, namely for $\sigma_2/\sigma_1 < 120$. For $120 < \sigma_2/\sigma_1 < 290$, the quantizer with *z* = 1.5 outperforms all other quantizers. Similarly, *z* = 2 (i.e., the SAQ) performs best in the range $290 < \sigma_2/\sigma_1 < 600$, while *z* = 2.5 performs best for $\sigma_2/\sigma_1 > 600$. In general, small standard deviation ratios, i.e., $\sigma_2/\sigma_1$ close to 1, are observed in non-normal meshes, while higher ratios are observed in normal meshes. These results show that no single embedded quantizer provides the best performance for *all* 3D meshes. However, an optimal quantizer can be determined per wavelet coordinate, based on the corresponding $\sigma_2/\sigma_1$ extracted from the model.

Overall, for $\sigma_2/\sigma_1 < 120$, the difference between the SAQ and the UEQ is significant, and hence the UEQ is the optimal choice. For $\sigma_2/\sigma_1 > 120$, the SAQ is not always the optimum, but it lies close to the optimum.

Fig. 7. $\Delta \overline{SNR}$ for five EDSQs with respect to the UEQ.

Given that the SAQ is closely linked to bit-plane coding and can be implemented using simple binary arithmetic, one concludes that the SAQ is not an optimal, but an acceptable solution for scalable coding of meshes.

## **4. Analysis of wavelet coefficient dependencies**

As with images, parent-children and neighboring wavelet-coefficient dependencies exist in the wavelet-decomposed mesh structure. In Fig. 8 (middle, right), the positions of the wavelet coefficients at different levels of the transform are indicated by white and dark circles. In particular, the wavelet coefficients are in one-to-one correspondence with the edges of the coarser mesh. Each wavelet coefficient has rings of neighboring coefficients lying in the same wavelet subband (see Fig. 8, right). Also, each set of four wavelet coefficients has a parent coefficient at the next coarser resolution (see Fig. 8, middle, right).

Fig. 8. Parent-children and neighboring wavelet coefficients: actual mesh (left); coarser meshes after one (middle) and after two wavelet decomposition levels (right).

Statistical intraband dependencies exist between neighboring coefficients of each resolution level, mainly due to the smoothness of the surface. Wavelet coding paradigms that exploit the intraband dependencies between the wavelet coefficients are known as intraband wavelet codecs; examples include block-based coding techniques (Munteanu et al., 1999a), quadtree coding approaches (Munteanu et al., 1999b), and the EBCOT codec employed in the JPEG-2000 scalable image coding standard (Taubman, 2000).

Statistical dependencies also exist between a parent and its descendants (children), due to the natural decay of the coefficients' magnitudes with increasing frequency. In other words, if a parent coefficient's magnitude is below a certain threshold, there is a high probability that the magnitudes of its descendants are also below this threshold. This corresponds to the so-called zerotree model, first introduced by Shapiro (Shapiro, 1993). Wavelet coding paradigms that exploit the parent-children dependencies are known as interband wavelet codecs.

Finally, there is a third category of coding paradigms that exploits both the interband and intraband statistical dependencies between the wavelet coefficients. These are generally known as composite codecs; EZBC (Hsiang & Woods, 2000) and the ECECOW approach of (Wu, 1997) are typical examples of this category.

In the following, an information-theoretic analysis of the aforementioned coefficient dependencies is presented. Our aim is to single out the type of dependencies that ensures the best compression performance in the context of wavelet-based mesh compression.
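The strength of such coefficient dependencies can be gauged empirically by estimating the mutual information between paired coefficient magnitudes from their joint histogram. The sketch below applies this to synthetic parent/child data with an imposed magnitude-decay dependency; the distributions and parameters are purely illustrative, not measurements on actual meshes.

```python
import numpy as np

def mutual_information(x, y, bins=16):
    """Empirical mutual information (in bits) between two samples,
    estimated from their joint histogram."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of x (column vector)
    py = pxy.sum(axis=0, keepdims=True)   # marginal of y (row vector)
    mask = pxy > 0
    return float(np.sum(pxy[mask] * np.log2(pxy[mask] / (px @ py)[mask])))

rng = np.random.default_rng(0)
parent = np.abs(rng.laplace(size=50_000))
# Child magnitudes correlated with the parent (magnitude decay across scales)
child = 0.5 * parent + np.abs(rng.laplace(scale=0.2, size=50_000))

mi_dep = mutual_information(parent, child)                   # dependent pair
mi_ind = mutual_information(parent, rng.permutation(child))  # dependency destroyed
```

A clearly larger estimate for the dependent pair than for the shuffled one signals exploitable statistical dependency, which is precisely what interband codecs such as the zerotree coder rely on.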
