**1. Introduction**


The analysis and understanding of video sequences is currently a very active research field. Many applications, such as video surveillance, optical motion capture or multimedia, first need to detect the objects moving in a scene filmed by a static camera. This requires the basic operation of separating the moving objects, called "foreground", from the static information, called "background". Many background subtraction methods have been developed (Bouwmans et al. (2010); Bouwmans et al. (2008)), and a recent survey (Bouwmans (2009)) shows that subspace learning models are well suited for background subtraction. Principal Component Analysis (PCA) has been used to model the background by significantly reducing the dimension of the data. To address the lack of robustness of PCA to outliers, different Robust Principal Component Analysis (RPCA) models have recently been developed in the literature. The background sequence is then modeled by a low-rank subspace that can gradually change over time, while the moving foreground objects constitute the correlated sparse outliers. However, authors usually compare their algorithm only with the basic PCA (Oliver et al. (1999)) or with another RPCA model, and the evaluation is not made with the datasets and measures currently used in the field of background subtraction. Considering all of this, we propose to evaluate RPCA models in the context of video surveillance. The contributions of this chapter can be summarized as follows:


The rest of this chapter is organized as follows. In Section 2, we first provide a survey of robust principal component analysis models. In Section 3, we evaluate and compare these models for background subtraction. Finally, the conclusion is given in Section 4.

#### **2. Robust principal component analysis: A review**

In this section, we review the original PCA and five recent RPCA models and their applications in background subtraction:

• Principal Component Analysis (PCA) (Eckart & Young (1936); Oliver et al. (1999))

#### **2.1 Principal component analysis**

Assume that the video is composed of $n$ frames of size $width \times height$. We arrange this training video in a rectangular matrix $A \in \mathbb{R}^{m \times n}$ ($m$ is the total number of pixels per frame): each video frame is vectorized into a column of $A$, so that each row corresponds to a specific pixel and its evolution over time. PCA first decomposes the matrix $A$ into the product $USV^{\top}$, where $S \in \mathbb{R}^{n \times n}$ is a diagonal matrix containing the singular values and $U \in \mathbb{R}^{m \times n}$, $V \in \mathbb{R}^{n \times n}$ contain the singular vectors. Then only the principal components are retained. To solve this decomposition, the following function is minimized (in tensor notation):

$$(S_0, U_0, V_0) = \underset{S,U,V}{\operatorname{argmin}} \sum_{r=1}^{\min(n,m)} \left\| A - S_{kk}\, U_{ik} V_{jk} \right\|_F^2, \quad 1 \le k \le r \quad \text{subj} \quad \begin{cases} U_i^{\top} U_j = V_i^{\top} V_j = 1 & \text{if } i = j \\ S_{ij} = 0 & \text{if } i \neq j \end{cases} \tag{1}$$

This implies that the singular values are sorted in decreasing order and that the singular vectors are mutually orthonormal ($U_0^{\top} U_0 = V_0^{\top} V_0 = I_n$). The solutions $S_0$, $U_0$ and $V_0$ of (1) are not unique. We can define $U_1$ and $V_1$, the set of cardinality $2^{\min(n,m)}$ of all solutions:

$$U_1 = U_0 R, \quad V_1 = R V_0, \quad R_{ij} = \begin{cases} \pm 1 & \text{if } i = j \\ 0 & \text{elsewhere} \end{cases} \tag{2}$$

We choose $k$ (small) principal components:

$$U_{ij} = (U_1)_{ij}, \quad 1 \le j \le k \tag{3}$$

The background is computed as follows:

$$Bg = U U^{\top} v \tag{4}$$

where $v$ is the current frame. The foreground detection is made by thresholding the difference between the current frame $v$ and the reconstructed background image (in Iverson notation):

$$Fg = \left[\, |v - Bg| > T \,\right] \tag{5}$$

where $T$ is a constant threshold.

Results obtained by Oliver et al. (1999) show that PCA provides a robust model of the probability distribution function of the background, but not of the moving objects, as long as these do not contribute significantly to the model. As developed in Bouwmans (2009), this model presents several limitations. The first limitation is that the foreground objects must be small and must not appear in the same location during a long period of the training sequence. The second limitation concerns background maintenance: it is computationally intensive to perform model updating using batch-mode PCA, and without a mechanism of robust analysis, the outliers or foreground objects may be absorbed into the background model. The third limitation is that the application of this model is mostly limited to gray-scale images, since the integration of multi-channel data is not straightforward: it involves a much higher dimensional space and causes additional difficulty in managing the data. Another limitation is that the representation is not multimodal, so various illumination changes cannot be handled correctly. In this context, several robust PCA models can be used to alleviate these limitations.

#### **2.2 RPCA via Robust Subspace Learning**

Torre & Black (2003) proposed Robust Subspace Learning (RSL), a batch robust PCA method that aims at recovering a good low-rank approximation that best fits the majority of the data. RSL solves a nonconvex optimization via alternating minimization, based on the idea of soft-detecting and down-weighting the outliers. Indeed, least-squares reconstruction coefficients can be arbitrarily biased by a single outlier, and a binary outlier process either completely rejects or includes a sample; RSL instead uses a more general analog outlier process that has computational advantages and provides a connection to robust M-estimation. The energy function to minimize is then:

$$(S_0, U_0, V_0) = \underset{S,U,V}{\operatorname{argmin}} \sum_{r=1}^{\min(n,m)} \rho\!\left( A - \mu \mathbf{1}_n^{\top} - S_{kk}\, U_{ik} V_{jk} \right), \quad 1 \le k \le r \tag{6}$$

where $\mu$ is the mean vector and $\rho$ belongs to a particular class of robust $\rho$-functions (Black & Rangarajan (1996)). They use the Geman-McClure error function $\rho(x, \sigma_p) = \frac{x^2}{x^2 + \sigma_p^2}$, where $\sigma_p$ is a scale parameter that controls the convexity of the robust function; similarly, the penalty term associated with the analog outlier process $L_{pi}$ is $(\sqrt{L_{pi}} - 1)^2$. The robustness of De La Torre's algorithm is due to this $\rho$-function. This is confirmed by the presented results, which show that RSL outperforms the standard PCA on scenes with illumination changes and people in various locations.

#### **2.3 RPCA via Principal Component Pursuit**

Candes et al. (2009) achieved Robust PCA by the following decomposition:

$$A = L + S \tag{7}$$

where $L$ is a low-rank matrix and $S$ must be a sparse matrix. The straightforward formulation is to use the $L_0$ norm to minimize the energy function:

$$\underset{L,S}{\operatorname{argmin}} \;\; \operatorname{Rank}(L) + \lambda \|S\|_0 \quad \text{subj} \quad A = L + S \tag{8}$$

where $\lambda$ is an arbitrary balancing parameter. But this problem is NP-hard, and a typical solution might involve a search with combinatorial complexity. To solve it more easily, the natural way is to relax it into a convex program, replacing the rank by the nuclear norm $\|L\|_*$ and the $L_0$ norm by the $L_1$ norm $\|S\|_1$.

<sup>1</sup> http://tfocs.stanford.edu/

<sup>2</sup> http://perception.csl.uiuc.edu/matrix-rank/sample\_code.html
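As a concrete illustration, the PCA background model and the thresholded detection of equations (4) and (5) can be sketched in a few lines of NumPy. The function names and the four-pixel toy data below are hypothetical; note that this sketch subtracts the temporal mean before the SVD, a common practical variant, so that the reconstruction is exact on the toy example.

```python
import numpy as np

def pca_background_model(A, k):
    """Fit a k-component PCA background model from training frames.
    A: m x n matrix with one vectorized frame per column."""
    mu = A.mean(axis=1, keepdims=True)                     # temporal mean background
    U, s, Vt = np.linalg.svd(A - mu, full_matrices=False)  # A - mu ~ U S V'
    return mu, U[:, :k]                                    # keep k principal components

def subtract_background(v, mu, U, T):
    """Fg = [ |v - Bg| > T ] with Bg reconstructed from the subspace."""
    v = v.reshape(-1, 1)
    Bg = mu + U @ (U.T @ (v - mu))                         # background reconstruction
    return (np.abs(v - Bg) > T).ravel()                    # foreground mask

# Toy example: 4-pixel frames, a static background plus one illumination mode.
b = np.array([1.0, 2.0, 3.0, 4.0])     # static background
d = np.array([1.0, 1.0, 0.0, 1.0])     # global illumination direction
A = np.stack([b + c * d for c in (0.0, 1.0, 2.0, 3.0)], axis=1)
mu, U = pca_background_model(A, k=1)

frame = b + 2.0 * d                    # same illumination conditions...
frame[2] += 5.0                        # ...but pixel 2 is occluded by an object
print(subtract_background(frame, mu, U, T=1.0))  # only pixel 2 is flagged as foreground
```

Because the single retained component captures the illumination mode, the global brightness change is absorbed into the background and only the occluding object is detected.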
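For illustration, the convex relaxation of the Principal Component Pursuit problem (8), which minimizes $\|L\|_* + \lambda \|S\|_1$ subject to $A = L + S$, can be solved by a short inexact augmented Lagrange multiplier (ALM) iteration. The following NumPy sketch uses the standard parameter choice $\lambda = 1/\sqrt{\max(m,n)}$; the helper names and the toy data are illustrative only, and production code would rely on publicly available optimized solvers.

```python
import numpy as np

def shrink(X, tau):
    """Soft-thresholding: the proximal operator of the L1 norm."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def pcp_ialm(A, lam=None, tol=1e-7, max_iter=500):
    """Principal Component Pursuit via the inexact ALM method:
    min ||L||_* + lam * ||S||_1  subject to  A = L + S."""
    m, n = A.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))            # standard balancing parameter
    norm2 = np.linalg.norm(A, 2)                  # spectral norm
    Y = A / max(norm2, np.abs(A).max() / lam)     # dual variable initialization
    mu = 1.25 / norm2                             # penalty parameter
    rho = 1.5                                     # penalty growth factor
    S = np.zeros_like(A)
    L = np.zeros_like(A)
    for _ in range(max_iter):
        # L-step: singular value thresholding of A - S + Y/mu
        U, s, Vt = np.linalg.svd(A - S + Y / mu, full_matrices=False)
        L = (U * shrink(s, 1.0 / mu)) @ Vt
        # S-step: entrywise soft-thresholding
        S = shrink(A - L + Y / mu, lam / mu)
        Z = A - L - S                             # constraint residual
        Y = Y + mu * Z
        mu = rho * mu
        if np.linalg.norm(Z) / max(np.linalg.norm(A), 1e-12) < tol:
            break
    return L, S

# Toy check: a rank-1 "background" corrupted by sparse "foreground" outliers.
rng = np.random.default_rng(0)
L0 = np.outer(rng.standard_normal(30), rng.standard_normal(30))
S0 = np.zeros((30, 30))
idx = rng.choice(900, size=40, replace=False)
S0.flat[idx] = 10.0 * rng.standard_normal(40)
L, S = pcp_ialm(L0 + S0)
print(np.linalg.norm(L - L0) / np.linalg.norm(L0))  # small recovery error
```

On easy instances such as this rank-one example with a few percent of gross outliers, the iteration typically recovers both the low-rank and the sparse component to high accuracy, which is exactly the behavior exploited for background/foreground separation.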
