**3.1 State of the art**


and automobiles. In Fakhfakh et al. (2010), the conventional technologies applied at LC are discussed and both the advantages and drawbacks of each are highlighted.

One of the main operational purposes of introducing CCTV (Closed Circuit Television) at LC is the automatic detection of specific events. Some vision-based object detection systems have been tested at level crossings, and provide more or less significant information. In video surveillance, one camera, or a set of cameras, supervises zones considered unsafe, in which security must be increased, as noted in Fakhfakh et al. (2011). According to the literature, little research has focused on passive vision to solve the problems at LC. Among the existing systems, two based on CCTV cameras stand out. The first, described in Foresti (1998), uses a single grayscale CCD camera placed on a high pole in a corner of the LC. It classifies objects as cars, bikes, trucks, pedestrians and others, and localizes them according to the camera calibration process, assuming a planar model of the road and railroad. This system is prone to false and missed alarms caused by fast illumination changes or shadows. The second, described in Ohta (2005), uses two cameras with a basic stereo matching algorithm and 3D background removal. It detects vehicles and pedestrians to some extent, but it is extremely sensitive to adverse weather conditions. Its 3D localization module is not very accurate because of the simplicity of the proposed stereo matching algorithm.

We propose in this chapter an Automatic Video-Surveillance (AVS) system for the automatic detection of specific events at level crossings. The system automatically and accurately detects and localizes in 3D obstacles which are stopped or in motion at the level crossing. This information can be transmitted in a timely manner to the train driver, in the form of a red light in the cabin and, on his monitor, images of the hazardous situation. We would thus be able to evaluate the risk and warn the appropriate persons. This chapter is organized as follows: after an introduction covering the problem and the area of research, we give in section 2 an overview of our proposed system for object localization at LC. Section 3 details the background subtraction algorithm for stationary and moving object detection from real scenes. Section 4 outlines a robust approach for 3D localization of the objects highlighted in section 3. Results concerning the object extraction and 3D localization steps are detailed in Section 6. The conclusion is devoted to a discussion of the obtained results, and perspectives are provided.

**2. Overview of the AVS system**

Our research aims at developing an Automatic Video-Surveillance (AVS) system using the passive stereo vision principle. The proposed imaging system uses two color cameras to detect and localize any kind of object lying on a railway level crossing. The system automatically supervises and assesses critical situations by localizing objects in the hazardous zone, defined as the zone where a railway line is crossed by a road or path. The AVS system is used to monitor dynamic scenes where interactions take place among objects of interest (people or vehicles). After a classical image grabbing and digitizing step, the architecture is composed of the two following modules:

– *Background Subtraction for Moving and Stationary object detection:* the first step consists in separating the moving and stationary regions from the background. It is performed using the Spatio-temporal Independent Component Analysis (stICA) technique for high-quality motion detection. The color information is introduced in the ICA algorithm that models

Complex scenes acquired in outdoor environments require advanced tools to deal with, for instance, sharp brightness variations, swaying branches, shadows and sensor noise. The use of stationary cameras restricts the choice of techniques to those based on temporal differencing and background subtraction. The latter aims at segmenting foreground regions corresponding to moving objects from the background by evaluating the difference of pixel features between a reference background and the current scene image. This kind of technique requires updating the background model over time by modeling the possible states that a pixel can take. A trade-off must be found between real-time execution and the handling of background changes caused by gradual or sudden illumination fluctuations and moving background objects.
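As an illustration of the generic background subtraction principle described above (and not of the stICA method proposed in this chapter), the following minimal sketch thresholds the per-pixel difference between a reference background and the current image, then updates the background with a running average. The function names, the learning rate `alpha` and the `threshold` value are assumptions chosen for the example; images are plain nested lists to keep the sketch dependency-free.

```python
def update_background(background, frame, alpha=0.05):
    """Blend the current frame into the background model (running average)."""
    return [
        [(1.0 - alpha) * b + alpha * f for b, f in zip(b_row, f_row)]
        for b_row, f_row in zip(background, frame)
    ]


def foreground_mask(background, frame, threshold=30.0):
    """Flag pixels whose intensity differs from the background by more than threshold."""
    return [
        [abs(f - b) > threshold for b, f in zip(b_row, f_row)]
        for b_row, f_row in zip(background, frame)
    ]


# Toy 2x3 grayscale images: one bright object pixel appears in the frame,
# and one pixel changes slightly (small illumination fluctuation).
background = [[10.0, 10.0, 10.0], [10.0, 10.0, 10.0]]
frame = [[10.0, 200.0, 10.0], [12.0, 10.0, 10.0]]

mask = foreground_mask(background, frame)
updated_background = update_background(background, frame)
```

A small `alpha` absorbs gradual illumination changes slowly, whereas a large value adapts quickly but tends to merge stationary objects into the background, which is one face of the trade-off mentioned above.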

Pixel-based techniques assume statistical independence between the intensities at each pixel throughout the training sequence of images. Their main drawback is that they are not effective at modeling a complex scene. A mixture of Gaussian distributions (GMM) Stauffer & Grimson (2000) has been proposed to model complex and non-static scenes. It consists of modeling the background as a constant or adaptive number of Gaussians. A relatively robust non-parametric method has been proposed in Elgammal et al. (2000). The authors estimate the density function of a distribution given only very recent history information, which yields a sensitive detection. In Zhen & Zhenjiang (2008) the authors use an improved GMM and Graph Cut to minimize an energy function to extract foreground objects. The main disadvantage is that fast variations cannot be accurately modeled.
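To make the non-parametric idea concrete, the sketch below estimates the density of a pixel intensity from its recent history with a Gaussian kernel and declares the pixel foreground when that density is low. It is a simplified illustration in the spirit of kernel density estimation, not a reproduction of the cited method: the fixed bandwidth `sigma`, the `threshold` and the function names are assumptions made for the example.

```python
import math


def kernel_density(value, history, sigma=5.0):
    """Average of Gaussian kernels centred on the recent samples of one pixel."""
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return sum(
        norm * math.exp(-0.5 * ((value - sample) / sigma) ** 2)
        for sample in history
    ) / len(history)


def is_foreground(value, history, sigma=5.0, threshold=1e-3):
    """A pixel is foreground when its value is unlikely under the recent history."""
    return kernel_density(value, history, sigma) < threshold


# Recent intensities observed at one background pixel (toy values).
history = [100.0, 102.0, 98.0, 101.0, 99.0]

background_like = is_foreground(100.0, history)  # close to the history
object_like = is_foreground(180.0, history)      # far from the history
```

Because only recent samples enter the estimate, the model adapts quickly, at the cost of storing a history per pixel.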

the background images, the ICA algorithm allows estimating a source which represents the temporal difference between pixels. Typically, only the five most recent background images seem to be sufficient in our experiments. The matrix which allows separating the foreground from its background, termed de-mixing matrix, is estimated in the following way: the ICA algorithm is performed only once on a data matrix from which the independent components, i.e. the background and the foreground, will be estimated. The data matrix is constructed from two images which are the most recent background, and another on which a foreground


− The detection step consists of the approximation and the extraction of foreground objects. Here, the data matrix is constructed from two images: one is an incoming image from the sequence and the other is the most recent available background. The approximated foreground is then obtained simply by multiplying the data matrix by the de-mixing matrix. It is then filtered with a spatio-temporal belief propagation method in order to effectively segment the true foreground objects. The principal guidelines of our framework can be explained and summarized in
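The mechanics of the detection step can be sketched as follows: the background and the incoming image are flattened into the two rows of a data matrix, which is then multiplied by a 2x2 de-mixing matrix. The de-mixing matrix used here is a hand-picked placeholder, not one estimated by ICA, and all function names are hypothetical; the belief propagation filtering is omitted.

```python
def flatten(image):
    """Turn a 2D image (list of rows) into a single row of pixels."""
    return [pixel for row in image for pixel in row]


def build_data_matrix(background, frame):
    """Stack the flattened background and incoming frame as the two rows."""
    return [flatten(background), flatten(frame)]


def apply_demixing(demixing, data):
    """Multiply the 2x2 de-mixing matrix by the 2xN data matrix."""
    n_pixels = len(data[0])
    return [
        [sum(w * row[j] for w, row in zip(w_row, data)) for j in range(n_pixels)]
        for w_row in demixing
    ]


# Placeholder de-mixing matrix (assumption): in the chapter it is estimated
# once by ICA during training; here it simply keeps the background in row 0
# and the frame-minus-background difference in row 1.
demixing = [[1.0, 0.0], [-1.0, 1.0]]

background = [[10.0, 10.0], [10.0, 10.0]]
frame = [[10.0, 200.0], [10.0, 10.0]]

data = build_data_matrix(background, frame)
sources = apply_demixing(demixing, data)
approx_foreground = sources[1]
```

Since the de-mixing matrix is estimated only once, detection on each new image reduces to this single matrix product, which keeps the per-frame cost low.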

 




