**2. Overview of the AVS system**

Our research aims to develop an Automatic Video-Surveillance (AVS) system based on the passive stereo vision principle. The proposed imaging system uses two color cameras to detect and localize any kind of object lying on a railway level crossing. The system automatically supervises and assesses critical situations by localizing objects in the hazardous zone, defined as the area where a railway line is crossed by a road or path. The AVS system monitors dynamic scenes in which objects of interest (people or vehicles) interact. After a classical image grabbing and digitizing step, the architecture is composed of the following two modules:

– *Background subtraction for moving and stationary object detection:* the first step consists in separating moving and stationary regions from the background. It is performed with the spatio-temporal Independent Component Analysis (stICA) technique for high-quality motion detection. Color information is introduced into the ICA algorithm, which models the background and the foreground as statistically independent signals in space and time. Although many relatively effective motion estimation methods exist, ICA is retained for two reasons. First, it is less sensitive to noise caused by continuous environmental changes over time, such as swaying branches, sensor noise, and illumination changes. Second, it provides a clear-cut separation of objects from the background and can detect objects that remain motionless for a long period. Foreground extraction is performed separately on both cameras. The motion detection step allows focusing on the areas of interest, to which the 3-D localization module is applied.
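The ICA modeling idea can be illustrated on a toy 1-D example: two observed "frames" are treated as linear mixtures of a static background signal and a sparse foreground signal, and a one-unit FastICA iteration (a standard ICA solver, used here as a simplified stand-in for the paper's stICA formulation) recovers the foreground component, which is then thresholded into a motion mask. The mixing matrix, signal shapes, and threshold below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D "frames": each pixel row mixes a static background pattern
# with a sparse foreground blob (the object to detect).
n = 200
background = np.sin(np.linspace(0, 4 * np.pi, n))   # static scene
foreground = np.zeros(n)
foreground[80:110] = 2.0                            # object region

# Two observed frames = two linear mixtures of the two sources
# (hypothetical mixing; a reference frame and a current frame).
A = np.array([[1.0, 0.1],
              [1.0, 0.9]])
X = A @ np.vstack([background, foreground])
X += 0.01 * rng.standard_normal(X.shape)

# --- Whitening: decorrelate and normalize the observations ---
Xc = X - X.mean(axis=1, keepdims=True)
d, E = np.linalg.eigh(Xc @ Xc.T / n)
Z = (E @ np.diag(d ** -0.5) @ E.T) @ Xc

# --- One-unit FastICA (tanh nonlinearity) with deflation ---
W = np.zeros((2, 2))
for i in range(2):
    w = rng.standard_normal(2)
    w /= np.linalg.norm(w)
    for _ in range(200):
        g = np.tanh(w @ Z)
        w_new = (Z * g).mean(axis=1) - (1.0 - g ** 2).mean() * w
        w_new -= W[:i].T @ (W[:i] @ w_new)   # stay orthogonal to found rows
        w_new /= np.linalg.norm(w_new)
        converged = abs(abs(w_new @ w) - 1.0) < 1e-9
        w = w_new
        if converged:
            break
    W[i] = w

S = W @ Z   # estimated independent components

# The sparser (more kurtotic) component is the foreground; thresholding
# its magnitude yields the motion mask.
kurt = [np.mean(s ** 4) / np.mean(s ** 2) ** 2 for s in S]
fg_idx = int(np.argmax(kurt))
mask = np.abs(S[fg_idx]) > 3 * np.median(np.abs(S[fg_idx]))
```

Selecting the foreground component by kurtosis exploits the fact that a moving object occupies few pixels, so its source signal is far more super-Gaussian than the dense background pattern.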

– *3-D localization of moving and stationary objects:* this process applies a specific stereo matching algorithm to localize the detected objects. To cope with poor-quality images, a selective stereo matching algorithm is developed and applied to the moving regions. First, a disparity map is computed for all moving pixels according to a dissimilarity function called Weighted Average Color Difference (WACD), detailed in Fakhfakh et al. (2010). An unsupervised classification technique is then applied to the initial set of matched pixels, which automatically retains only well-matched pixels. A pixel is considered well matched if the pair of matched pixels has a confidence measure higher than a threshold. The classification is performed with the Confidence Measure technique detailed in Fakhfakh et al. (2009), which evaluates the result of the likelihood function based on the *winner-take-all* strategy. The disparities of pixels considered badly matched are then estimated with a hierarchical belief propagation technique, detailed further below. This yields, for each obstacle, a highly accurate dense disparity map.
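The matching-and-classification pipeline can be sketched as follows. The dissimilarity below is a simplified weighted average of per-channel color differences, not the exact WACD formula of Fakhfakh et al. (2010), and the ratio-based confidence measure and its threshold are likewise illustrative stand-ins for the technique of Fakhfakh et al. (2009); channel weights and window size are assumed values.

```python
import numpy as np

def color_dissimilarity(left, right, x, y, d, win=2, weights=(0.5, 0.3, 0.2)):
    """Weighted average of per-channel absolute color differences over a
    window -- a simplified stand-in for the WACD cost."""
    h, w, _ = left.shape
    x0, x1 = max(x - win, 0), min(x + win + 1, h)
    y0, y1 = max(y - win, 0), min(y + win + 1, w)
    if y0 - d < 0:                      # candidate falls outside the right view
        return np.inf
    patch_l = left[x0:x1, y0:y1].astype(float)
    patch_r = right[x0:x1, y0 - d:y1 - d].astype(float)
    per_channel = np.abs(patch_l - patch_r).mean(axis=(0, 1))
    return float(np.dot(weights, per_channel))

def match_pixel(left, right, x, y, d_max=16, conf_thresh=1.5):
    """Winner-take-all disparity plus a confidence measure (here: ratio of
    second-best to best cost). Pixels below the threshold are flagged as
    badly matched; their disparities would be re-estimated by belief
    propagation in the full pipeline."""
    costs = np.array([color_dissimilarity(left, right, x, y, d)
                      for d in range(d_max + 1)])
    order = np.argsort(costs)
    best, second = costs[order[0]], costs[order[1]]
    confidence = second / (best + 1e-6)
    return int(order[0]), confidence, confidence >= conf_thresh

# Synthetic stereo pair: the right view is the left view shifted by 5
# columns, so every (interior) pixel has a ground-truth disparity of 5.
rng = np.random.default_rng(1)
left = rng.integers(0, 255, (20, 40, 3))
right = np.zeros_like(left)
right[:, :-5] = left[:, 5:]

disp, conf, well_matched = match_pixel(left, right, x=10, y=20)
# disp == 5 for this synthetic pair
```

On real, noisy level-crossing images the best and second-best costs can be close, which is precisely when the confidence test rejects the match and defers the pixel to the belief propagation stage.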
