**Forest Structure Retrieval from Multi-Baseline SARs**

Stefano Tebaldini *Politecnico di Milano Italy*

#### **1. Introduction**

26 Remote Sensing of Biomass – Principles and Applications

Woodhouse, I.H., Nichol, C., Sinclair, P., Jack, J., Morsdorf, F., Malthus, T. and Patenaude,

Wulder, M., Bater, C., Coops, N.C., Hilker, T. and White, J. (2008). The role of LiDAR in

Zhang, K., Chen, S., Whitman, D., Shyu, M., Yan, J. and Zhang, C. (2003). A Progressive

sustainable forest management, *The Forestry Chronicle*, 84(6).

*Geoscience and Remote Sensing Letters*.

G. (in press 2011). A Multispectral Canopy LiDAR Demonstrator Project, *IEEE* 

Morphological Filter for Removing Nonground Measurements from Airborne Lidar Data, *IEEE Transactions on Geoscience and Remote Sensing*, 41(4): 872-882. Zianis, D., Muukkonen, P., Mäkipää, R. and Mencuccini, M. (2005). Biomass and stem volume equations for tree species in Europe, *Silva Fennica*, Monographs 4: 1-63.

> The vertical structure of the forested areas has been recognized by the scientific community as one major element in the assessment of forest biomass. Dealing with a volumetric object, the remote sensing techniques that are best suited to infer information about the forest structure are those that guarantee under foliage penetration capabilities. One important, widely popular technology used to investigate the forest vertical structure from the above (typically on board aircrafts) is given by high resolution LIDAR (Light Detection and Ranging) sensors, whose signals penetrate down to the ground trough the gaps within the vegetation layer. In the recent years, however, the attention of the scientific community has been drawn by the use of SAR (Synthetic Aperture Radar) techniques. As opposed to LIDARs, for which high resolution is crucial, the signals emitted by SARs propagate down to the ground by virtue of the under foliage penetration capabilities of microwaves. This different way of sensing the volumetric structure of the imaged objects determines many peculiarities of SAR imaging with respect to LIDAR, some of which are advantages and other drawbacks. The most remarkable advantage is perhaps the one due to the ability of microwaves to penetrate into semi-transparent media, which makes lower wavelength (L-Band and P-Band, typically) SARs capable of acquiring data almost independently of weather conditions and vegetation density. Conversely, the most relevant drawback is that the three dimensional (3D) reconstruction of the imaged objects requires the exploitation of multiple (at least two, but preferable many) images acquired from different points of view, and hence multiple passes over the scene to be investigated.

> The aim of this chapter is to discuss relevant topics associated with the employment of a multi-baseline SAR system for the reconstruction of the 3D structure of the imaged targets, with particular attention to the case of forested areas.

> The first topic considered is the design of a multi-baseline SAR system for 3D reconstruction in the framework of Fourier Tomography (FT), also referred to as 3D focusing. Even though seldom used in practical applications due to poor imaging quality, FT allows to discuss quite easily the design and overall features of a SAR tomographic system, and represents the basis for all of the developments presented in the remainder of this chapter.

> The next part of this chapter will focus on operative methods the generation of high-resolution tomographic imaging from sparse data-sets. This is the case of interest in practical applications, due to the costs associated with flying a high number of passes and platform trajectory accuracy. T-SAR will be cast here in terms of an estimation problem, considering both non-parametric and parametric, or model based, approaches. Non-parametric

Multi-Baseline SARs 3

Forest Structure Retrieval from Multi-Baseline SARs 29

result is a complex 2D signal indexed by the slant range, azimuth coordinates, usually referred

 *<sup>x</sup>* <sup>=</sup> *<sup>λ</sup>* 2*Ls*

where *λ* is the carrier wavelength, *Ls* is the length of the synthetic aperture, and *r* is the stand-off distance between the sensor and the target. Slant range resolution is obtained according to eq. (1). A further description of SAR image formation is beyond the scopes

Given the definitions above, a SAR SLC image represents the scene as projected onto the slant range, azimuth coordinates. This entails the complete loss of the information about the third dimension, i.e.: about the vertical structure of the targets within the imaged scene. The key to the recovery of the third dimension is to enhance the acquisition space so as to form a *further* synthetic aperture. The easiest way to do this is to employ a multi-baseline SAR system, where several SAR sensors, flown along parallel tracks, image the scene from different points

Such a system offers the possibility to gather the backscattered echoes not only along the azimuth direction, but also along the cross-range direction, defined by the axis orthogonal to the Line Of Sight (LOS) and to the orbital track. Accordingly, the backscattered echoes can be focused not only in the slant range, azimuth plane, but in the whole 3D space. Therefore, the exploitation of multi-baseline acquisitions allows to create a fully 3D imaging system, where the size of the 3D resolution cell is determined by the pulse bandwidth along the slant range direction, and by the lengths of the synthetic apertures, according to eq. (2), in the azimuth

Both 2D and 3D SAR imaging data may be regarded as specializations to the SAR case of the more general concept of Diffraction Tomography, widely exploited in seismic processing (5). As such, 3D focusing is analogous to the basic formulation of SAR Tomography (T-SAR), in

which no particular assumption is adopted to describe the imaged scene (6).

*r* (2)

Assuming a narrow bandwidth signal the azimuth resolution, *x*, can be obtained as:

of this chapter. For further details, the reader is referred to (1), (2), (3), (4).

Fig. 1. SAR acquisition geometry.

**2.2 3D SAR imaging**

of view, see figure (2).

and cross range directions, see Fig. (3).

to Single Look Complex (SLC) image.

approaches provide enhanced resolution capabilities without the need for a-priori information about the targets. Model based approaches are even more powerful, providing access to a quantitative, large scale characterization of the forest structure. Nevertheless, model based approaches are affected by an intrinsic limitation, in that data interpretation can be carried out only on the basis of the model that has been adopted.

The final part of this chapter will focus on the joint exploitation of multi-polarimetric and multi-baseline data. It will be shown that under large hypotheses the second-order statistics of such data can be expressed as a Sum of Kronecker Products (SKP), which provides the basis to proceed to decomposing the SAR data into ground-only and volume-only contributions even in absence of a parametric model. Furthermore, the SKP formalism will be shown to provide a compact representation of all physically valid and data-consistent two-layered models. Such a feature will be exploited to provide a exhaustive discussion about the validity of such a class of models for forestry investigation.

Different case studies will be discussed throughout the whole chapter, basing on real data-sets from the ESA campaigns BIOSAR 2007, BIOSAR 2008, and TROPISAR.

#### **2. SAR tomography**

#### **2.1 SAR imaging principles**

RADAR (Radio Detection And Ranging) is a technology to detect and study far off targets by transmitting ElectroMagnetic (EM) pulses at radio frequency and observing the backscattered echoes. In its most simple implementation, a Radar "knows" the position of a target by measuring the delay of the backscattered echo. Delay measurements are then converted into distance, or range, measurements basing on the wave velocity in the medium (most commonly free space is assumed). Range resolution, *r*, is then directly related to the bandwidth of the transmitted pulse according to well known relation:

$$
\triangle r = \frac{c}{2B} \tag{1}
$$

where *c* is the speed of light and *B* the signal bandwidth.

A Synthetic Aperture Radar (SAR) system is constituted, in its essence, by a Radar mounted aboard a moving platform, such as an aircraft or a satellite. As the platform moves along its trajectory, the Radar mounted aboard transmits pulse waveforms at radio-frequency and gathers the echoes backscattered from the scene. The ensemble of the echoes recorded along the track is generally referred to as raw data, meaning that this product is not, in most cases, directly interpretable in terms of features of the illuminated scene. This happens because of the small size of the antennas usually employed by SAR sensors (few meters as order of magnitude), which results in a significantly large beam on the ground. As a consequence, each recorded echo is determined by a very large number of targets, widely dispersed along the scene. The procedure through which it is possible to separate the contributions of the targets within the Radar beam is generally referred to as SAR focusing. Such a procedure consists in combining the echoes collected along the platform track so as to *synthesize* a much longer antenna than the real one, typically ranging from tens of meters in the airborne case to tens of kilometers in the spaceborne case, and narrow the beam width on the ground to few meters. After focusing, the contribution of each target is identified by two coordinates, as shown in see Fig. (1). The first one, conventionally named azimuth, may be though of as the projection of the platform track onto the Earth's surface. The other, conventionally named slant range, denotes the distance from the platform track at any given azimuth location. The 2 Will-be-set-by-IN-TECH

approaches provide enhanced resolution capabilities without the need for a-priori information about the targets. Model based approaches are even more powerful, providing access to a quantitative, large scale characterization of the forest structure. Nevertheless, model based approaches are affected by an intrinsic limitation, in that data interpretation can be carried

The final part of this chapter will focus on the joint exploitation of multi-polarimetric and multi-baseline data. It will be shown that under large hypotheses the second-order statistics of such data can be expressed as a Sum of Kronecker Products (SKP), which provides the basis to proceed to decomposing the SAR data into ground-only and volume-only contributions even in absence of a parametric model. Furthermore, the SKP formalism will be shown to provide a compact representation of all physically valid and data-consistent two-layered models. Such a feature will be exploited to provide a exhaustive discussion about the validity of such a class

Different case studies will be discussed throughout the whole chapter, basing on real data-sets

RADAR (Radio Detection And Ranging) is a technology to detect and study far off targets by transmitting ElectroMagnetic (EM) pulses at radio frequency and observing the backscattered echoes. In its most simple implementation, a Radar "knows" the position of a target by measuring the delay of the backscattered echo. Delay measurements are then converted into distance, or range, measurements basing on the wave velocity in the medium (most commonly free space is assumed). Range resolution, *r*, is then directly related to the bandwidth of the

*<sup>r</sup>* <sup>=</sup> *<sup>c</sup>*

A Synthetic Aperture Radar (SAR) system is constituted, in its essence, by a Radar mounted aboard a moving platform, such as an aircraft or a satellite. As the platform moves along its trajectory, the Radar mounted aboard transmits pulse waveforms at radio-frequency and gathers the echoes backscattered from the scene. The ensemble of the echoes recorded along the track is generally referred to as raw data, meaning that this product is not, in most cases, directly interpretable in terms of features of the illuminated scene. This happens because of the small size of the antennas usually employed by SAR sensors (few meters as order of magnitude), which results in a significantly large beam on the ground. As a consequence, each recorded echo is determined by a very large number of targets, widely dispersed along the scene. The procedure through which it is possible to separate the contributions of the targets within the Radar beam is generally referred to as SAR focusing. Such a procedure consists in combining the echoes collected along the platform track so as to *synthesize* a much longer antenna than the real one, typically ranging from tens of meters in the airborne case to tens of kilometers in the spaceborne case, and narrow the beam width on the ground to few meters. After focusing, the contribution of each target is identified by two coordinates, as shown in see Fig. (1). The first one, conventionally named azimuth, may be though of as the projection of the platform track onto the Earth's surface. The other, conventionally named slant range, denotes the distance from the platform track at any given azimuth location. The

<sup>2</sup>*<sup>B</sup>* (1)

from the ESA campaigns BIOSAR 2007, BIOSAR 2008, and TROPISAR.

out only on the basis of the model that has been adopted.

transmitted pulse according to well known relation:

where *c* is the speed of light and *B* the signal bandwidth.

of models for forestry investigation.

**2. SAR tomography**

**2.1 SAR imaging principles**

Fig. 1. SAR acquisition geometry.

result is a complex 2D signal indexed by the slant range, azimuth coordinates, usually referred to Single Look Complex (SLC) image.

Assuming a narrow bandwidth signal the azimuth resolution, *x*, can be obtained as:

$$
\triangle \text{ x = } \frac{\lambda}{2L\_s} r \tag{2}
$$

where *λ* is the carrier wavelength, *Ls* is the length of the synthetic aperture, and *r* is the stand-off distance between the sensor and the target. Slant range resolution is obtained according to eq. (1). A further description of SAR image formation is beyond the scopes of this chapter. For further details, the reader is referred to (1), (2), (3), (4).

#### **2.2 3D SAR imaging**

Given the definitions above, a SAR SLC image represents the scene as projected onto the slant range, azimuth coordinates. This entails the complete loss of the information about the third dimension, i.e.: about the vertical structure of the targets within the imaged scene. The key to the recovery of the third dimension is to enhance the acquisition space so as to form a *further* synthetic aperture. The easiest way to do this is to employ a multi-baseline SAR system, where several SAR sensors, flown along parallel tracks, image the scene from different points of view, see figure (2).

Such a system offers the possibility to gather the backscattered echoes not only along the azimuth direction, but also along the cross-range direction, defined by the axis orthogonal to the Line Of Sight (LOS) and to the orbital track. Accordingly, the backscattered echoes can be focused not only in the slant range, azimuth plane, but in the whole 3D space. Therefore, the exploitation of multi-baseline acquisitions allows to create a fully 3D imaging system, where the size of the 3D resolution cell is determined by the pulse bandwidth along the slant range direction, and by the lengths of the synthetic apertures, according to eq. (2), in the azimuth and cross range directions, see Fig. (3).

Both 2D and 3D SAR imaging data may be regarded as specializations to the SAR case of the more general concept of Diffraction Tomography, widely exploited in seismic processing (5). As such, 3D focusing is analogous to the basic formulation of SAR Tomography (T-SAR), in which no particular assumption is adopted to describe the imaged scene (6).

Multi-Baseline SARs 5

Forest Structure Retrieval from Multi-Baseline SARs 31

Consider the case of a multi-baseline data-set, represented by *N* Single Look Complex (SLC) monostatic SAR images, properly co-registered and phase flattened (7). Let *yn* denote a complex valued pixel in the *n* − *th* image, the dependence on the slant range, azimuth location, (*r*, *x*), being made implicit in order to simplify the notation. After (7), each acquisition can be

resolution cell; *ξ*� is the cross range coordinate with respect to the center of the SAR resolution

namely a complex valued quantity accounting for amplitude and phase modifications due to the interactions of the transmitted pulse with each of the scatterers within the SAR resolution

, *x*�

is the target projection within the SAR resolution cell along the cross-range coordinate. Equation (4) may be thought of as the key equation of T-SAR. It states that the SAR data and the target projections form a Fourier pair. Hence, the cross-range distribution of the

> |*P* (*ξ*)| 2

can be retrieved by estimating the Fourier Spectrum (FS) of the data with respect to the normal baselines. The evaluation of (6) for every slant range, azimuth location results in a 3D reconstruction of the imaged scene, in the coordinate system defined by the slant range, azimuth, cross-range coordinates. Accordingly, the vertical distribution of the backscattered power can be retrieved by re-sampling the reconstructed scene from the slant range, azimuth, cross-range coordinates to the ground range, azimuth, elevation coordinates. For sake of simplicity, in this paper I will suppose that the passage from the cross-range coordinate, *ξ*,

where *θ* is the look angle. Under this assumption, the vertical distribution of the backscattered power can be obtained from the FS simply as *S* (*z*) = *S<sup>ξ</sup>* (*ξ* = *z*/ sin *θ*). Notice that such a simplification does not entail any loss of generality, as the exact reconstruction of the vertical backscattered power at every ground range, azimuth location can be carried out a-posteriori

*S<sup>ξ</sup>* (*ξ*) = *E*

, *x*� ); *s*(*r*� , *x*� , *ξ*�

relative to the *n* − *th* image with respect to a common master image.

*yn* = ˆ *P ξ*� exp *j* 4*π λr bnξ*� 

*P ξ*� = ˆ *h r*� , *x*� *s r*� , *x*� , *ξ*� *dr*�

to elevation, *z*, can be simply described through the equation

through simple geometrical arguments.

) are the slant range, azimuth coordinates with respect to the center of the SAR

) is the end-to-end SAR impulse response (after focusing); *bn* is the normal baseline

), one gets:

*dξ*� (3)

) is a the scene complex reflectivity,

*dξ*� (4)

*dx*� (5)

, (6)

*ξ* = *z*/sin*θ* (7)

**2.3 SAR tomography: mathematical formulation**

expressed through the following forward model:

*yn* = ˆ *h r*� , *x*� *s r*� , *x*� , *ξ*� exp *j* 4*π λr bnξ*� *dr*� *dx*�

cell, given by the axis orthogonal to (*r*�

Resolving the integral in (3) with respect to (*r*�

where: (*r*�

cell; *h* (*r*�

where

, *x*�

, *x*�

backscattered power, namely

Fig. 2. A multi-baseline SAR system

Fig. 3. Section of the 3D resolution cell in the slant range, cross range plane, represented as a function of baseline aperture and pulse bandwidth.

#### **2.3 SAR tomography: mathematical formulation**

Consider the case of a multi-baseline data-set, represented by *N* Single Look Complex (SLC) monostatic SAR images, properly co-registered and phase flattened (7). Let *yn* denote a complex valued pixel in the *n* − *th* image, the dependence on the slant range, azimuth location, (*r*, *x*), being made implicit in order to simplify the notation. After (7), each acquisition can be expressed through the following forward model:

$$y\_{\rm il} = \int \ln \left( r', \mathbf{x'} \right) \mathbf{s} \left( r', \mathbf{x'}, \tilde{\xi}' \right) \exp \left( j \frac{4 \pi}{\lambda r} b\_n \tilde{\xi}' \right) dr' d\mathbf{x'} d\tilde{\xi}' \tag{3}$$

where: (*r*� , *x*� ) are the slant range, azimuth coordinates with respect to the center of the SAR resolution cell; *ξ*� is the cross range coordinate with respect to the center of the SAR resolution cell, given by the axis orthogonal to (*r*� , *x*� ); *s*(*r*� , *x*� , *ξ*� ) is a the scene complex reflectivity, namely a complex valued quantity accounting for amplitude and phase modifications due to the interactions of the transmitted pulse with each of the scatterers within the SAR resolution cell; *h* (*r*� , *x*� ) is the end-to-end SAR impulse response (after focusing); *bn* is the normal baseline relative to the *n* − *th* image with respect to a common master image. Resolving the integral in (3) with respect to (*r*� , *x*� ), one gets:

$$y\_{\rm nl} = \int P\left(\xi'\right) \exp\left(j\frac{4\pi}{\lambda r}b\_n\xi'\right)d\xi' \tag{4}$$

where

4 Will-be-set-by-IN-TECH

Fig. 3. Section of the 3D resolution cell in the slant range, cross range plane, represented as a

function of baseline aperture and pulse bandwidth.

Fig. 2. A multi-baseline SAR system

$$P\left(\tilde{\xi}'\right) = \int h\left(r', \mathbf{x}'\right) \mathbf{s}\left(r', \mathbf{x}', \tilde{\xi}'\right) dr' d\mathbf{x}' \tag{5}$$

is the target projection within the SAR resolution cell along the cross-range coordinate. Equation (4) may be thought of as the key equation of T-SAR. It states that the SAR data and the target projections form a Fourier pair. Hence, the cross-range distribution of the backscattered power, namely

$$S\_{\xi} \left( \xi \right) = E \left[ \left| P \left( \xi \right) \right|^{2} \right],\tag{6}$$

can be retrieved by estimating the Fourier Spectrum (FS) of the data with respect to the normal baselines. The evaluation of (6) for every slant range, azimuth location results in a 3D reconstruction of the imaged scene, in the coordinate system defined by the slant range, azimuth, cross-range coordinates. Accordingly, the vertical distribution of the backscattered power can be retrieved by re-sampling the reconstructed scene from the slant range, azimuth, cross-range coordinates to the ground range, azimuth, elevation coordinates. For sake of simplicity, in this paper I will suppose that the passage from the cross-range coordinate, *ξ*, to elevation, *z*, can be simply described through the equation

$$\xi = z / \sin \theta \tag{7}$$

where *θ* is the look angle. Under this assumption, the vertical distribution of the backscattered power can be obtained from the FS simply as *S* (*z*) = *S<sup>ξ</sup>* (*ξ* = *z*/ sin *θ*). Notice that such a simplification does not entail any loss of generality, as the exact reconstruction of the vertical backscattered power at every ground range, azimuth location can be carried out a-posteriori through simple geometrical arguments.

Multi-Baseline SARs 7

Forest Structure Retrieval from Multi-Baseline SARs 33

Fig. 4. Pictorial representation of the Fourier Spectra associated to the four scattering mechanisms discussed. Brown: Trunk-ground scattering; yellow: ground backscattering;

spectral component of target projection corresponding to the spatial frequency:

to correctly represent the power spectrum of the data, the baseline set has to be designed in such a way as to avoid the arising of aliasing phenomena while allowing a sufficient resolution. As discussed above, equation (4) states that, at each slant range azimuth location, the complex valued pixel *yn* associated with the *n* − *th* baseline is obtained by taking the

*fn* <sup>=</sup> <sup>2</sup>*bn*

*<sup>λ</sup><sup>r</sup>* (8)

green: canopy backscattering; blue: canopy-ground scattering.

#### **2.4 Expected FS for forested areas**

A brief discussion is now required about the physics of radar scattering from forested areas, in order to discuss the expected properties of the FS. Many excellent works have been done on modeling radar backscattering for forested areas, see for example (8), (9), (10). After these works, forested areas will be characterized as the ensemble of trunks, canopies, intended as the ensemble of leaves and branches, and terrain. A simple and widely exploited solution to describe the scattering behavior of a target above the ground is provided by modeling the ground as a flat half space dielectric and using image theory. Four Scattering Mechanisms arise by retaining this approach (11), (12), (9): backscatter from the target; target to ground double bounce scattering; ground to target double bounce scattering; backscatter from the target illuminated by the reflected wave, namely ground to target to ground scattering. In the following the contributions from ground to target to ground scattering will be assumed to be negligible. In the case of trunks, this assumption is supported by considering that backscatter is dominated by specular scattering, and may thus be neglected (9). For the same reason, direct backscatter from the trunks will be neglected as well. The case of the canopy is a little more delicate, since randomly oriented small objects such as branches and leaves should be treated as isotropic scatterers. Also in this case, however, it makes sense to retain that ground to target to ground contributions may be neglected, by virtue of the attenuation undergone by the wave as it bounces on the ground and propagates through the trunk layer and the canopy layer of adjacent trees.

Accordingly, higher order multiple scattering will be neglected as well. As for double bounce scattering, through simple geometric arguments it is possible to see that the distance covered by the wave as it undergoes the two consecutive specular reflections on the ground and on the target, or viceversa, is equal to the distance between the sensor and the projection of the target onto the ground. Therefore, at least on an approximately flat terrain, every double bounce mechanism is located on the ground. Assuming that trees grows perfectly vertical, the projection of the trunk onto the ground collapses into a single point. Hence, double bounce scattering due to trunk-ground interactions may be regarded as a point-like scattering mechanism, located on the ground. The projection of the canopy onto the ground, instead, gives rise to a superficial scattering mechanism located on the ground, analogous to ground backscatter1.

Based on the model here discussed, a forest scenario is to be characterized, from the point of view of a SAR based analysis, as an ensemble of point like, superficial and volumetric scatterers. Basing on single polarimetric, multi-baseline observations, the element of diversity among the different Scattering Mechanisms (SMs) is given by their corresponding FS, see Fig. (4). Trunk-ground interactions give rise to a point-like SM whose phase center is ground locked, resulting in an isolated narrow peak in the associated FS, located at ground level. Ground backscatter and canopy-ground interactions give rise to a superficial, ground locked, SM. It follows that the associated FS is located at ground level as well, but it is spread along the cross-range axis. Canopy backscatter gives rise to a volumetric SM whose phase center is located above the ground, depending on canopy height, from which it follows that the associated FS is located above the ground, and it is spread along the cross-range axis as well.

#### **2.5 Baseline design**

After equation (4), the design of the baseline set for tomographic application may be performed basing on standard results from Signal Theory (14). In particular, as the aim is

<sup>1</sup> It has to be remarked that these conclusions do not apply in the case of bi-static acquisitions, see for example (13)

6 Will-be-set-by-IN-TECH

A brief discussion is now required about the physics of radar scattering from forested areas, in order to discuss the expected properties of the FS. Many excellent works have been done on modeling radar backscattering for forested areas, see for example (8), (9), (10). After these works, forested areas will be characterized as the ensemble of trunks, canopies, intended as the ensemble of leaves and branches, and terrain. A simple and widely exploited solution to describe the scattering behavior of a target above the ground is provided by modeling the ground as a flat half space dielectric and using image theory. Four Scattering Mechanisms arise by retaining this approach (11), (12), (9): backscatter from the target; target to ground double bounce scattering; ground to target double bounce scattering; backscatter from the target illuminated by the reflected wave, namely ground to target to ground scattering. In the following the contributions from ground to target to ground scattering will be assumed to be negligible. In the case of trunks, this assumption is supported by considering that backscatter is dominated by specular scattering, and may thus be neglected (9). For the same reason, direct backscatter from the trunks will be neglected as well. The case of the canopy is a little more delicate, since randomly oriented small objects such as branches and leaves should be treated as isotropic scatterers. Also in this case, however, it makes sense to retain that ground to target to ground contributions may be neglected, by virtue of the attenuation undergone by the wave as it bounces on the ground and propagates through the trunk layer and the canopy

Accordingly, higher order multiple scattering will be neglected as well. As for double bounce scattering, through simple geometric arguments it is possible to see that the distance covered by the wave as it undergoes the two consecutive specular reflections on the ground and on the target, or viceversa, is equal to the distance between the sensor and the projection of the target onto the ground. Therefore, at least on an approximately flat terrain, every double bounce mechanism is located on the ground. Assuming that trees grows perfectly vertical, the projection of the trunk onto the ground collapses into a single point. Hence, double bounce scattering due to trunk-ground interactions may be regarded as a point-like scattering mechanism, located on the ground. The projection of the canopy onto the ground, instead, gives rise to a superficial scattering mechanism located on the ground, analogous to ground

Based on the model here discussed, a forest scenario is to be characterized, from the point of view of a SAR based analysis, as an ensemble of point like, superficial and volumetric scatterers. Basing on single polarimetric, multi-baseline observations, the element of diversity among the different Scattering Mechanisms (SMs) is given by their corresponding FS, see Fig. (4). Trunk-ground interactions give rise to a point-like SM whose phase center is ground locked, resulting in an isolated narrow peak in the associated FS, located at ground level. Ground backscatter and canopy-ground interactions give rise to a superficial, ground locked, SM. It follows that the associated FS is located at ground level as well, but it is spread along the cross-range axis. Canopy backscatter gives rise to a volumetric SM whose phase center is located above the ground, depending on canopy height, from which it follows that the associated FS is located above the ground, and it is spread along the cross-range axis as well.

After equation (4), the design of the baseline set for tomographic application may be performed basing on standard results from Signal Theory (14). In particular, as the aim is

<sup>1</sup> It has to be remarked that these conclusions do not apply in the case of bi-static acquisitions, see for

**2.4 Expected FS for forested areas**

layer of adjacent trees.

backscatter1.

**2.5 Baseline design**

example (13)

Fig. 4. Pictorial representation of the Fourier Spectra associated to the four scattering mechanisms discussed. Brown: Trunk-ground scattering; yellow: ground backscattering; green: canopy backscattering; blue: canopy-ground scattering.

to correctly represent the power spectrum of the data, the baseline set has to be designed in such a way as to avoid the arising of aliasing phenomena while allowing a sufficient resolution. As discussed above, equation (4) states that, at each slant range azimuth location, the complex valued pixel *yn* associated with the *n* − *th* baseline is obtained by taking the spectral component of target projection corresponding to the spatial frequency:

$$f\_n = \frac{2b\_n}{\lambda r} \tag{8}$$

Multi-Baseline SARs 9

Forest Structure Retrieval from Multi-Baseline SARs 35

targets in different positions through this simple approach is dramatically limited by the effective baseline aperture, according to equation (12). Furthermore, the vertical backscatter distribution that is yielded by Fourier based T-SAR is likely to be affected by the presence of grating lobes, that can arise in case of uneven baseline sampling (17). It follows that T-SAR can be more effectively posed as the problem of estimating the vertical distribution of the backscattered power based on a limited number of observations, that is as a Spectral Estimation problem. Many excellent works have appeared in literature about the application of spectral estimation methods to SAR tomography. A possible solution to improve the vertical resolution and ease the arising of side lobes is to exploit super-resolution techniques developed in the Direction of Arrival (DOA) framework, such as Capon adaptive filtering, MUSIC, SVD analysis, Compressive Sensing, and others. In the framework of T-SAR, such techniques have been applied in a number of works, among which (17), (18), (19), (20). The main limitation of most super-resolution techniques derives from the assumption that the scene is constituted by a finite number of point-like scatterers, which hinders their application in the analysis of scenarios characterized by the presence of distributed targets. For this reason, super-resolution techniques appear to be mainly suited for the tomographic characterization of urban areas. A different solution may be found in the works by *Fornaro et al.*, (21), (22), focused on the analysis of urban areas, and in those relative to Polarization Coherence Tomography (PCT), due to *Cloude*, (23), (24), focused on the analysis of forested areas. The common trait of those works is that super-resolution is achieved by exploiting a priori information about target location. In the case of forested areas, for example, the a priori

information is represented by ground topography and canopy top height.

among different tracks, namely the matrix whose elements are obtained as:

*<sup>H</sup>*

carried out only on the basis of the model that has been adopted.

**3.1 Two simple spectral estimators**

··· ··· exp

*j* <sup>4</sup>*<sup>π</sup> <sup>λ</sup><sup>r</sup> bNξ*

provided by such an estimator is then given by eq. (12).

which is defined as:

designs a filter **f**min such that:

under the linear constraint

exp *j* <sup>4</sup>*<sup>π</sup> <sup>λ</sup><sup>r</sup> b*1*ξ* 

Alternatively, a sound characterization of the imaged scene can be given by modeling its vertical backscattering distribution, in such a way as to pose T-SAR as a Parametric Estimation problem. Model based techniques have been successfully employed in a number of works about the analysis of forested areas, see for example (25), (13), (26), (27). Nevertheless, model based approaches are affected by an intrinsic limitation, in that data interpretation can be

Spectral estimation techniques are mostly based on the analysis of the data covariance matrix

{**R**}*nm* = *E* [*yny*<sup>∗</sup>

The most simple estimator of the scene backscatter distribution is given by the Periodogram,

where **<sup>R</sup>** is the sample estimate of the data covariance matrix defined above and **<sup>a</sup>** (*ξ*) <sup>=</sup>

Fourier transform of the data with respect to the normal baseline. The vertical resolution

This limit may be overcome by adaptive filtering techniques. Capon filtering, for example,

**<sup>f</sup>***H***Rf**

**<sup>f</sup>**min <sup>=</sup> arg min

*Sp* (*ξ*) = **a** (*ξ*)

*<sup>m</sup>*] (13)

(15)

*<sup>T</sup>* **Ra** (*ξ*) (14)

. The Periodogram is formally equivalent to the

**f***H***a** (*ξ*) = 1 (16)

Assume now that baseline sampling is uniform with baseline spacing �*b*, i.e.: *bn* = *n*�*b*. It then follows that the reconstruction of *P* (*ξ*) from *yn* is unambiguous only provided that the overall extent of the target along the cross-range direction, *H<sup>ξ</sup>* , is lower than the inverse of the frequency spacing, i.e.: *<sup>H</sup><sup>ξ</sup>* <sup>≤</sup> <sup>1</sup> � *<sup>f</sup>* <sup>=</sup> *<sup>λ</sup><sup>r</sup>* <sup>2</sup>�*<sup>b</sup>* . After fig. (4) it is immediate to see that:

$$H\_{\tilde{\xi}} = \frac{H}{\sin \theta} + \triangle \, r \cos \theta \,\tag{9}$$

where *H* is forest height. Accordingly, the fundamental constraint to obtain a correct tomographic imaging of a forest is that2:

$$\frac{H}{\sin \theta} + \triangle r \cos \theta \le \frac{\lambda r}{2 \triangle b} \tag{10}$$

which allows to set the baseline spacing as a function of forest height and system features. Cross range system resolution can be derived analogously to 2, simply by noting that the overall extent of the synthetic aperture in the cross direction is given by *N*�*b*, *N* being the total number of tracks. Accordingly:

$$
\triangle \cdot \xi = \frac{\lambda r}{2N \triangle b} \tag{11}
$$

after which vertical resolution is written immediately as:

$$
\Delta z = \frac{\lambda r}{2N\triangle b} \sin \theta. \tag{12}
$$

Accordingly, the number of tracks is directly related to the vertical resolution of the tomographic system. Put in these terms, one would be tempted to assume that vertical resolution can be improved at will, simply by allowing a sufficient number of tracks. There are factors, though, which pose an upper limit about baseline aperture:


#### **3. SAR Tomography as an estimation problem**

Despite being a most powerful tool for system design, casting T-SAR in terms of a simple evaluation of the data Fourier Transform constitutes too simple an approach for practical applications. The main reason for this statement is that the capability to resolve

<sup>2</sup> This formula is given for typical geometries used in SAR imaging, where the incidence angle is usually larger than, say, 25°. The case of nadir looking Radar (*θ* = 0) is not considered here.

8 Will-be-set-by-IN-TECH

Assume now that baseline sampling is uniform with baseline spacing �*b*, i.e.: *bn* = *n*�*b*. It then follows that the reconstruction of *P* (*ξ*) from *yn* is unambiguous only provided that the overall extent of the target along the cross-range direction, *H<sup>ξ</sup>* , is lower than the inverse of the

where *H* is forest height. Accordingly, the fundamental constraint to obtain a correct

+ *r*cos*θ* ≤

which allows to set the baseline spacing as a function of forest height and system features. Cross range system resolution can be derived analogously to 2, simply by noting that the overall extent of the synthetic aperture in the cross direction is given by *N*�*b*, *N* being the

*<sup>ξ</sup>* <sup>=</sup> *<sup>λ</sup><sup>r</sup>*

*<sup>z</sup>* <sup>=</sup> *<sup>λ</sup><sup>r</sup>*

are factors, though, which pose an upper limit about baseline aperture:

to use multiple SAR sensors or fly the same sensor many times.

**3. SAR Tomography as an estimation problem**

2*N*�*b*

Accordingly, the number of tracks is directly related to the vertical resolution of the tomographic system. Put in these terms, one would be tempted to assume that vertical resolution can be improved at will, simply by allowing a sufficient number of tracks. There

• if different baselines are collected at different times it has to be considered that the scene can undergo significant changes, resulting in the impossibility to coherently combine different images. This phenomenon is known in literature as temporal decorrelation (15),

• It is important to keep in mind that equation (3) is valid upon the condition that the scene reflectivity is isotropic. This condition generally holds for narrow-angle (i.e.: small baseline) measurements, whereas it tends to break down in case of wide-angle

• Gathering multiple baselines is, in general, associated with high costs, due to the necessity

Despite being a most powerful tool for system design, casting T-SAR in terms of a simple evaluation of the data Fourier Transform constitutes too simple an approach for practical applications. The main reason for this statement is that the capability to resolve

<sup>2</sup> This formula is given for typical geometries used in SAR imaging, where the incidence angle is usually

larger than, say, 25°. The case of nadir looking Radar (*θ* = 0) is not considered here.

*<sup>H</sup><sup>ξ</sup>* <sup>=</sup> *<sup>H</sup>* sin*θ*

*H* sin*θ*

� *<sup>f</sup>* <sup>=</sup> *<sup>λ</sup><sup>r</sup>* <sup>2</sup>�*<sup>b</sup>* . After fig. (4) it is immediate to see that:

*λr*

+ *r*cos*θ* (9)

<sup>2</sup>�*<sup>b</sup>* (10)

<sup>2</sup>*N*�*<sup>b</sup>* (11)

sin*θ*. (12)

frequency spacing, i.e.: *<sup>H</sup><sup>ξ</sup>* <sup>≤</sup> <sup>1</sup>

tomographic imaging of a forest is that2:

total number of tracks. Accordingly:

(16).

measurements.

after which vertical resolution is written immediately as:

targets in different positions through this simple approach is dramatically limited by the effective baseline aperture, according to equation (12). Furthermore, the vertical backscatter distribution that is yielded by Fourier based T-SAR is likely to be affected by the presence of grating lobes, that can arise in case of uneven baseline sampling (17). It follows that T-SAR can be more effectively posed as the problem of estimating the vertical distribution of the backscattered power based on a limited number of observations, that is as a Spectral Estimation problem. Many excellent works have appeared in literature about the application of spectral estimation methods to SAR tomography. A possible solution to improve the vertical resolution and ease the arising of side lobes is to exploit super-resolution techniques developed in the Direction of Arrival (DOA) framework, such as Capon adaptive filtering, MUSIC, SVD analysis, Compressive Sensing, and others. In the framework of T-SAR, such techniques have been applied in a number of works, among which (17), (18), (19), (20). The main limitation of most super-resolution techniques derives from the assumption that the scene is constituted by a finite number of point-like scatterers, which hinders their application in the analysis of scenarios characterized by the presence of distributed targets. For this reason, super-resolution techniques appear to be mainly suited for the tomographic characterization of urban areas. A different solution may be found in the works by *Fornaro et al.*, (21), (22), focused on the analysis of urban areas, and in those relative to Polarization Coherence Tomography (PCT), due to *Cloude*, (23), (24), focused on the analysis of forested areas. The common trait of those works is that super-resolution is achieved by exploiting a priori information about target location. In the case of forested areas, for example, the a priori information is represented by ground topography and canopy top height.

Alternatively, a sound characterization of the imaged scene can be given by modeling its vertical backscattering distribution, in such a way as to pose T-SAR as a Parametric Estimation problem. Model based techniques have been successfully employed in a number of works about the analysis of forested areas, see for example (25), (13), (26), (27). Nevertheless, model based approaches are affected by an intrinsic limitation, in that data interpretation can be carried out only on the basis of the model that has been adopted.

#### **3.1 Two simple spectral estimators**

Spectral estimation techniques are mostly based on the analysis of the data covariance matrix among different tracks, namely the matrix whose elements are obtained as:

$$\{\mathbf{R}\}\_{nm} = E\left[y\_n y\_m^\*\right] \tag{13}$$

The most simple estimator of the scene backscatter distribution is given by the Periodogram, which is defined as:

$$S\_p\left(\xi\right) = \mathbf{a}\left(\xi\right)^T \hat{\mathbf{R}}\mathbf{a}\left(\xi\right) \tag{14}$$

where **<sup>R</sup>** is the sample estimate of the data covariance matrix defined above and **<sup>a</sup>** (*ξ*) <sup>=</sup> exp *j* <sup>4</sup>*<sup>π</sup> <sup>λ</sup><sup>r</sup> b*1*ξ* ··· ··· exp *j* <sup>4</sup>*<sup>π</sup> <sup>λ</sup><sup>r</sup> bNξ <sup>H</sup>* . The Periodogram is formally equivalent to the Fourier transform of the data with respect to the normal baseline. The vertical resolution provided by such an estimator is then given by eq. (12).

This limit may be overcome by adaptive filtering techniques. Capon filtering, for example, designs a filter **f**min such that:

$$\mathbf{f\_{min}} = \arg\min \left\{ \mathbf{f}^H \hat{\mathbf{R}} \mathbf{f} \right\} \tag{15}$$

under the linear constraint

$$\mathbf{f}^{H}\mathbf{a}\left(\xi\right) = 1\tag{16}$$

Multi-Baseline SARs 11

Forest Structure Retrieval from Multi-Baseline SARs 37

By letting the ground and volume structure matrices depend upon some physically meaningful parameter, equation (20) provides an easy access point to an effective modeling of the problem. In this chapter I will consider a very simple model, where both the ground and the vegetation layer are represented as phase centers plus the associated decorrelation term arising from the angular spreading described in section 2.4. Hence, the phase of the structure

> *nm* <sup>=</sup> <sup>4</sup>*<sup>π</sup> λr*

where *zg*,*<sup>v</sup>* is the elevation of the phase center associated with ground and volume scattering.

where *ρg*,*<sup>v</sup>* will be referred to as spreading constant and Δ*<sup>b</sup>* is the average normal baseline sampling. The role of the spreading constant is to describe in a simple fashion the angular spreading that arises from the spatial structure associated to each phase center, avoiding the dependence upon a particular physical model. Given the definition above, the spreading constant ranges from 0 to 1, corresponding to the cases of pure noise and of a perfectly coherent received signal (i.e. a point-like scatterer), respectively. For details on more

In principle, the availability of a model for the covariance matrix suffices for solving for the unknowns through Maximum Likelihood Estimation (MLE) (28). This solution, however, would not be efficient, since it requires an exhaustive search in a significantly wide parameter space. A significant complexity reduction may be achieved by applying the Covariance Matching Estimation Method (COMET) (29), (30), (31), (32), which allows to solve the problem by minimizing a weighted Frobenius norm of the difference between the sample covariance

*trace*

A straightforward extension to the case of multi-polarimetric data can be achieved by

This section is devoted to reporting the results of the tomographic analysis of the forest site of Remningstorp, Sweden, on the basis of a data-set of *N* = 9 P-Band, fully polarimetric SAR images acquired by the DLR airborne system E-SAR in the frame of the ESA campaign BIOSAR 2007. Prevailing tree species in the imaged scene are Norway spruce (Picea abies), Scots pine (Pinus sylvestris) and birch (Betula spp.). The dominant soil type is till with a field layer, when present, of blueberry (Vaccinium myrtillus) and narrow thinned grass

stay the same. See (26) for details.

**<sup>R</sup>**−1(**R** <sup>−</sup> **<sup>M</sup>**)**R**−1(**R** <sup>−</sup> **<sup>M</sup>**)

*<sup>v</sup>***R***<sup>v</sup>* (*zv*, *ρv*). (24)

only change with polarization, whereas the

(23)

<sup>=</sup> argmin

*σ*2 *<sup>g</sup>* , *σ*<sup>2</sup> *v* 

where **R** is the sample covariance matrix and **M** is the model matrix, i.e.:

**M** = *σ*<sup>2</sup> *<sup>g</sup>***R***<sup>g</sup> zg*, *ρ<sup>g</sup>* + *σ*<sup>2</sup>

*bm* − *bn*

−|*bn*−*bm*|/Δ*<sup>b</sup>*

sin*<sup>θ</sup> zg*,*<sup>v</sup>* (21)

*<sup>g</sup>*,*<sup>v</sup>* (22)

∠ **R***g*,*v* 

As for the amplitudes, instead, the following model will be retained:

 **R***g*,*v nm* = *ρ*

sophisticated physical models, the reader is referred to (13), (27).

**3.3 A simple parametric model**

matrices can be parametrized as:

**3.3.1 Inversion**

matrix and the model . In formula:

*zg*, *<sup>ρ</sup>g*, *zg*, *<sup>ρ</sup>g*, *<sup>σ</sup>*

assuming that backscattered powers

**3.4 A case study: the Remningstorp forest site**

2 *<sup>g</sup>* , *σ* 2 *v* 

*zg*, *ρg*,*zv*, *ρ<sup>v</sup>*

structural parameters

This leads to the definition of the Capon Spectral Estimator as:

$$\mathbf{S}\_{\mathbf{C}}\left(\boldsymbol{\xi}\right) = \mathbf{f}\_{\text{min}}^{H} \hat{\mathbf{R}} \mathbf{f}\_{\text{min}} = \frac{1}{\mathbf{a}\left(\boldsymbol{\xi}\right)^{H} \hat{\mathbf{R}}^{-1} \mathbf{a}\left(\boldsymbol{\xi}\right)}\tag{17}$$

Such an estimator produces super-resolution capabilities and grating lobes reduction without the need for a-priori information about the targets, see for example (17), (18). The main drawback associated with the Capon Spectral Estimator is the poor radiometric accuracy, especially in presence of a small baseline aperture. Accordingly, the employment of the Capon Spectral Estimator has to be considered as a compromise. Indeed it results in non accurate backscattered power values, yet it permits to appreciate details that could not be accessible in presence of a small baseline aperture.

#### **3.2 Representation of multiple SMs**

Dealing with natural scenarios, it is sensible to assume that the target signature changes randomly from one pixel to another, either in the case of distributed or point-like targets. For this reason, the data will be assumed to be a realization of a zero-mean complex process, from which it follows that a unified mathematical treatment of all the SMs described in section 2.4 may be provided by characterizing the second order moments of the data, represented by the expected value of the interferograms. Under the hypothesis of statistical uncorrelation among the different SMs, the expected value of the *nm* − *th* interferogram may be expressed as:

$$E\left[y\_ny\_m^\*\right] = \sum\_{k=1}^K \sigma\_k^2 \gamma\_{nm}^{(k)}\tag{18}$$

where: *K* is the total number of SMs; *σ*<sup>2</sup> *<sup>k</sup>* is the backscattered power associated to the *k* − *th* SM; *<sup>γ</sup>*(*k*) *nm* is the complex interferometric coherence induced by the spatial structure (i.e.: point-like, superficial, or volumetric) associated with the *k* − *th* SM in the *nm* − *th* interferogram (15). After (18), the expression of the data covariance matrix is given by:

$$E\left[\mathbf{y}\_{MB}\mathbf{y}\_{MB}^H\right] \stackrel{def}{=} \mathbf{R} = \sum\_{k=1}^{K} \sigma\_k^2 \mathbf{R}\_k \tag{19}$$

where: **y***MB* = *y*<sup>1</sup> *y*<sup>2</sup> ··· *yN <sup>T</sup>* is the stack of the multi-baseline data at a certain slant range, azimuth location; each matrix **R***<sup>k</sup>* is an *N* × *N* matrix whose entries are given by the interferometric coherences associated to the *<sup>k</sup>* <sup>−</sup> *th* phase center: {**R***k*}*nm* <sup>=</sup> *<sup>γ</sup>*(*k*) *nm*. The role of each matrix **R***<sup>k</sup>* in (19) is to account for the spatial structure associated to each SMs. According to this interpretation, such matrices will be hereinafter referred to as *structure matrices*. Following the arguments presented in section 2.4, the data covariance matrix will be assumed to be contributed by just two SMs, one associated with volume backscattering, and the

other with the ensemble of all SMs whose phase center is ground locked, namely surface backscattering, ground-volume scattering, and ground-trunk scattering. Accordingly, eq. (19) simplifies to:

$$E\left[\mathbf{y}\_{MB}\mathbf{y}\_{MB}^H\right] \stackrel{def}{=} \mathbf{R} = \sigma\_\mathcal{S}^2 \mathbf{R}\_\mathcal{S} + \sigma\_\upsilon^2 \mathbf{R}\_\upsilon \tag{20}$$

the subscripts *g*, *v* standing for *ground* and *volume*.

#### **3.3 A simple parametric model**

10 Will-be-set-by-IN-TECH

Such an estimator produces super-resolution capabilities and grating lobes reduction without the need for a-priori information about the targets, see for example (17), (18). The main drawback associated with the Capon Spectral Estimator is the poor radiometric accuracy, especially in presence of a small baseline aperture. Accordingly, the employment of the Capon Spectral Estimator has to be considered as a compromise. Indeed it results in non accurate backscattered power values, yet it permits to appreciate details that could not be accessible in

Dealing with natural scenarios, it is sensible to assume that the target signature changes randomly from one pixel to another, either in the case of distributed or point-like targets. For this reason, the data will be assumed to be a realization of a zero-mean complex process, from which it follows that a unified mathematical treatment of all the SMs described in section 2.4 may be provided by characterizing the second order moments of the data, represented by the expected value of the interferograms. Under the hypothesis of statistical uncorrelation among the different SMs, the expected value of the *nm* − *th* interferogram may be expressed as:

*E* [*yny*∗

After (18), the expression of the data covariance matrix is given by:

*E* **y***MB***y***<sup>H</sup> MB def* = **R** =

*E* **y***MB***y***<sup>H</sup> MB def*

the subscripts *g*, *v* standing for *ground* and *volume*.

*y*<sup>1</sup> *y*<sup>2</sup> ··· *yN*

*<sup>m</sup>*] =

*K* ∑ *k*=1 *σ*2 *<sup>k</sup> <sup>γ</sup>*(*k*)

*nm* is the complex interferometric coherence induced by the spatial structure (i.e.: point-like, superficial, or volumetric) associated with the *k* − *th* SM in the *nm* − *th* interferogram (15).

range, azimuth location; each matrix **R***<sup>k</sup>* is an *N* × *N* matrix whose entries are given by the

each matrix **R***<sup>k</sup>* in (19) is to account for the spatial structure associated to each SMs. According to this interpretation, such matrices will be hereinafter referred to as *structure matrices*. Following the arguments presented in section 2.4, the data covariance matrix will be assumed to be contributed by just two SMs, one associated with volume backscattering, and the other with the ensemble of all SMs whose phase center is ground locked, namely surface backscattering, ground-volume scattering, and ground-trunk scattering. Accordingly, eq. (19)

= **R** = *σ*<sup>2</sup>

*<sup>g</sup>***R***<sup>g</sup>* + *<sup>σ</sup>*<sup>2</sup>

interferometric coherences associated to the *<sup>k</sup>* <sup>−</sup> *th* phase center: {**R***k*}*nm* <sup>=</sup> *<sup>γ</sup>*(*k*)

*K* ∑ *k*=1 *σ*2

 min <sup>=</sup> <sup>1</sup> **a** (*ξ*)

*<sup>H</sup>* **<sup>R</sup>**<sup>−</sup>1**<sup>a</sup>** (*ξ*) (17)

*nm* (18)

*<sup>k</sup>* **R***<sup>k</sup>* (19)

*<sup>v</sup>***R***<sup>v</sup>* (20)

*nm*. The role of

*<sup>k</sup>* is the backscattered power associated to the *k* − *th* SM;

*<sup>T</sup>* is the stack of the multi-baseline data at a certain slant

This leads to the definition of the Capon Spectral Estimator as:

presence of a small baseline aperture.

**3.2 Representation of multiple SMs**

where: *K* is the total number of SMs; *σ*<sup>2</sup>

*<sup>γ</sup>*(*k*)

where: **y***MB* =

simplifies to:

*SC* (*ξ*) = **f***<sup>H</sup>*

min**Rf**

By letting the ground and volume structure matrices depend upon some physically meaningful parameter, equation (20) provides an easy access point to an effective modeling of the problem. In this chapter I will consider a very simple model, where both the ground and the vegetation layer are represented as phase centers plus the associated decorrelation term arising from the angular spreading described in section 2.4. Hence, the phase of the structure matrices can be parametrized as:

$$
\angle \left\{ \mathbf{R}\_{\mathcal{g}, \mathcal{v}} \right\}\_{nm} = \frac{4\pi}{\lambda r} \frac{b\_m - b\_n}{\sin \theta} z\_{\mathcal{g}, \mathcal{v}} \tag{21}
$$

where *zg*,*<sup>v</sup>* is the elevation of the phase center associated with ground and volume scattering. As for the amplitudes, instead, the following model will be retained:

$$\left| \left\{ \mathbf{R}\_{\mathcal{S},\mathcal{P}} \right\}\_{nm} \right| = \rho\_{\mathcal{S},\mathcal{P}}^{-\left| b\_n - b\_m \right| / \Delta\_{\emptyset}} \tag{22}$$

where *ρg*,*<sup>v</sup>* will be referred to as spreading constant and Δ*<sup>b</sup>* is the average normal baseline sampling. The role of the spreading constant is to describe in a simple fashion the angular spreading that arises from the spatial structure associated to each phase center, avoiding the dependence upon a particular physical model. Given the definition above, the spreading constant ranges from 0 to 1, corresponding to the cases of pure noise and of a perfectly coherent received signal (i.e. a point-like scatterer), respectively. For details on more sophisticated physical models, the reader is referred to (13), (27).

#### **3.3.1 Inversion**

In principle, the availability of a model for the covariance matrix suffices for solving for the unknowns through Maximum Likelihood Estimation (MLE) (28). This solution, however, would not be efficient, since it requires an exhaustive search in a significantly wide parameter space. A significant complexity reduction may be achieved by applying the Covariance Matching Estimation Method (COMET) (29), (30), (31), (32), which allows to solve the problem by minimizing a weighted Frobenius norm of the difference between the sample covariance matrix and the model . In formula:

$$\left\{ \widehat{z}\_{\mathcal{S}'} \widehat{\rho}\_{\mathcal{S}'} \widehat{z}\_{\mathcal{S}'} \widehat{\rho}\_{\mathcal{S}'} \widehat{\sigma^2\_{\mathcal{S}'}} \widehat{\sigma^2\_{\mathcal{V}}} \right\} = \operatorname\*{argmin} \left\{ \operatorname{trace} \left( \widehat{\mathbf{R}}^{-1} (\widehat{\mathbf{R}} - \mathbf{M}) \widehat{\mathbf{R}}^{-1} (\widehat{\mathbf{R}} - \mathbf{M}) \right) \right\} \tag{23}$$

where **R** is the sample covariance matrix and **M** is the model matrix, i.e.:

$$\mathbf{M} = \sigma\_{\mathcal{S}}^2 \mathbf{R}\_{\mathcal{S}} \left( z\_{\mathcal{S}\prime} \rho\_{\mathcal{S}} \right) + \sigma\_{\upsilon}^2 \mathbf{R}\_{\upsilon} \left( z\_{\upsilon\prime} \rho\_{\upsilon} \right) \,. \tag{24}$$

A straightforward extension to the case of multi-polarimetric data can be achieved by assuming that backscattered powers *σ*2 *<sup>g</sup>* , *σ*<sup>2</sup> *v* only change with polarization, whereas the structural parameters *zg*, *ρg*,*zv*, *ρ<sup>v</sup>* stay the same. See (26) for details.

#### **3.4 A case study: the Remningstorp forest site**

This section is devoted to reporting the results of the tomographic analysis of the forest site of Remningstorp, Sweden, on the basis of a data-set of *N* = 9 P-Band, fully polarimetric SAR images acquired by the DLR airborne system E-SAR in the frame of the ESA campaign BIOSAR 2007. Prevailing tree species in the imaged scene are Norway spruce (Picea abies), Scots pine (Pinus sylvestris) and birch (Betula spp.). The dominant soil type is till with a field layer, when present, of blueberry (Vaccinium myrtillus) and narrow thinned grass

Multi-Baseline SARs 13

Forest Structure Retrieval from Multi-Baseline SARs 39

SLC images. The *μ*/*σ* index is widely used in Permanent Scatterers Interferometry (PSI) as a criterion to select the most coherent scatterers in the imaged scene (33). As a result, for both the HH and VV channels the *μ*/*σ* index has resulted to be characterized by extremely high values (*μ*/*σ >* 15), which indicates the presence of a highly stable scattering mechanism in the co-polar channels. Surprisingly, high values of the *μ*/*σ* index (*μ*/*σ >* 10) have been observed

Temporal decorrelation has been evaluated by exploiting the presence of additional zero

relatively to forested areas. Accordingly, the temporal stability of the scene is rather good for all the three polarimetric channels, indicating the presence of a stable scattering mechanism,

The information carried by the co-polar channels has been analyzed by averaging the backscattered powers and the co-polar interferograms (i.e.: the Hermitian product between the HH and VV channels) over all the 9 tracks and inside an estimation window as large as 50 × 50 square meters (ground range, azimuth). The distribution of the HH and VV total backscattered power with respect to the co-polar phase, Δ*ϕ* = *ϕHH* − *ϕVV*, has shown to be substantially bimodal, high and low energy values being concentrated around Δ*ϕ* ≈ 80◦ and Δ*ϕ* ≈ 0◦, respectively, see Fig.(6). The coherence between the HH and VV channels has been observed to be rather high in open areas (*γcopol* ≈ 0.8.), whereas in forested areas it has been

2D Histogram

Co-polar phase [rad]

A first, non parametric, tomographic analysis has been carried out by evaluating the Capon spectra of the three polarimetric channels, see for example (18), (17), reported in Fig. (7). Each spectrum has been obtained by evaluating the sample covariance matrix at each range bin on the basis of an estimation window as large as 50 × 50 square meters (ground range,


Fig. 6. Joint distribution of the total backscattered power and the co-polar phase for the co-polar channels. The total backscattered power has been obtained as the sum between the HH and VV backscattered powers. The color scale is proportional to the number of counts

*temp* � 0.85

*temp* � 0.75 in the HV channel,

baseline images. The temporal coherence at 56 days has been assessed in about *γHH*

*temp* � 0.8 in the VV channel, and *<sup>γ</sup>HV*

in the HV channel as well.

in the HH channel, *γVV*

especially in the co-polar channels.

assessed about *γcopol* ≈ 0.45.

Total backscattered power

within each bin.

**3.4.2 Tomographic analysis**

0

0.4

0.8

1.2

1.6

2

(Deschampsia flexuosa). Tree heights are in the order of 20 m, with peaks up to 30 m. The topography is fairly flat, terrain elevation above sea level ranging between 120 and 145 m. The acquisitions have been carried out from March to May 2007. The horizontal baseline spacing is approximately 10 m, resulting in a maximum horizontal baseline of approximately 80 m. The spatial resolution is approximately 3 m in the slant range direction and 1 m in the azimuth direction. The look angle varies from 25◦ to 55◦ from near to far range, resulting in the vertical resolution to vary from about 10 m in near range to 40 m in far range. See also table (2) in the remainder.

The top row of Fig. (5) shows an optical view of the Remningstorp test site captured from Google Earth, re-sampled onto the SAR slant range, azimuth coordinates, whereas the bottom row of the same figure shows the amplitude of the HH channel, averaged over the 9 images. Backscatter from open areas has turned out to be remarkably lower than that from forested areas (about 25 dB), as expected at longer wavelengths.

Fig. 5. Top row: Optical image of the Remningstorp forest site by Google Earth. Bottom row: Mean reflectivity of the HH channel.

#### **3.4.1 Non tomographic analysis**

An analysis of the amplitude stability of the data has been performed by computing the ratio *μ*/*σ*, where *μ* and *σ* denote the mean and the standard deviation of the amplitudes of the 12 Will-be-set-by-IN-TECH

(Deschampsia flexuosa). Tree heights are in the order of 20 m, with peaks up to 30 m. The topography is fairly flat, terrain elevation above sea level ranging between 120 and 145 m. The acquisitions have been carried out from March to May 2007. The horizontal baseline spacing is approximately 10 m, resulting in a maximum horizontal baseline of approximately 80 m. The spatial resolution is approximately 3 m in the slant range direction and 1 m in the azimuth direction. The look angle varies from 25◦ to 55◦ from near to far range, resulting in the vertical resolution to vary from about 10 m in near range to 40 m in far range. See also

The top row of Fig. (5) shows an optical view of the Remningstorp test site captured from Google Earth, re-sampled onto the SAR slant range, azimuth coordinates, whereas the bottom row of the same figure shows the amplitude of the HH channel, averaged over the 9 images. Backscatter from open areas has turned out to be remarkably lower than that from forested

Optical image by Google Earth

azimuth [m]

Fig. 5. Top row: Optical image of the Remningstorp forest site by Google Earth. Bottom row:

An analysis of the amplitude stability of the data has been performed by computing the ratio *μ*/*σ*, where *μ* and *σ* denote the mean and the standard deviation of the amplitudes of the

<sup>1000</sup> <sup>2000</sup> <sup>3000</sup> <sup>4000</sup> <sup>5000</sup> <sup>0</sup>

Mean Reflectivity (HH)

<sup>1000</sup> <sup>2000</sup> <sup>3000</sup> <sup>4000</sup> <sup>5000</sup> <sup>0</sup>

table (2) in the remainder.

slant range [m]

slant range [m]

500

Mean reflectivity of the HH channel.

**3.4.1 Non tomographic analysis**

1000

1500

2000

500

1000

1500

2000

areas (about 25 dB), as expected at longer wavelengths.

SLC images. The *μ*/*σ* index is widely used in Permanent Scatterers Interferometry (PSI) as a criterion to select the most coherent scatterers in the imaged scene (33). As a result, for both the HH and VV channels the *μ*/*σ* index has resulted to be characterized by extremely high values (*μ*/*σ >* 15), which indicates the presence of a highly stable scattering mechanism in the co-polar channels. Surprisingly, high values of the *μ*/*σ* index (*μ*/*σ >* 10) have been observed in the HV channel as well.

Temporal decorrelation has been evaluated by exploiting the presence of additional zero baseline images. The temporal coherence at 56 days has been assessed in about *γHH temp* � 0.85 in the HH channel, *γVV temp* � 0.8 in the VV channel, and *<sup>γ</sup>HV temp* � 0.75 in the HV channel, relatively to forested areas. Accordingly, the temporal stability of the scene is rather good for all the three polarimetric channels, indicating the presence of a stable scattering mechanism, especially in the co-polar channels.

The information carried by the co-polar channels has been analyzed by averaging the backscattered powers and the co-polar interferograms (i.e.: the Hermitian product between the HH and VV channels) over all the 9 tracks and inside an estimation window as large as 50 × 50 square meters (ground range, azimuth). The distribution of the HH and VV total backscattered power with respect to the co-polar phase, Δ*ϕ* = *ϕHH* − *ϕVV*, has shown to be substantially bimodal, high and low energy values being concentrated around Δ*ϕ* ≈ 80◦ and Δ*ϕ* ≈ 0◦, respectively, see Fig.(6). The coherence between the HH and VV channels has been observed to be rather high in open areas (*γcopol* ≈ 0.8.), whereas in forested areas it has been assessed about *γcopol* ≈ 0.45.

Co-polar phase [rad]

Fig. 6. Joint distribution of the total backscattered power and the co-polar phase for the co-polar channels. The total backscattered power has been obtained as the sum between the HH and VV backscattered powers. The color scale is proportional to the number of counts within each bin.

#### **3.4.2 Tomographic analysis**

A first, non parametric, tomographic analysis has been carried out by evaluating the Capon spectra of the three polarimetric channels, see for example (18), (17), reported in Fig. (7). Each spectrum has been obtained by evaluating the sample covariance matrix at each range bin on the basis of an estimation window as large as 50 × 50 square meters (ground range,

Multi-Baseline SARs 15

Forest Structure Retrieval from Multi-Baseline SARs 41

Results shown here have been obtained by processing all polarizations at once, as suggested

Figure (8) shows the map of the estimates relative ground elevation. The black areas correspond to absence of coherent signals, as it is the case of lakes. The estimates relative to canopy elevation are visible in Fig. (9). In this case, the black areas have been identified by the algorithm as being non-forested. It is worth noting the presence of a road, clearly visible in the optical image, see Fig. (5), crossing the scene along the direction from slant range, azimuth coordinates (1850, 0) to (1000, 5500). Along that road, a periodic series of small targets at an elevation of about 25 m has been found by the algorithm. Since a power line passes above the road, it seems reasonable to relate such targets to the echoes from the equipment on the top of

LIDAR: Ground Elevation

azimuth [m]

Fig. 8. Top row: ground elevation estimated by LIDAR. Bottom row: ground elevation estimated by T-SAR. Black areas correspond to an unstructured scattering mechanism.

500 1000 1500 2000 2500 3000 3500 4000

As a validation tool, we used LIDAR measurements courtesy of the Swedish Defence Research Agency (FOI) and Hildur and Sven Wingquist's Foundation. Concerning ground elevation the dispersion of the difference *zSAR* − *zLIDAR* has been assessed in less than 1 m. Concerning the estimated canopy elevation, the discrepancy with respect to LIDAR is clearly imputable to the fact that the estimates are relative to the average the *phase center* elevation inside the estimation window, whereas LIDAR is sensitive to the top height of the canopy. In particular, canopy elevation provided by T-SAR appears to be under-estimated with respect to LIDAR measurements, as a result of the under foliage penetration capabilities of P-band microwaves. Figure (10) reports the ratios between the estimated backscattered powers from the ground and the canopy (G/C ratio), for each polarimetric channel. As expected, in the co-polar channels the scattered power from the ground is significantly larger than canopy backscatter, the G/C ratio being assessed in about 10 dB. In the HV channel the backscattered powers

SAR: Ground Elevation

500 1000 1500 2000 2500 3000 3500 4000

**3.4.3 Parametric tomographic analysis**

in section 3.3.1.

the poles of the power line.

slant range [m]

slant range [m]

azimuth). The analyzed area corresponds to a stripe of the data along the slant range direction, shown in the top panel of Fig. (7). Almost the whole stripe is forested except for the dark areas in near range, corresponding to bare terrain. It may be observed that the three spectra have very similar characteristics. Each spectrum is characterized by a narrow peak, above which a weak sidelobe is visible. As for the co-polar channels, this result is consistent with the hypothesis that a single scattering mechanism is dominant. It is then reasonable to relate such scattering mechanism to the double bounce contribution from trunk-ground and canopy-ground interactions, and hence the peak of the spectrum can be assumed to be located at ground level. The sidelobe above the main peak is more evident in the HV channel, but the contributions from ground level seem to be dominant as well, the main peak being located almost at the same position as in the co-polar channels. Accordingly, the presence of a relevant contribution from the ground has to be included in the HV channel too.

Fig. 7. Top panel: mean reflectivity of the data (HH channel) within a stripe as wide as 50 m in the azimuth direction. The underlying panels show the Capon Spectra for the three polarimetric channels. At every range bin the signal has been scaled in such a way as to have unitary energy.
