**Meet the editor**

Dr Boris Escalante-Ramírez received his PhD from the Eindhoven University of Technology in 1992. He is currently a full professor in electrical engineering at the National University of Mexico and a member of the National Research System. His research interests embrace computational models of visual information processing and their applications to remote sensing, medical imaging, video analysis, and computer vision. Dr Escalante has authored more than 90 peer-reviewed research papers and book chapters and has served as a reviewer for several international research journals. He has participated in or led various national and international research projects, several of them in remote sensing. Dr Escalante has received several awards, including the National University distinction for junior scholars in exact sciences in 1997.

Contents

**Preface IX**

**Section 1 Analysis Techniques 1**

Chapter 1 **Characterizing Forest Structure by Means of Remote Sensing: A Review 3**
Hooman Latifi

Chapter 2 **Fusion of Optical and Thermal Imagery and LiDAR Data for Application to 3-D Urban Environment and Structure Monitoring 29**
Anna Brook, Marijke Vandewal and Eyal Ben-Dor

Chapter 3 **Statistical Properties of Surface Slopes via Remote Sensing 51**
Josué Álvarez-Borrego and Beatriz Martín-Atienza

Chapter 4 **Classification of Pre-Filtered Multichannel Remote Sensing Images 75**
Vladimir Lukin, Nikolay Ponomarenko, Dmitriy Fevralev, Benoit Vozel, Kacem Chehdi and Andriy Kurekin

Chapter 5 **Estimation of the Separable MGMRF Parameters for Thematic Classification 99**
Rolando D. Navarro, Jr., Joselito C. Magadia and Enrico C. Paringit

Chapter 6 **Low Rate High Frequency Data Transmission from Very Remote Sensors 123**
Pau Bergada, Rosa Ma Alsina-Pages, Carles Vilella and Joan Ramon Regué

Chapter 7 **A Contribution to the Reduction of Radiometric Miscalibration of Pushbroom Sensors 151**
Christian Rogaß, Daniel Spengler, Mathias Bochow, Karl Segl, Angela Lausch, Daniel Doktor, Sigrid Roessner, Robert Behling, Hans-Ulrich Wetzel, Katia Urata, Andreas Hueni and Hermann Kaufmann

Chapter 8 **Differential Absorption Microwave Radar Measurements for Remote Sensing of Barometric Pressure 171**
Roland Lawrence, Bin Lin, Steve Harrah and Qilong Min

Chapter 9 **Energy Efficient Data Acquisition in Wireless Sensor Network 197**
Ken C.K. Lee, Mao Ye and Wang-Chien Lee

Chapter 10 **Three-Dimensional Lineament Visualization Using Fuzzy B-Spline Algorithm from Multispectral Satellite Data 213**
Maged Marghany

**Section 2 Sensors and Platforms 233**

Chapter 11 **COMS, the New Eyes in the Sky for Geostationary Remote Sensing 235**
Han-Dol Kim, Gm-Sil Kang, Do-Kyung Lee, Kyoung-Wook Jin, Seok-Bae Seo, Hyun-Jong Oh, Joo-Hyung Ryu, Herve Lambert, Ivan Laine, Philippe Meyer, Pierre Coste and Jean-Louis Duquesne

Chapter 12 **Hyperspectral Remote Sensing – Using Low Flying Aircraft and Small Vessels in Coastal Littoral Areas 269**
Charles R. Bostater, Jr., Gaelle Coppin and Florian Levaux

Chapter 13 **CSIR – NLC Mobile LIDAR for Atmospheric Remote Sensing 289**
Sivakumar Venkataraman

Chapter 14 **Active Remote Sensing: Lidar SNR Improvements 313**
Yasser Hassebo

Chapter 15 **Smart Station for Data Reception of the Earth Remote Sensing 341**
Mykhaylo Palamar

Chapter 16 **Atmospheric Propagation of Terahertz Radiation 371**
Jianquan Yao, Ran Wang, Haixia Cui and Jingli Wang

Chapter 17 **Road Feature Extraction from High Resolution Aerial Images Upon Rural Regions Based on Multi-Resolution Image Analysis and Gabor Filters 387**
Hang Jin, Marc Miska, Edward Chung, Maoxun Li and Yanming Feng

Chapter 18 **Hardware Implementation of a Real-Time Image Data Compression for Satellite Remote Sensing 415**
Albert Lin

Chapter 19 **Progress Research on Wireless Communication Systems for Underground Mine Sensors 429**
Larbi Talbi, Ismail Ben Mabrouk and Mourad Nedil

Chapter 20 **Cold Gas Propulsion System – An Ideal Choice for Remote Sensing Small Satellites 447**
Assad Anis

Preface

Nowadays it is hard to find areas of human activity and development that have not profited from or contributed to remote sensing. Natural, physical and social activities find in remote sensing a common ground for interaction and development. From the end-user point of view, Earth science, geography, planning, resource management, public policy design, environmental studies and health are some of the areas whose recent development has been triggered and motivated by remote sensing. From the technological point of view, remote sensing would not be possible without the advancement of basic as well as applied research in areas like physics, space technology, telecommunications, computer science and engineering. This dual conception of remote sensing brought us to the idea of preparing two different books. The present one is devoted to new techniques for data processing, sensors and platforms, while the accompanying book is meant to display recent advances in remote sensing applications.

From a strict perspective, remote sensing consists of collecting data from an object or phenomenon without making physical contact. In practice, most of the time we refer to satellite or aircraft-mounted sensors that use some sort of electromagnetic radiation to gather geospatial information from land, oceans and atmosphere. The growing diversity of human activity has motivated the design of new sensors and platforms as well as the development of new methodologies that can process the enormous amount of information being generated daily. Collected information, however, represents only a footprint of the object or phenomenon we are interested in. In order for the end-user to be able to interpret and use this information, the data has to be processed so that it no longer represents a mere digital number but a physically meaningful value. Among the tasks that usually must be carried out on this data are several numerical corrections and calibrations: geometric, digital elevation, atmospheric, radiometric, etc. Moreover, depending on the end-user application, data may need to be filtered, compressed, transmitted, fused, classified, interpolated, and so on. The problem is even more complex when we consider the variety of sensors and satellites that have been designed and launched: a large diversity that includes passive and active sensors, and panchromatic, multispectral and hyperspectral sensors, with spatial resolutions ranging from a couple of centimeters to several kilometers, to mention a few examples. In summary, different methodologies and techniques for data processing must be designed and customized according not only to the specific application but also to the sensor and satellite characteristics.


We do not intend this book to cover all aspects of remote sensing techniques and platforms, since that would be an impossible task for a single volume. Instead, we have collected a number of high-quality, original and representative contributions in those areas. The first part of the book is devoted to new methodologies and techniques for data processing in remote sensing. The reader will find interesting contributions in forest characterization, data fusion, statistical properties of surface slopes, multichannel and Markovian classification, road feature extraction, miscalibration correction, barometric pressure measurements, wireless sensor networks and lineament visualization. The second part of the book gathers chapters related to new sensors and platforms for remote sensing, including the new COMS satellite, hyperspectral remote sensing, mobile LIDAR for atmospheric remote sensing, SNR improvements in LIDAR, a smart station for data reception, terahertz radiation propagation, HF data transmission from very remote sensors, hardware image compression, wireless communications for underground sensors, and cold gas propulsion for remote sensing satellites.

I wish to express my deepest gratitude to all the authors who have contributed to this book. Without their strong commitment this book would not have become such a valuable source of information. I am also thankful to the InTech editorial team, which has provided the opportunity to publish this book.

> **Boris Escalante-Ramírez**  National Autonomous University of México, Faculty of Engineering, Mexico City, Mexico

## **Section 1**

**Analysis Techniques** 


## **Characterizing Forest Structure by Means of Remote Sensing: A Review**

Hooman Latifi

*Dept. of Remote Sensing and Landscape Information Systems, University of Freiburg Germany*

## **1. Introduction**

### **1.1 Forest structural attributes**

Forest management comprises a wide range of planning stages and activities, which vary considerably according to the goals and strategies being pursued. Furthermore, those activities often require a description of the condition and dynamics of forests (Koch et al., 2009). Forest ecosystems are often described by a set of general characteristics including composition, function, and structure (Franklin, 1986). Composition is described by the presence or dominance of woody species or by relative indices of biodiversity. Forest functional characteristics relate to the types and rates of processes such as carbon sequestration. In addition, the physical characteristics of forests must be expressed; this description is often accomplished under the general concept of forest structure. All of the above-mentioned characteristics are required for timber management/procurement practices, as well as for mapping forests into smaller units or compartments.

The definition by Oliver & Larson (1996) can be regarded as one of the most basic, in which forest structure is defined as 'the physical and temporal distribution of trees in a forest stand'. This definition encompasses a set of indicators including species distribution, vertical and horizontal spatial patterns, tree size, tree age and combinations thereof. A more geometrical representation of the forest stand had been presented earlier by e.g. Franklin (1986) and later by Kimmins (1996), who defined stand structure as the vertical and horizontal association of stand elements. Despite their differences, these definitions were later used as the basis for further representative structural indicators, which are mainly derived from metrics such as diameter at breast height (DBH); the reason is that its measurement in terrestrial surveys is straightforward and approximately unbiased (Stone & Porter, 1998). Interest in applying geometric derivations, e.g. standing volume and aboveground biomass, came later, thanks to progress in computational facilities and simulation techniques, and those attributes remain of great importance for describing forest stand structure. Nevertheless, McElhinny et al. (2005) stated that the structural, functional and compositional attributes of a stand are highly interdependent and thus cannot easily be divided into such main categories, since attributes from either group can be considered alternatives to each other. A new categorization was therefore created, in which the structural attributes form a group comprising measures such as abundance (e.g. dead wood volume), size variation (e.g. variation in DBH)

and spatial variation (e.g. variation of the distance to the nearest neighbour) (Table 1) (McElhinny et al., 2005).

Though canopy cover, i.e. the vertical projection of tree crowns, is often referred to as an attribute characterizing the distribution of forest biomass, there are further attributes, such as basal area, standing timber volume and the height of the overstorey, which are considered more representative descriptors of forest biomass. Moreover, a combination of those attributes (especially in accordance with species composition) is also reported by e.g. Davey (1984) to represent the biomass and vertical complexity of the stands.

| Forest stand element | Structural attribute |
| --- | --- |
| Foliage | Foliage height diversity; foliage density within different strata; number of strata |
| Canopy cover | Canopy cover; average gap size and the proportion of canopy in gaps; gap size classes; proportion of crowns with dead and broken tops |
| Tree diameter | Diameter at breast height (DBH); standard deviation of DBH; diameter distribution; number of large trees |
| Tree height | Height of overstorey; standard deviation of tree height; height class richness |
| Tree spacing | Clark-Evans and Cox indices; percentage of trees in clusters |
| Stand biomass | Stand basal area; stem count per ha; standing volume; biomass |
| Tree species | Species diversity and/or richness; relative abundance of key species |
| Understorey vegetation | Shrub height; shrub cover; total understorey cover; understorey richness; saplings (shade tolerant) per ha |
| Dead wood | Number, volume or basal area of stags; volume of coarse woody debris; log volume by decay or diameter classes; coefficient of variation of log density |

Table 1. Broadly-investigated forest structural attributes, grouped under the stand element under description (after McElhinny et al., 2005).

In addition, stem count has also been reported as an important indicator of e.g. felled logs or trees with hollows, since they offer potential habitats for wildlife ((Acker et al., 1998), (McElhinny et al., 2005)). Thus, the frequency of larger stems is considered more significant as a descriptor of stand structure, as it can mainly characterize the older and

mature stems within the overstory of the stands. This attribute (stem count of older trees) has already been studied by e.g. Van Den Meersschaut & Vandekerkhove (1998) as a structural feature to distinguish old-growth stands from early stages of succession. Although some studies combined stem count with measures of diameter distribution, e.g. (Tyrrell & Crow, 1994), others, e.g. (Uuttera et al., 1997), did not find diameter distribution essentially helpful for describing forest structure, since comparing the diameter distributions from different stands involves some degree of sophistication.

All in all, the structural features of forest stands, as stated above, are considered useful for describing the horizontal and vertical complexity of forested areas. However, only a relatively limited number of those attributes have been modelled by means of remote sensing. Only a few studies have focused on other spatially meaningful characteristics such as gaps or coarse woody debris, e.g. (Pesonen et al., 2008), and those have been almost entirely conducted across Scandinavian boreal forests, where the homogeneous composition, single-storey stands (consisting mainly of coniferous species) and topographically gentle landscape minimise the problems of characterizing more complex descriptors of forest structure.

Since earth observation data were first applied to forestry, the majority of modelling tasks have focused on standing timber volume, stand height, aboveground biomass (AGB), stem count, and diameter distribution as structural attributes. Whereas some compositional characteristics such as species richness/abundance have also been considered forest structural attributes (Table 1), this article will not review their related literature, as they follow, within the scope of remote sensing, entirely different methodological strategies and thus require separate review studies concentrating on pixel-based analysis and spectrometry.

Estimation of AGB in forests is obviously of great importance. The rationale is straightforward: as the available stocks of fossil fuels gradually diminish and the environmental effects of climate change increasingly emerge, a wide range of stakeholders, including the political, economic and industrial sectors, endeavour to adjust to the consequences and adapt the existing energy supply to the ongoing developments. To this aim, a vital step is the assessment of potential renewable energy sources such as biomass. Germany can be referred to as an example, in which approximately 17 million ha of farmland and 11 million ha of forest are reported to be potentially available as bioenergy sources (BMU, 2009). Moreover, according to the results of the German National Forest Inventory, around 1.0 to 1.5 percent of the country's primary energy demand (20 to 25 million *m*<sup>3</sup>) in 2006 was supplied by timber products. Current models even confirm that an additional 12 to 19 million *m*<sup>3</sup> *year*−<sup>1</sup> of timber can be sustainably used for energy production. This in turn justifies the necessity of an efficient monitoring system for assessing potential biomass resources at regional and local levels.

#### **1.2 Remote sensing for retrieval of forest attributes**

In recent years the general interest in forests and environment-related issues has increased considerably. This, together with ongoing technological developments such as improved data acquisition and computing techniques, has fostered progress in forest monitoring, where the assessment of environmental processes can now be carried out by means of advanced methods such as intensive modelling and simulations

Fig. 1. An example of a false colour composite from Colour Infrared (CIR) aerial images (left) and a normalized first-pulse LiDAR point cloud (right), demonstrating a circular forest inventory plot (452.4 *m*<sup>2</sup>) in a test site in Karlsruhe, Germany.

(Guo, 2005). As described above, assessment and mapping of forest attributes have followed a similar progress as an essential prerequisite for forest management practices.

Information within each forest management unit (e.g. sample plots or segments characterising forest stands) often includes attributes obtained by direct measurement (e.g. field-based surveys) and indirect measurement (e.g. mathematical derivations and modelled/simulated data). A detailed ground-based survey of each unit is reported by e.g. LeMay & Temesgen (2005) to be impractical, particularly in large-area surveys with limited financial resources, or in the inventory of small areas under private ownership, for which regular plot-based surveys are usually financially problematic. However, plot-based inventory data are considered essential, both as representations of the current forest inventory and as model inputs to project future conditions. One approach to overcoming these limitations of regular terrestrial surveys is to combine field measurements with airborne and spaceborne remotely sensed data to retrieve the required information. This in turn allows the field data, which represent detailed information on the ground, to be complemented by data that carry the spatial, spectral and temporal merits of satellite or airborne sensors (Figure 1).
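The combination described above is commonly operationalized as a statistical model that links field-measured plot attributes to metrics extracted from the remotely sensed data over the same plots; the fitted model is then applied to units that have sensor coverage but no field visit. The following is a minimal sketch of that idea, not taken from this chapter: all plot values, metric names and coefficients are hypothetical, and a simple least-squares fit stands in for the more elaborate estimators reviewed later.

```python
# Illustrative sketch (hypothetical data): predicting a plot-level forest
# attribute from airborne metrics, combining field surveys with remote sensing.
import numpy as np

# Field-measured standing volume (m^3/ha) for six inventory plots
volume = np.array([120.0, 210.0, 340.0, 95.0, 280.0, 180.0])

# LiDAR-derived predictors per plot: mean canopy height (m), canopy cover fraction
X = np.array([
    [12.1, 0.55],
    [18.4, 0.72],
    [24.9, 0.88],
    [10.2, 0.48],
    [21.7, 0.81],
    [16.0, 0.66],
])

# Ordinary least squares with an intercept column
A = np.column_stack([np.ones(len(X)), X])
coef, *_ = np.linalg.lstsq(A, volume, rcond=None)

# Predict volume for a unit with LiDAR coverage but no field measurement
new_plot = np.array([1.0, 19.0, 0.75])
predicted = float(new_plot @ coef)
print(round(predicted, 1))
```

In practice the predictor set would be richer (height percentiles, density metrics) and the estimator would be chosen to match the inventory design, but the plot-to-pixel linkage is the same.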

Based on these potentially cost-effective implications, a range of applications has been developed that enables one to pursue different natural resource planning objectives, including the retrieval of forest structural attributes. Amongst the most important international forest mapping projects using earth observation data, GMES (Global Monitoring for Environment and Security), TREES (Tropical Ecosystem Environment Observation by Satellite) and FRA (Forest Resource Assessment) can be highlighted (Koch, 2010). Depending on the specific application, the required level of detail and especially the required accuracy of the output information, a variety of remotely sensed data sources can potentially be applied, including a wide range of optical data (broadband multispectral and narrowband hyperspectral imagery), Radio Detection and Ranging (RADAR) and, recently, Light Detection and Ranging (LiDAR) data. Each of those data sources has proven potential and advantages for forestry applications. Whereas LiDAR instruments facilitate collecting detailed information that accurately captures the three-dimensional structure of the earth's surface, RADAR data enable one to overcome the common atmospheric and shadow effects that often occur in forested areas. Broadband optical data reflect the general spectral responses of natural and man-made objects, including vegetation cover, over a large scene, while imaging spectroscopy data have been shown to provide a rich source of spectral information for various applications, e.g. tree species classification.

Compared to other sources of data, LiDAR data have been successfully validated for studying the structure of forested areas. Laser altimetry is an active remote sensing technology that determines ranges by taking the product of the speed of light and the time required for an emitted laser pulse to travel to a target object. The elapsed time between the emission of a laser pulse from a sensor and its interception of an object can be measured using either pulsed ranging (where the travel time of a laser pulse from the sensor to the target object is recorded) or continuous wave ranging (where the phase change in a transmitted sinusoidal signal produced by a continuously emitting laser is converted into travel time) (Wehr & Lohr, 1999). LiDAR is capable of providing both horizontal and vertical information. The quality of the sampling depends on the type of LiDAR system used, in particular on whether it is a discrete-return or full-waveform LiDAR system (Lim et al., 2003).
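
The pulsed-ranging principle above reduces to a one-line computation: with *c* the speed of light and *t* the measured round-trip travel time of the pulse, the sensor-to-target range is *R* = *c·t*/2. A minimal sketch (the pulse travel times are hypothetical values chosen for illustration):

```python
# Pulsed-ranging sketch: convert measured round-trip travel times of laser
# pulses into sensor-to-target ranges (R = c * t / 2).
C = 299_792_458.0  # speed of light in vacuum, m/s

def pulse_range(round_trip_time_s: float) -> float:
    """Range in metres from a round-trip pulse travel time in seconds."""
    return C * round_trip_time_s / 2.0

# Hypothetical return times for a sensor flying roughly 1000 m above ground:
# the first return (canopy top) arrives sooner than the last return (ground).
first_return = pulse_range(6.53e-6)   # ~978.8 m -> canopy top
last_return = pulse_range(6.67e-6)    # ~999.8 m -> ground
canopy_height = last_return - first_return
print(f"canopy height = {canopy_height:.1f} m")
```

The difference between first and last returns of the same pulse is exactly the kind of vertical information a discrete-return system exploits for canopy height.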

(Guo, 2005). As described above, assessment and mapping of forest attributes have followed a similar progression, as an essential prerequisite for forest management practices.

#### **1.3 Modelling issues**

When the aim is to assess forest attributes by means of remote sensing data, one may again note the importance of estimating forest biomass. (Koch, 2010) states that forest height, forest closure and forest type are the three most meaningful descriptors for AGB. Remote sensing-derived information from the above-mentioned sources enables one to assess those three factors, which can in turn yield reasonable estimates of forest AGB. Using such auxiliary data as descriptors of forest structure (e.g. AGB), statistical methods are applied to model forest stand attributes at different scales, including the regional, stand and individual-tree levels. So far, the modelling process has mostly been accomplished by means of parametric regression modelling of the response attributes.

Parametric models generally come with strong distributional assumptions for the parameters and variables, which are sometimes not met by the data. The application of those models is normally subject to scientific, technological and logistic conditions that constrain their use in many cases (Cabaravdic, 2007). A parametric fit can yield highly biased models resulting from possible misspecification of the unknown density function (e.g. (Härdle, 1990)). Nevertheless, those modelling procedures have been widely used for building models of forest stand and single-tree attributes in several studies (e.g. (Næsset, 2002), (Breidenbach et al., 2008), (Korhonen et al., 2008), and (Straub et al., 2009)).
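
As a concrete, synthetic illustration of the parametric approach, the sketch below fits an ordinary least squares model relating a single spectral predictor to stand volume. All numbers are invented for illustration; real studies would use measured plot volumes and image-derived predictors.

```python
import numpy as np

# Synthetic illustration of parametric (OLS) modelling of a forest attribute:
# stand volume (m^3/ha) regressed on one NDVI-like spectral index.
rng = np.random.default_rng(42)
ndvi = rng.uniform(0.3, 0.9, size=50)                    # predictor (invented)
volume = 50.0 + 400.0 * ndvi + rng.normal(0, 20, 50)     # response with noise

# Design matrix with an intercept column; solve min ||X b - y||^2.
X = np.column_stack([np.ones_like(ndvi), ndvi])
beta, *_ = np.linalg.lstsq(X, volume, rcond=None)

fitted = X @ beta
ss_res = np.sum((volume - fitted) ** 2)
ss_tot = np.sum((volume - volume.mean()) ** 2)
r2 = 1.0 - ss_res / ss_tot
print(f"intercept={beta[0]:.1f}, slope={beta[1]:.1f}, R^2={r2:.2f}")
```

The distributional caveat in the paragraph above applies directly: the least-squares fit is only efficient and unbiased under assumptions (e.g. homoscedastic, uncorrelated errors) that forest data often violate.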

In contrast, the so-called "nonparametric methods" allow for more flexibility in handling the unknown regression relationships. (Härdle, 1990) and (Härdle et al., 2004) discuss four main motivations for nonparametric models: 1) they provide flexibility to explore the relationships between the predictor and response variables; 2) they enable predictions that do not depend on a fixed parametric model; 3) they can help to find false observations by studying the influence of isolated points; and 4) they serve as versatile methods for imputing missing values or interpolating between neighbouring predictor values. However, they require larger sample sizes than their parametric counterparts, as the underlying data in a nonparametric approach simultaneously serve as the model input.

The nonparametric methods include a wide range of model-fitting approaches such as smoothing methods (e.g. kernel smoothing, k-nearest neighbour, splines and orthogonal series estimators), Generalized Additive Models (GAMs) and models based on classification and regression trees (CARTs). The k-nearest neighbour (k-NN) method is among the most widely applied nonparametric methods. In the k-NN method, the value of the response variable(s) of interest for a specific target unit is modelled as a weighted average of the values of the most similar observation(s) in its neighbourhood. The neighbour(s) are defined within an n-dimensional feature space consisting of potentially relevant predictor variables, and are selected based on a criterion that quantifies the *similarity* to previously measured observations in a database (Maltamo & Eerikäinen, 2001). In the context of forest inventory, the k-NN method was first introduced in the late 1980s (Kilkki & Päivinen, 1987), applied later for the prediction of standing timber volume by e.g. (Tomppo, 1993), and subsequently examined in a handful of studies to predict forest stand and individual-tree attributes. As stated by e.g. (Haapanen et al., 2004), the k-NN method has been further developed for modelling forest variables and is now operational in Scandinavian countries, e.g. in the Finnish National Forest Inventory (NFI). It was also integrated as part of the Forest Inventory and Analysis (FIA) program in the United States (see (McRoberts & Tomppo, 2007)). The method couples field-based inventory and auxiliary data (e.g. from remote sensing sources) to produce digital layers of measured forest or land use attributes (Haapanen et al., 2004). Following the promising results achieved in Scandinavian landscapes by applying nonparametric methods to the prediction/classification of continuous and categorical forest attributes by means of remotely sensed data, the method has recently received a great deal of attention in other parts of the world, e.g. in central Europe (Latifi et al., 2011), as it could potentially be integrated as a cost-effective alternative within regional and national forest inventories.
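
The k-NN prediction rule just described can be sketched in a few lines: the response of a target unit is the inverse-distance-weighted average of the k most similar reference units in predictor space. The reference values, predictors and choice of k below are invented for illustration.

```python
import numpy as np

def knn_predict(X_ref, y_ref, x_target, k=3):
    """Predict a response for one target unit as the inverse-distance-weighted
    average of its k nearest reference units in predictor space."""
    d = np.linalg.norm(X_ref - x_target, axis=1)   # Euclidean distances
    nn = np.argsort(d)[:k]                         # indices of the k neighbours
    w = 1.0 / (d[nn] + 1e-9)                       # inverse-distance weights
    return np.sum(w * y_ref[nn]) / np.sum(w)

# Toy reference set: two remote-sensing predictors (e.g. a LiDAR height metric
# and a spectral index) against measured stand volume. Values are invented.
X_ref = np.array([[20.0, 0.6], [25.0, 0.7], [10.0, 0.4], [30.0, 0.8]])
y_ref = np.array([250.0, 310.0, 120.0, 380.0])

pred = knn_predict(X_ref, y_ref, np.array([24.0, 0.68]), k=3)
print(pred)
```

Swapping the Euclidean norm for a Mahalanobis or other distance, and varying k, reproduces exactly the configuration choices discussed in Section 1.3.2 below.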

Apart from forest inventories conducted at larger scales, the k-NN method has been applied in the context of so-called small-scale forest inventory, in which the accurate and unbiased inventory of small datasets is of major interest. The term 'small area' commonly denotes a small geographical area, but may also describe a small domain, i.e. a small subpopulation within a large geographical area (Ghosh & Rao, 1994). Sample survey data of a small area or subpopulation can be used to derive reliable estimates of totals and means for large areas or domains. However, the usual direct survey estimators based on the sampled data are likely to return erroneous outcomes owing to the improperly small sample size. This is particularly critical in regional forest inventories, where the sample size is typically small because, e.g., the overall sample size of a survey is commonly chosen to provide a specific accuracy at a much higher level of aggregation than that of small areas. In the central European forestry context, the small-area domain is of fundamental importance, since multiple forest ownership systems are historically well-established and still frequently occur. This variation produces, in turn, forest areas with differing requirements in terms of financial and technological resources for forest inventory. In such situations, regular terrestrial surveys incur high expenses (Stoffels, 2009), and the integration of remote sensing and modelling is thus motivated by cost reduction. For example, aerial survey with large-footprint ALS flights is reported to cost about 1 Euro per ha in Germany (Nothdurft et al., 2009). An effective forest inventory strategy should therefore focus on the inventory of such small forest datasets using all available infrastructure and attainable technological means. The goal should be to produce reliable (i.e. sufficiently accurate), general (i.e. reproducible) and (approximately) unbiased models of prominent forest attributes that support an up-to-date and continuous information database within the bigger framework of a periodical state-wide forest inventory system.

However, several issues must be taken into consideration before a remote sensing-supported modelling task of forest attributes can commence. These include:

#### **1.3.1 Data combination issues**

Remote sensing data provide a valuable source of information for the forest modelling process. The advanced use of 2D and 3D data in both single-tree and area-based approaches to attribute retrieval offers valuable potential to characterize the (inherently) 3D structure of forest stands (particularly vertical structure such as mean or top height). The data combination is specific to the objectives set within the case study, as well as to the level of detail required by the analyst. As such, different data including broadband optical (both medium and high spatial resolution), hyperspectral, LiDAR (height as well as intensity), and RADAR data can be combined or fused to reach those goals.
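
Schematically, such a combination amounts to stacking per-plot metrics from the different sensors into one predictor matrix before modelling. The layer names and values below are hypothetical, not from any reviewed study:

```python
import numpy as np

# Schematic data fusion: stack per-plot metrics from different sensors into a
# single predictor matrix for subsequent modelling. All values are invented.
lidar_height = np.array([[18.2], [24.5], [9.7]])      # e.g. mean canopy height (m)
lidar_intensity = np.array([[42.0], [55.0], [30.0]])  # e.g. mean return intensity
optical = np.array([[0.61, 0.33],                     # e.g. two spectral indices
                    [0.72, 0.41],
                    [0.38, 0.22]])

# One row per sample plot, one column per predictor from any sensor.
X = np.hstack([lidar_height, lidar_intensity, optical])
print(X.shape)  # (3, 4): 3 plots, 4 fused predictors
```

The resulting matrix is the common input format for both the parametric and k-NN approaches discussed above; which columns to keep is then a variable-screening question (Section 1.3.3).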

mainly employed to differentiate amongst e.g. rough biomass classes which show clear distinctions. For example, Simple linear, multiple, and nonlinear regression models were tested by (Rahman et al., 2007) to classify different levels of forest succession in such as primary and secondary forests, where optical band reflectance and vegetation indices from Enhanced Thematic Mapper (ETM+) data were used as predictors. The use of dummy variables was reported to improve the accuracy of forest attribute estimation by ca. 0.3 of *R*<sup>2</sup> (best *R*<sup>2</sup> = 0.542 with 10-13 dummy predictors). In an earlier attempt in central Europe, (Vohland et al., 2007) performed parametric classification for a German test site based on a TM image, where 8 forest types were identified with an overall accuracy of 87.5 %. The Linear Spectral Mixture Analysis (endmember method) was also used to predict stem count, in that the fractions extracted from the spectra were linearly regressed with stem count as response variable. This different approach was also reported to introduce an improved calibration of large-scale forest attribute assessment. Although using parametric approaches, the methodology was (truly) stated to be also helpful in case of using nonparametric approaches. Regarding the observed linear correlations between the response variable of interest (stem count) and spectral indices, this assertion seems to be realistic. The usefulness of Landsat-derived features to model forest attributes (species richness and biodiversity indices) has also been discussed and confirmed by (Mohammadi & Shataee, 2010), in which they reported some positive potentials of multiple regressions (adjusted *R*2=0.59 for richness and

Characterizing Forest Structure by Means of Remote Sensing: A Review 11

*R*2=0.459 for reciprocal of simpson index) in temperate forests of northern Iran.

Attempts toward establishing correlations amongst regional-scale multispectral remote sensing and forest structural attributes in larger scale dates back to some early attempts in the early 1990's, amongst which e.g. (Iverson et al., 1994) can be highlighted. Their empirical regressions between percent forest cover and Advanced Very High Resolution Radiometer (AVHRR) spectral signatures was used based on Landsat-scale smaller calibration centres. Extrapolating forest cover for much bigger scales (state-scale) using AVHRR data resulted in high correlations (*r*=0.89 to 0.96) between county cover estimates. Those attempts to produce large-scale maps of forest attributes continued up to some later studies e.g. (Muukkonen & Heiskanen, 2007) and (Päivinen et al., 2009). Whereas regression modelling of AGB using Adavanced Spaceborne Thermal Emission and Radiometer (ASTER) and Moderate Resolution Imaging Spectrometer (MODIS) data was pursued in the former study (relative Root Mean Square Error (RMSE)% = 9.9), the latter used AVHRR pixel values which were applied to be regressed with the standing volume to produce European-scale growing stock maps. (Gebreslasie et al., 2010) can be noted as a very recent effort to parametrically model the forest structure in local scale, in which the visible and shortwave infrared ASTER features (original bands and vegetation indices) were investigated to build stepwise regressions of standing volume, basal area, stem count and tree height in *Eucalyptus* plantations. Whereas the spectral data was acknowledged to be an insufficient material to be solely used for modelling (*R*2= 0.51, 0.67, 0.65, and 0.52 for standing volume, basal area, stem count and tree height, respectively), integrating age and site index data as predictors showed to notably enhance the models by 42 %, 20.2%, 16.8%, and 42.2% of *R*2. 
The sole application of multispectral data, regardless of the scale within which the data have been used, seems not to fulfil the practical requirements for accurate regression modelling of forest attributes. Except some very few reports showing highly-correlated spectral indices with stem volume (approximate *R*2= 0.95 for multiple linear regression using SPOT and AVHRR data in provincial level reported by (Gonzalez-Alonso et al., 2006)), most of other reports state

#### **1.3.2 The configuration of models**

Depending on what modelling scheme is aimed to be used to retrieve the response forest attributes, a set of parameters are necessary to be set prior to modelling. These parameters can therefore greatly affect issues such as modelling errors and the retrieved values. In case of parametric regression, the underlying distribution of the data, the type of model in use (e.g. Ordinary Least Squares (OLS) or logarithmic models) and model parameters are crucial to be mentioned (see e.g.(Straub & Koch, 2011)). In nonparametric methods, issues like the selection of smoothing parameter for smoothing methods (e.g. (Wood, 2006)), size of neighbourhood for k-NN models, and number of trees per response variable for CART-based methods are necessary to be optimally set. Specifically in terms of k-NN models, the main difference amongst the various approaches is how the distance to the most similar element(s) is measured, which in turn depends on how the *similarity* is quantified within the feature space formed by the multiple predictors. This causes the main difference amongst the diverse distance measures which work based on k-NN approach including the well-known Euclidean and Mahalanobis distances. The neighbourhood size (known also as the number of NNs or k) can be set to any number from 1 to n (the total number of reference units). The single neighbour can, however, contribute to producing more realistic predictions in small datasets, while avoiding major prediction biases in cases where the responses follow skewed (or non-Gaussian) distributions (Hudak et al., 2008). However, one may note that using multiple neighbours would apparently yield more accurate results through averaging values from multiple response units.

#### **1.3.3 Screening the feature space of candidate predictors**

When dealing with datasets associated with numerous independent variables, one aim is to reduce the dimensionality of the feature space. Even though heuristic approaches may often be used to deal with highly-correlated variable sets, application of appropriate variable screening methods has recently become an important issue in modelling context. In variable screening, the main objective is to optimize the efficiency of models by achieving a certain performance level with maximum degree of freedom (Latifi et al., 2010). When building models in small scale geographical domains using several (and often strongly inter-correlated) remote sensing metrics, one would most probably come up with the question of how the most relevant information could be extracted from the enormous information content stored in the dataset. This is of major importance when the aim is to build parsimonious models being valid not only across the underlying region of parameterization, but also in further domains which show the (relatively) similar conditions. It also plays a crucial role in k-NN modelling approaches, since the majority of those methods lack an effective built-in scheme for feature space screening. The performances of different deterministic (e.g. forward, backward and stepwise selection methods) and stochastic (e.g. genetic algorithm) have been investigated in various studies available in the literature.

#### **2. Remote sensing for modelling forest structure**

#### **2.1 Forest attribute modelling using optical data**

Due to the lack of required 3D information for characterisation of vertical structure of forest stands, the pure use of multispectral optical remote sensing for forest structure has severe limitations. (Koch, 2010) addresses this issue and states that those data sources have been 8 Will-be-set-by-IN-TECH

Depending on what modelling scheme is aimed to be used to retrieve the response forest attributes, a set of parameters are necessary to be set prior to modelling. These parameters can therefore greatly affect issues such as modelling errors and the retrieved values. In case of parametric regression, the underlying distribution of the data, the type of model in use (e.g. Ordinary Least Squares (OLS) or logarithmic models) and model parameters are crucial to be mentioned (see e.g.(Straub & Koch, 2011)). In nonparametric methods, issues like the selection of smoothing parameter for smoothing methods (e.g. (Wood, 2006)), size of neighbourhood for k-NN models, and number of trees per response variable for CART-based methods are necessary to be optimally set. Specifically in terms of k-NN models, the main difference amongst the various approaches is how the distance to the most similar element(s) is measured, which in turn depends on how the *similarity* is quantified within the feature space formed by the multiple predictors. This causes the main difference amongst the diverse distance measures which work based on k-NN approach including the well-known Euclidean and Mahalanobis distances. The neighbourhood size (known also as the number of NNs or k) can be set to any number from 1 to n (the total number of reference units). The single neighbour can, however, contribute to producing more realistic predictions in small datasets, while avoiding major prediction biases in cases where the responses follow skewed (or non-Gaussian) distributions (Hudak et al., 2008). However, one may note that using multiple neighbours would apparently yield more accurate results through averaging values

When dealing with datasets associated with numerous independent variables, one aim is to reduce the dimensionality of the feature space. Even though heuristic approaches may often be used to deal with highly-correlated variable sets, application of appropriate variable screening methods has recently become an important issue in modelling context. In variable screening, the main objective is to optimize the efficiency of models by achieving a certain performance level with maximum degree of freedom (Latifi et al., 2010). When building models in small scale geographical domains using several (and often strongly inter-correlated) remote sensing metrics, one would most probably come up with the question of how the most relevant information could be extracted from the enormous information content stored in the dataset. This is of major importance when the aim is to build parsimonious models being valid not only across the underlying region of parameterization, but also in further domains which show the (relatively) similar conditions. It also plays a crucial role in k-NN modelling approaches, since the majority of those methods lack an effective built-in scheme for feature space screening. The performances of different deterministic (e.g. forward, backward and stepwise selection methods) and stochastic (e.g. genetic algorithm) have been investigated in

**2. Remote sensing for modelling forest structure**

**2.1 Forest attribute modelling using optical data**

Due to the lack of the 3D information required to characterise the vertical structure of forest stands, the pure use of multispectral optical remote sensing for forest structure has severe limitations. (Koch, 2010) addresses this issue and states that those data sources have been mainly employed to differentiate amongst e.g. rough biomass classes which show clear distinctions. For example, simple linear, multiple, and nonlinear regression models were tested by (Rahman et al., 2007) to classify different levels of forest succession, such as primary and secondary forests, where optical band reflectance and vegetation indices from Enhanced Thematic Mapper (ETM+) data were used as predictors. The use of dummy variables was reported to improve the accuracy of forest attribute estimation by ca. 0.3 of *R*<sup>2</sup> (best *R*<sup>2</sup> = 0.542 with 10-13 dummy predictors). In an earlier attempt in central Europe, (Vohland et al., 2007) performed parametric classification for a German test site based on a TM image, where 8 forest types were identified with an overall accuracy of 87.5 %. Linear Spectral Mixture Analysis (the endmember method) was also used to predict stem count, in that the fractions extracted from the spectra were linearly regressed with stem count as the response variable. This different approach was also reported to introduce an improved calibration of large-scale forest attribute assessment. Although parametric approaches were used, the methodology was stated to also be helpful in the case of nonparametric approaches. Given the observed linear correlations between the response variable of interest (stem count) and spectral indices, this assertion seems realistic. The usefulness of Landsat-derived features to model forest attributes (species richness and biodiversity indices) has also been discussed and confirmed by (Mohammadi & Shataee, 2010), who reported some positive potentials of multiple regressions (adjusted *R*<sup>2</sup>=0.59 for richness and *R*<sup>2</sup>=0.459 for the reciprocal of the Simpson index) in temperate forests of northern Iran.

Attempts toward establishing correlations between regional-scale multispectral remote sensing data and forest structural attributes date back to the early 1990s, amongst which e.g. (Iverson et al., 1994) can be highlighted. Their empirical regressions between percent forest cover and Advanced Very High Resolution Radiometer (AVHRR) spectral signatures were calibrated on smaller, Landsat-scale calibration centres. Extrapolating forest cover to much larger (state) scales using AVHRR data resulted in high correlations (*r*=0.89 to 0.96) between county cover estimates. Such attempts to produce large-scale maps of forest attributes continued in later studies, e.g. (Muukkonen & Heiskanen, 2007) and (Päivinen et al., 2009). Whereas regression modelling of AGB using Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) and Moderate Resolution Imaging Spectrometer (MODIS) data was pursued in the former study (relative Root Mean Square Error (RMSE)% = 9.9), the latter regressed AVHRR pixel values against standing volume to produce European-scale growing stock maps. (Gebreslasie et al., 2010) can be noted as a very recent effort to parametrically model forest structure at the local scale, in which visible and shortwave infrared ASTER features (original bands and vegetation indices) were investigated to build stepwise regressions of standing volume, basal area, stem count and tree height in *Eucalyptus* plantations. Whereas the spectral data were acknowledged to be insufficient for modelling on their own (*R*<sup>2</sup>= 0.51, 0.67, 0.65, and 0.52 for standing volume, basal area, stem count and tree height, respectively), integrating age and site index data as predictors was shown to notably enhance the models by 42 %, 20.2 %, 16.8 %, and 42.2 % of *R*<sup>2</sup>.
The sole application of multispectral data, regardless of the scale at which the data have been used, seems not to fulfil the practical requirements for accurate regression modelling of forest attributes. Except for a few reports showing highly-correlated spectral indices with stem volume (an approximate *R*<sup>2</sup>= 0.95 for multiple linear regression using SPOT and AVHRR data at the provincial level reported by (Gonzalez-Alonso et al., 2006)), most other reports state moderate correlations. However, the majority of the studies have acknowledged the potentials in using such spectral data for regression modelling of forest structural attributes.

Characterizing Forest Structure by Means of Remote Sensing: A Review 13

In the context of nonparametric methods, as documented earlier, the initial introduction of k-NN methods to the forestry context commenced in the late 1980s and early 1990s, as a number of preliminary studies were carried out in the Nordic region. The method was initially used only with field measurements (Tomppo, 1991) and was later adapted for the prediction of stem volume using spaceborne images. At that time, the most feasible satellite image data included Landsat Thematic Mapper (TM) and SPOT images, of which mainly TM and, to a minor extent, SPOT data were employed (Tomppo, 1993). The reported results confirmed the suitability of the method based on remote sensing data, and the method was further developed through various experiences. Further Finnish experiences with pure optical data include a range of studies in which the k-NN method was adapted to practical applications in the wood and timber industry. Amongst them, (Tommola et al., 1999) used the k-NN method as a tool for wood procurement planning to estimate the characteristics of cutting areas in Finland. They found it to be a useful tool compared to the traditional inventory method. (Tomppo et al., 2001) utilized the approach to estimate/classify growth, main tree species, and forest type by means of multispectral TM data in China. The authors found the method helpful in classifying tree types and stand ages, though the stand-level predictions were reported to underestimate the growing stock.

As mentioned above, k-NN estimators include a range of distance-weighting approaches such as conventional distances (Euclidean and Mahalanobis) and the Most Similar Neighbour (MSN) method. Due to the importance of those methods in the context of spatial modelling, a brief verbal explanation of those distance metrics is essential: in general, the distance between a target unit with a vector of predictor variables and any neighbouring unit with its own multi-dimensional vector of predictors can be measured by a distance function, in which a weight matrix plays the central role of weighting the predictors according to their predictive power. Whereas this weight matrix is a multi-dimensional identity matrix in the Euclidean distance, or the inverse of the covariance matrix of the predictor variables in the Mahalanobis distance, the MSN inference uses canonical correlation analysis to produce a weighting matrix used to select neighbours from the reference units. That is, according to (Crookston et al., 2002), the weight matrix is filled with the linear product of the squared canonical coefficients and their canonical correlation coefficients. The MSN method was described by e.g. (Maltamo & Eerikäinen, 2001) as closely related to the basic k-NN based on Euclidean distance, the main difference being that the coefficients of the variables in the distance function are searched using canonical correlations in MSN. Thus, one should bear in mind that a linear correlation between response(s) and predictor(s) can play a key role in the MSN method. The majority of attempts to construct MSN models of forest structure made use of 3D LiDAR data, either alone or in combination with spectral metrics. Therefore, the literature on MSN modelling will be reviewed further in the LiDAR section.
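These variants differ only in the weight matrix W of the generalized distance d(u, v)² = (u − v)ᵀ W (u − v). The sketch below, on synthetic data, shows the Euclidean (identity) and Mahalanobis (inverse covariance) cases; the MSN weight matrix, which would be derived from a canonical correlation analysis of responses against predictors, is only noted in a comment and not reproduced:

```python
# Sketch of the generalized distance d(u, v)^2 = (u - v)^T W (u - v)
# on synthetic data. Only the weight matrix W changes between the
# k-NN variants: identity (Euclidean) or inverse covariance of the
# predictors (Mahalanobis); MSN would instead fill W from a
# canonical correlation analysis of responses against predictors
# (cf. Crookston et al., 2002), which is not implemented here.
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 3))                 # reference units x predictors

def weighted_dist(u, v, W):
    d = u - v
    return float(np.sqrt(d @ W @ d))

W_euclid = np.eye(3)                               # Euclidean weights
W_mahal = np.linalg.inv(np.cov(X, rowvar=False))   # Mahalanobis weights

u, v = X[0], X[1]
print(weighted_dist(u, v, W_euclid))         # plain Euclidean distance
print(weighted_dist(u, v, W_mahal))          # scale/correlation-adjusted distance
```

Because the Mahalanobis weights rescale correlated predictors, the two distances generally rank neighbours differently, which is precisely where the choice of metric matters for imputation.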

To the best of the author's knowledge, efforts to bring the analytical features of the k-NN method to the US NFI system (called Forest Inventory and Analysis, FIA) were accomplished by studies such as (Franco-Lopez et al., 2001), who used the method to simultaneously predict basal area, volume and cover types based on FIA field inventory data and TM features. They rightly mentioned a common small-sample problem (i.e. the critical performance of k-NN methods in the case of small datasets) and acknowledged that "The key to success is the access to (enough) ground samples to cover all variations in tree size and stand density for each cover type".


(Katila, 2002) integrated TM and forest inventory data to model forest parameters including landuse classes. The results were verified using Leave-One-Out (LOO) cross validation (Efron & Tibshirani, 1993) at the pixel level. The method was assessed to be statistically straightforward compared to conventional landcover estimation. (Holmström, 2002) used a set of panchromatic aerial photos and field-based information from 255 circular sample plots measured within the boreal forests of Sweden. Stem volume and age were modelled and validated, with prediction errors (*RMSE*) of 14 % and 17 % for volume and age of the trees, respectively. The k-NN method was thus proposed for stand-level applications. However, they highlighted the importance of sufficient and representative reference material and the considerations in selecting the number of neighbours in small datasets as potential drawbacks.
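The LOO verification used in such studies can be sketched as follows: each reference unit is predicted from all remaining units, and the errors are pooled into an RMSE. The data below are synthetic (a hypothetical spectral predictor and a volume-like response):

```python
# Sketch of leave-one-out (LOO) cross-validation for a k-NN
# prediction on synthetic data: unit i is predicted from all
# other units, and the errors are pooled into an RMSE.
import numpy as np

rng = np.random.default_rng(7)
x = rng.uniform(0.0, 1.0, 80)                         # predictor
y = 100.0 * x + rng.normal(scale=5.0, size=80)        # response

def loo_rmse(x, y, k):
    errors = []
    for i in range(len(y)):
        mask = np.arange(len(y)) != i                 # leave unit i out
        nn = np.argsort(np.abs(x[mask] - x[i]))[:k]
        errors.append(y[mask][nn].mean() - y[i])
    return float(np.sqrt(np.mean(np.square(errors))))

print(round(loo_rmse(x, y, 3), 2))                    # pooled LOO error
```

Because every unit serves once as a validation case, LOO uses small reference datasets exhaustively, which is why it recurs throughout the k-NN literature reviewed here.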

The application of RADAR data in forest assessments has been reported to be associated with some major constraints due to signal saturation (Imhoff, 1995), which can also occur in optical images when the forest canopy is fully closed (Holmström & Fransson, 2003). However, RADAR reflectance has been reported to be linearly related to standwise stem volume (Fransson et al., 2000). Therefore, multispectral data have been combined, though in relatively few studies, with active data from RADAR platforms for the retrieval of forest attributes. For example, (Holmström & Fransson, 2003) tested the fusion of optical SPOT-4 and airborne CARABAS-II VHF Synthetic Aperture RADAR (SAR) datasets to estimate forest variables in Spruce/Pine stands. The single use of each dataset was compared to the combined use, and the combined data were, as expected, assessed to surpass the single datasets for modelling stem volume and age (*RMSE*=37 *m*3*ha*−<sup>1</sup> for the combined set compared to *RMSE*=50 *m*3*ha*−<sup>1</sup> for the best single-data models). The relationship between the reference target units was reported to be "substantially strengthened" when using the two data sources in combination. Later on, (Thessler et al., 2008) investigated the joint application of multispectral and RADAR data in an alternative workflow to the one explained above, in which they applied TM-derived features combined with predictors extracted from the Digital Elevation Model (DEM) of shuttle RADAR data to classify tropical forest types in Costa Rica. Some cover type classes were consequently merged to aggregate the classes and improve the results, which led to an overall accuracy of 91 % from the segmented image data based on k-NN classification. (Treuhaft et al., 2003) combined C-band SAR interferometry with the Leaf Area Index (LAI) extracted from hyperspectral data to estimate AGB. They introduced their resulting 'forest canopy leaf area density' as a representative for the AGB of forest.

Though the conventional k-NN models of stand-scale forest attributes have been positively supported in the studies like those mentioned above, some other studies e.g. (Finley et al., 2003) acknowledge that the analysts may face the challenge of compromising between increased mapping efficiency and a loss of information accuracy. This is particularly the case when dealing with the question of selecting the optimal number of neighbours (also known as *k*). Different neighbourhood sizes have been studies in several works ((Franco-Lopez et al., 2001), (Haapanen et al., 2004),(Holmström & Fransson, 2003), (Packalén & Maltamo, 2006), (Packalén & Maltamo, 2007), (Finley & McRoberts, 2008) and (Vauhkonen et al., 2010)), in some of which the optimum number of *k* were discussed ((Franco-Lopez et al., 2001), (Haapanen et al., 2004), (Finley & McRoberts, 2008)). Whereas the above- mentioned studies reported an improved accuracy of k-NN predictions along with the increment of *k* (up to a

selected test sites in Finland and Italy. Despite the moderate accuracy obtained out of the sole analysis of spectral data (e.g. max. Kappa statistics of approximately 0.65 and relatively higher Kappa values of species dominance compared to soil fertility), this study highlighted the importance of how an efficient strategy for feature space screening can contribute to reducing the prediction errors in k-NN models. Whereas the majority of pearlier studies used deterministic approaches (e.g. stepwise methods) to prune the candidate predictors, this study (which followed an earlier attempt by (Tomppo & Halme, 2004) used an evolutionary Genetic Algorithm (GA) to screen the feature space which reduced the modelling errors in slight rates. The idea of using GA was further applied for a number of LiDAR-supported

Characterizing Forest Structure by Means of Remote Sensing: A Review 15

Height information from airborne laser scanner data has been validated to provide the most accurate input data related to the topography of land surface as well as to the structure of forested areas. Whereas (Lim et al., 2003), (Hyyppä et al., 2008) and (Koch, 2010) provide comprehensive reviews on the background and history of LiDAR data application in forest inventories, this section focuses on the methodological background concerning pure

LiDAR instruments include three main categories of profiling, discrete return, and waveform devices. Profiling devices record one return at low densities along a narrow swath (Evans et al., 2009) and were mainly used in the earlier studies such as (Nelson et al., 1988). Later, discrete-return (Pulse form) laser scanners enabled to use LiDAR in remote sensing where scanning over large areas was needed (Næsset, 2004). Such devices collect multiple returns (often three to five returns) based on intensity of the emitted laser energy from the earth surface. In terms of waveform data, the devices digitize the total amount of emitted energy in intervals and therefore are able to characterize the distribution of emitted laser from the objects. Although small footprint waveform sensors are most commonly available, they are reported to be computationally intensive and thus associated with restrictions when used in fine-scale (i.e. high resolution) environmental applications (Evans et al., 2009). They provide data featuring high point densities and enable one to broader representation of the surface and forest canopy. The importance of using pulse form data for studies concerning forest structure

LiDAR data can be used in two main approaches to retrieve forest structural attributes. In "area-based methods", the statistical metrics and other nonphysical distribution-related features of LiDAR height measurements are extracted either from the laser point clouds or from a rasterized representation of laser hits. They are then used to predict forest attributes e.g. mean tree height, mean DBH, basal area, volume and AGB at an area-level such as the plot or stand level (Yu et al., 2010). This method enables one to retrieve canopy height information by means of a relatively coarse resolution LiDAR data e.g. satellite or airborne data featuring <5 measurements per *m*<sup>2</sup> e.g. (Korhonen et al., 2008), (Jochem et al., 2011), though data with higher point density can also be used to derive the metrics at an aggregated level (e.g. (Maltamo, Eerikäinen, Packalén & Hyyppä, 2006) (Heurich & Thoma, 2008), (Straub et al., 2009) and (Latifi et al., 2010)). A key to success in area-based methods, when the metrics are extracted from a rasterized form of LiDAR data such as normalized Digital Surface Model (nDSM), has been stated to be the quality of extracted Digital Terrain Model (DTM) and Digital

forest modelling studies by e.g. (Latifi et al., 2010) and (Latifi et al., 2011).

is already stated in the relevant literature e.g. (Sexton et al., 2009).

**2.2 LiDAR-based models of forest structural attributes**

LiDAR-based models of forest structure.

Surface Model (DSM) (Hyyppä et al., 2008).

limited number varying amongst the studies), some acknowledge that increasing k leads to a stronger shift of the predictions towards the sample mean which could cause serious biases, particularly in cases where the distribution of observations is skewed ((Hudak et al., 2008), (Latifi et al., 2010)). However, the choice of neighbourhood size is an arbitrary issue in which the expertise of the analyst (e.g. the prior knowledge on the properties and variance of the population) plays a functional role. By using multiple *k* for imputation, the majority of studies carried out within the framework of FIA program in US (characterized by a cluster sampling design using 4 subplots in each cluster) have shown to yield relatively high accuracies. The study of (Haapanen et al., 2004) can be exemplified, in which three classes of forest, non-forest and water were classified by a conventional k-NN approach (Euclidean distance) and ETM+ features as predictors. They increased the neighbourhood size up to 10 neighbours, which caused an enhancement of overall accuracy up to the use of 4th neighbour, a sudden drop, and a consequent improvement up to *k*=8. The Majority of other studies in this realm have reported the improvement of accuracy along with increment in the neighbourhood size. Some studies noticed that the selection of other parameters such as weighting distances also depends on the choice of image dates and other associated data ((Franco-Lopez et al., 2001), (Finley & McRoberts, 2008)). (Mäkelä & Pekkarinen, 2004) made a relatively preliminary effort to use field data of stand volume from an inventoried area to make predictions in a neighbouring region which was considered to suffer from lack of field data. However, their poor accuracy yielded from the estimation led them to assess the method as an inappropriate one for stand level predictions. 
Yet, some of their best volume estimates were reported to be useful for the stands where no (or few) field information is available. In a study conducted in a central Europe, (Stümer, 2004) developed a k-NN application in Germany to model and map basal area (i.e. metric data) and deadwood (i.e. categorical data) using TM, hyperspectral, and field datasets as predictors. The best results showed the RMSE between 35 % and 67 % (for TM data) and 65 % and 67 % (for hyperspectral data). As for the deadwood, the accuracy ranged between 60 % and 73 % (for TM) and 60 % and 63 % (for hyperspectral). The two data sets were separately assessed, in which no combinations were tested.

Using various configurations of k-NN methods, (LeMay & Temesgen, 2005) compared some combinations (e.g. varying number of neighbours) to predict basal area and standing volume in Canadian forests. They reported MSN method (even in a single-neighbour setting) as the most accurate approach compared to the Euclidean distance models based on 3 neighbours. In a relatively similar study in Bosnian forests in Europe, (Cabaravdic, 2007) also achieved relatively accurate k-NN estimates of growing stock using TM-extracted features and a broad range of field survey information. In terms of the configuration, *k*=5 and Mahalanobis distance were assessed to be optimal for growing stock models. (Kutzer, 2008) tested the selected bands in visible and infrared domain of multispectral ASTER image together with a set of terrestrial data to differentiate the landuse types and the Non Wood Forest Products in Ghana. The results were assessed, though with some exceptions, to be promising for application as a practical forest monitoring tool within the study area.

The majority of forest-related studies using k-NN method have been conducted with the aim of modelling continuous attributes of forest structure, whereas little attention has been paid to predicting categorical forest variables such as site quality or vegetation type. One of the few attempts to introduce such new potentials to the remote sensing society was carried out by (Tomppo et al., 2009), in which TM-derived spectral features were used to predict site fertility, species dominance and coniferous/deciduous dominance as categorical responses across 12 Will-be-set-by-IN-TECH

limited number varying amongst the studies), some acknowledge that increasing k leads to a stronger shift of the predictions towards the sample mean which could cause serious biases, particularly in cases where the distribution of observations is skewed ((Hudak et al., 2008), (Latifi et al., 2010)). However, the choice of neighbourhood size is an arbitrary issue in which the expertise of the analyst (e.g. the prior knowledge on the properties and variance of the population) plays a functional role. By using multiple *k* for imputation, the majority of studies carried out within the framework of FIA program in US (characterized by a cluster sampling design using 4 subplots in each cluster) have shown to yield relatively high accuracies. The study of (Haapanen et al., 2004) can be exemplified, in which three classes of forest, non-forest and water were classified by a conventional k-NN approach (Euclidean distance) and ETM+ features as predictors. They increased the neighbourhood size up to 10 neighbours, which caused an enhancement of overall accuracy up to the use of 4th neighbour, a sudden drop, and a consequent improvement up to *k*=8. The Majority of other studies in this realm have reported the improvement of accuracy along with increment in the neighbourhood size. Some studies noticed that the selection of other parameters such as weighting distances also depends on the choice of image dates and other associated data ((Franco-Lopez et al., 2001), (Finley & McRoberts, 2008)). (Mäkelä & Pekkarinen, 2004) made a relatively preliminary effort to use field data of stand volume from an inventoried area to make predictions in a neighbouring region which was considered to suffer from lack of field data. However, their poor accuracy yielded from the estimation led them to assess the method as an inappropriate one for stand level predictions. 
Yet, some of their best volume estimates were reported to be useful for the stands where no (or few) field information is available. In a study conducted in a central Europe, (Stümer, 2004) developed a k-NN application in Germany to model and map basal area (i.e. metric data) and deadwood (i.e. categorical data) using TM, hyperspectral, and field datasets as predictors. The best results showed the RMSE between 35 % and 67 % (for TM data) and 65 % and 67 % (for hyperspectral data). As for the deadwood, the accuracy ranged between 60 % and 73 % (for TM) and 60 % and 63 % (for hyperspectral). The two data

sets were separately assessed, in which no combinations were tested.

practical forest monitoring tool within the study area.

Using various configurations of k-NN methods, (LeMay & Temesgen, 2005) compared some combinations (e.g. varying number of neighbours) to predict basal area and standing volume in Canadian forests. They reported MSN method (even in a single-neighbour setting) as the most accurate approach compared to the Euclidean distance models based on 3 neighbours. In a relatively similar study in Bosnian forests in Europe, (Cabaravdic, 2007) also achieved relatively accurate k-NN estimates of growing stock using TM-extracted features and a broad range of field survey information. In terms of the configuration, *k*=5 and Mahalanobis distance were assessed to be optimal for growing stock models. (Kutzer, 2008) tested the selected bands in visible and infrared domain of multispectral ASTER image together with a set of terrestrial data to differentiate the landuse types and the Non Wood Forest Products in Ghana. The results were assessed, though with some exceptions, to be promising for application as a

The majority of forest-related studies using the k-NN method have been conducted with the aim of modelling continuous attributes of forest structure, whereas little attention has been paid to predicting categorical forest variables such as site quality or vegetation type. One of the few attempts to introduce such new potentials to the remote sensing community was carried out by (Tomppo et al., 2009), in which TM-derived spectral features were used to predict site fertility, species dominance and coniferous/deciduous dominance as categorical responses across selected test sites in Finland and Italy. Despite the moderate accuracy obtained from the sole analysis of spectral data (e.g. a maximum Kappa statistic of approximately 0.65, with relatively higher Kappa values for species dominance than for soil fertility), this study highlighted how an efficient strategy for feature space screening can contribute to reducing the prediction errors in k-NN models. Whereas the majority of earlier studies used deterministic approaches (e.g. stepwise methods) to prune the candidate predictors, this study (which followed an earlier attempt by (Tomppo & Halme, 2004)) used an evolutionary Genetic Algorithm (GA) to screen the feature space, which slightly reduced the modelling errors. The idea of using GA was further applied in a number of LiDAR-supported forest modelling studies, e.g. by (Latifi et al., 2010) and (Latifi et al., 2011).
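As a rough illustration of the GA-based screening idea (not the exact algorithm of (Tomppo & Halme, 2004)), the sketch below evolves binary feature masks whose fitness is the leave-one-out error of a 1-NN imputation on synthetic data; the population size, operators and data are all invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: only the first 3 of 12 candidate features are informative.
X = rng.normal(size=(120, 12))
y = X[:, 0] + 0.5 * X[:, 1] - 0.5 * X[:, 2] + 0.1 * rng.normal(size=120)

def cv_error(mask):
    """RMSE of 1-NN imputation in leave-one-out mode, using only the
    features switched on in the binary mask."""
    if not mask.any():
        return np.inf
    Xm = X[:, mask]
    d = ((Xm[:, None, :] - Xm[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d, np.inf)          # a plot may not impute itself
    return np.sqrt(((y[d.argmin(1)] - y) ** 2).mean())

def ga_select(n_gen=30, pop_size=20, p_mut=0.1):
    """Toy GA: truncation selection, one-point crossover, bit-flip mutation."""
    pop = rng.random((pop_size, X.shape[1])) < 0.5
    for _ in range(n_gen):
        fit = np.array([cv_error(m) for m in pop])
        elite = pop[fit.argsort()[: pop_size // 2]]
        kids = []
        for _ in range(pop_size - len(elite)):
            a, b = elite[rng.integers(len(elite), size=2)]
            cut = rng.integers(1, X.shape[1])        # one-point crossover
            child = np.concatenate([a[:cut], b[cut:]])
            child ^= rng.random(X.shape[1]) < p_mut  # bit-flip mutation
            kids.append(child)
        pop = np.vstack([elite, np.array(kids)])
    fit = np.array([cv_error(m) for m in pop])
    return pop[fit.argmin()]

best = ga_select()
print(best.nonzero()[0])   # indices of the retained predictors
```

The fitness function is the piece that varies between studies (weighted multi-response *RMSE* in the Finnish work); the evolutionary loop itself is generic.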

### **2.2 LiDAR-based models of forest structural attributes**

Height information from airborne laser scanner data has been shown to provide the most accurate input data related to the topography of the land surface as well as to the structure of forested areas. Whereas (Lim et al., 2003), (Hyyppä et al., 2008) and (Koch, 2010) provide comprehensive reviews of the background and history of LiDAR data application in forest inventories, this section focuses on the methodological background of pure LiDAR-based models of forest structure.

LiDAR instruments fall into three main categories: profiling, discrete-return, and waveform devices. Profiling devices record one return at low densities along a narrow swath (Evans et al., 2009) and were mainly used in earlier studies such as (Nelson et al., 1988). Later, discrete-return (pulse-form) laser scanners enabled the use of LiDAR in remote sensing applications where scanning over large areas was needed (Næsset, 2004). Such devices collect multiple returns (often three to five) based on the intensity of the laser energy reflected from the earth's surface. Waveform devices, in turn, digitize the total amount of returned energy in fixed intervals and are therefore able to characterize the distribution of the laser energy reflected from the objects. Although small-footprint waveform sensors are most commonly available, they are reported to be computationally intensive and thus subject to restrictions when used in fine-scale (i.e. high-resolution) environmental applications (Evans et al., 2009). They provide data featuring high point densities and enable a broader representation of the surface and forest canopy. The importance of using pulse-form data for studies concerning forest structure has already been stated in the relevant literature, e.g. (Sexton et al., 2009).

LiDAR data can be used in two main approaches to retrieve forest structural attributes. In "area-based methods", statistical metrics and other non-physical, distribution-related features of the LiDAR height measurements are extracted either from the laser point clouds or from a rasterized representation of the laser hits. They are then used to predict forest attributes, e.g. mean tree height, mean DBH, basal area, volume and AGB, at an area level such as the plot or stand (Yu et al., 2010). This method enables one to retrieve canopy height information by means of relatively coarse-resolution LiDAR data, e.g. satellite or airborne data featuring <5 measurements per *m*<sup>2</sup>, e.g. (Korhonen et al., 2008), (Jochem et al., 2011), though data with higher point density can also be used to derive the metrics at an aggregated level (e.g. (Maltamo, Eerikäinen, Packalén & Hyyppä, 2006), (Heurich & Thoma, 2008), (Straub et al., 2009) and (Latifi et al., 2010)). A key to success in area-based methods, when the metrics are extracted from a rasterized form of LiDAR data such as a normalized Digital Surface Model (nDSM), has been stated to be the quality of the extracted Digital Terrain Model (DTM) and Digital Surface Model (DSM) (Hyyppä et al., 2008).
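A minimal sketch of the area-based feature extraction step, assuming height-normalized returns for one sample plot; the 2 m canopy threshold and the metric set are common but by no means standardized choices, and the point cloud is synthetic.

```python
import numpy as np

def area_metrics(z, percentiles=(10, 25, 50, 75, 90, 95)):
    """Plot-level LiDAR metrics from normalized return heights z (metres
    above the DTM): height percentiles plus simple density/shape statistics."""
    z = np.asarray(z, dtype=float)
    canopy = z[z > 2.0]                        # a common 2 m canopy threshold
    m = {f"h{p}": np.percentile(canopy, p) for p in percentiles}
    m["h_mean"] = canopy.mean()
    m["h_std"] = canopy.std()
    m["crr"] = canopy.mean() / canopy.max()    # canopy relief ratio (simplified)
    m["cover"] = canopy.size / z.size          # fraction of returns above 2 m
    return m

# Synthetic point cloud for one sample plot (heights in metres)
rng = np.random.default_rng(1)
z = np.concatenate([rng.uniform(0, 2, 300),    # ground/understorey returns
                    rng.normal(22, 4, 700)])   # canopy returns
metrics = area_metrics(z)
print({k: round(v, 2) for k, v in metrics.items()})
```

These per-plot metrics form the predictor matrix that the parametric and nonparametric models discussed below are fitted to.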

The focus of the so-called "single tree-based methods" is on the recognition of individual trees. Here, tree attributes such as tree height, crown dimensions and species information are measured. The measured attributes can further be applied to retrieve other attributes such as DBH, standing volume and AGB by means of various modelling approaches (Yu et al., 2010). The retrieved attributes are either presented as single-tree attributes or can be aggregated to a higher level, e.g. the stand or sample plot level.
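The single-tree chain — measure height and crown size, model DBH, derive volume, aggregate to the plot — can be sketched as follows. The allometric coefficients and the form factor are purely hypothetical placeholders, not fitted values from any of the studies cited here; real applications calibrate such models per species and region.

```python
import math

def dbh_from_lidar(height_m, crown_diam_m):
    """Predict DBH (cm) from LiDAR-measured tree height and crown diameter.
    Coefficients are invented for illustration only."""
    return 1.2 * height_m + 2.5 * crown_diam_m

def stem_volume(dbh_cm, height_m, form_factor=0.5):
    """Simple form-factor volume model: basal area x height x form factor."""
    basal_area = math.pi * (dbh_cm / 200.0) ** 2   # m^2 (DBH -> radius in m)
    return basal_area * height_m * form_factor     # m^3

# Trees detected on one plot: (height m, crown diameter m)
trees = [(28.0, 6.1), (24.5, 5.2), (31.0, 7.0), (18.2, 4.0)]
plot_volume = sum(stem_volume(dbh_from_lidar(h, c), h) for h, c in trees)
print(round(plot_volume, 2))   # aggregated plot-level volume in m^3
```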

In some earlier studies, one of the main goals in applying 3D data was to facilitate an accurate estimation of stand height, with the correlation of laser-derived height information with field measurements being of major interest. This often yielded notably promising results which strongly supported the suitability of LiDAR instruments for precise height measurements. For example, (Maltamo, Hyyppä & Malinen, 2006) used airborne laser data to retrieve stand attributes, i.e. basal area, mean diameter and height, at both tree and plot levels using linear regression methods in Finland. The results indicated the superiority of LiDAR-based attributes over field-based ones at the area level, though a contrasting result was reported at the single-tree level. It was hypothesized that better results would be achieved with higher-point-density data acquired over large swaths. A roughly similar result was later reported by (Maltamo, Eerikäinen, Packalén & Hyyppä, 2006), in which plot-level stem volume estimates calculated from field assessments were reported to be less accurate than methods in which volume was predicted from LiDAR measures. (Maltamo et al., 2010) further studied different methods, including regression models, to retrieve crown height information. Regardless of the differences amongst the methods, they all yielded RMSEs between 1.0 and 1.5 m in predicting crown height.

The application of laser scanner data to enhance volume and AGB models dates back to preliminary experiments in the 1980s, e.g. (MacLean & Krabill, 1986), (Nelson et al., 1988), which demonstrated the usefulness of LiDAR-extracted canopy profiles for improving stem volume and AGB estimates (e.g. *R*<sup>2</sup>=0.72 to 0.92 achieved in regression analysis by (MacLean & Krabill, 1986)). In recent years, with some exceptions, investigations into the retrieval of model-derived volume and AGB attributes have grown considerably. (Heurich & Thoma, 2008) built linear models to predict plot-level stem volume, height, and stem count in the Bavarian Forest National Park, where they reported *RMSE*% = 5, 10 and 60 for LiDAR-estimated height, volume and stem count, respectively. The forest areas were stratified into three main strata: deciduous, coniferous, and mixed. Despite achieving relatively accurate results with their models, they acknowledged that factors such as the occurrence of deadwood and complexities in forest structure constrain the achievement of better results. As stated earlier, the derivation of model-based estimates of stem volume (in different assortments) has recently formed a major field of research in LiDAR-related studies. Sawlogs can be cited as vital timber assortments in the Nordic forest utilization context; therefore, the accurate estimation of their volume can add value to forest management. (Korhonen et al., 2008) studied this using parametric models, in that they used LiDAR canopy height metrics, i.e. percentiles, to build linear models of sawlog volume, which yielded relatively favourable accuracies (*RMSE*%=9.1 and 18 for theoretical and factual volumes). In other examples, regression modelling of individual trees using multi-return, pulse-form LiDAR metrics has been reported to be accurate for standing volume (*R*<sup>2</sup>=0.77) (Dalponte et al., 2009) as well as for AGB (max. *R*<sup>2</sup>=0.71) (Jochem et al., 2011).
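The percentile-based regression models referred to above follow a simple pattern, sketched here on synthetic plot data (predictors, coefficients and noise level are invented); the relative *RMSE* is computed as in the accuracy figures quoted in this section.

```python
import numpy as np

rng = np.random.default_rng(7)

# Synthetic plot data: volume driven by upper height percentiles (illustrative)
n = 80
h50 = rng.uniform(10, 25, n)                    # median return height (m)
h90 = h50 + rng.uniform(2, 8, n)                # 90th height percentile (m)
volume = 12.0 * h90 + 5.0 * h50 + rng.normal(0, 25, n)   # m^3 ha^-1

# Ordinary least squares: volume ~ b0 + b1*h50 + b2*h90
X = np.column_stack([np.ones(n), h50, h90])
beta, *_ = np.linalg.lstsq(X, volume, rcond=None)
pred = X @ beta

rmse = np.sqrt(np.mean((volume - pred) ** 2))
rmse_pct = 100.0 * rmse / volume.mean()         # relative RMSE (RMSE%)
print(round(rmse_pct, 1))
```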

In terms of the types of metrics extracted from laser scanner data, one important issue cannot be neglected: in addition to height metrics, LiDAR intensity data is reported to contain information in the infrared domain which may potentially contribute to the modelling of forest attributes, e.g. (Boyd & Hill, 2007), especially when dealing with species-specific models (Koch, 2010). Apart from some exceptions, e.g. (Vauhkonen et al., 2010), (Latifi et al., 2010), most pure LiDAR-based models of forest attributes have made use of height metrics alone as input variables for modelling.

The use of nonparametric methods has greatly contributed to studies aiming at the retrieval of forest attributes by means of LiDAR metrics. These methods have been applied at various scales, using numerous metrics, and combined, in some cases, with additional methods for screening the high-dimensional feature space or for estimating the prediction variance. (Falkowski et al., 2010) evaluated k-NN imputation models to predict individual tree-level height, diameter at breast height, and species in northeastern Oregon, USA. Topographic variables were added to LiDAR-extracted height percentiles and other descriptive statistics to accomplish the task. Whereas *RMSE*s of 5 *m*<sup>2</sup>*ha*<sup>-1</sup> and 16 *m*<sup>3</sup>*ha*<sup>-1</sup> were achieved for basal area and volume estimates, the occurrence of small trees and dense understorey proved to be the main source of prediction errors. Similarly, promising results have been reported by e.g. (Nothdurft et al., 2009) in central Europe for area-based models of stem volume using LiDAR height metrics (approximately 20 % *RMSE* for MSN models of stem volume in Germany).

(Hudak et al., 2008) compared different imputation methods to impute a range of forest inventory attributes at plot level using height metrics from LiDAR data and additional topographical attributes in Idaho, USA. They found Random Forest (RF) to be superior to other imputation methods such as MSN, Euclidean distance and Mahalanobis distance. They used the selected RF outputs for final wall-to-wall mapping of forest structural attributes at pixel level. The dominance of the RF model was further confirmed by studies such as (Latifi et al., 2010) and (Breidenbach, Nothdurft & Kändler, 2010) and led to a wider application of RF as a leading nonparametric method in combination with LiDAR metrics, e.g. (Yu et al., 2011). The RF method (Breiman, 2001) works on ensembles of CARTs built for resampled predictor variable sets. It starts by drawing bootstrap samples from the original data. It then grows, for each bootstrap sample, an unpruned regression tree, choosing the best splits from a randomly sampled subset of variables at each node of the trees. New predictions are then made by aggregating the predictions over the total number of trees: the majority vote (the most frequent value) for categorical responses, or the average for continuous ones ((Liaw & Wiener, 2002), (Latifi et al., 2011)). Though former studies, e.g. (Hudak et al., 2008) and (Vauhkonen et al., 2010), have shown that the RF approach generally surpasses other imputation methods including MSN, (Breidenbach, Nothdurft & Kändler, 2010) reported an approximately similar performance of RF and MSN, as their study yielded e.g. *RMSE*s of 32.41 % (for MSN) and 32.81 % (for RF) when predicting the total standing timber volume by averaging over *k*=8 neighbours.
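A from-scratch miniature of the RF regression procedure just described — bootstrap samples, unpruned trees, a random subset of predictors at each split, averaged predictions — on synthetic data; tree-depth control, sample sizes and all variables are illustrative simplifications of (Breiman, 2001), not a production implementation.

```python
import numpy as np

rng = np.random.default_rng(3)

def grow_tree(X, y, max_features, min_leaf=5):
    """Recursively grow an unpruned regression tree; at each node only a
    random subset of the predictors is considered for the best SSE split."""
    if len(y) < 2 * min_leaf or np.all(y == y[0]):
        return ("leaf", y.mean())
    best = None
    for j in rng.choice(X.shape[1], size=max_features, replace=False):
        order = X[:, j].argsort()
        for i in range(min_leaf, len(y) - min_leaf):
            thr = X[order[i], j]
            left = X[:, j] <= thr
            if left.sum() < min_leaf or (~left).sum() < min_leaf:
                continue
            sse = ((y[left] - y[left].mean()) ** 2).sum() + \
                  ((y[~left] - y[~left].mean()) ** 2).sum()
            if best is None or sse < best[0]:
                best = (sse, j, thr, left)
    if best is None:
        return ("leaf", y.mean())
    _, j, thr, left = best
    return ("node", j, thr,
            grow_tree(X[left], y[left], max_features, min_leaf),
            grow_tree(X[~left], y[~left], max_features, min_leaf))

def tree_predict(tree, x):
    while tree[0] == "node":
        _, j, thr, lo, hi = tree
        tree = lo if x[j] <= thr else hi
    return tree[1]

def random_forest(X, y, n_trees=12, max_features=2):
    forest = []
    for _ in range(n_trees):
        boot = rng.integers(len(y), size=len(y))   # bootstrap sample
        forest.append(grow_tree(X[boot], y[boot], max_features))
    return forest

def rf_predict(forest, X):
    # Regression: average the predictions over all trees
    return np.array([np.mean([tree_predict(t, x) for t in forest]) for x in X])

# Illustrative data: a "volume"-like response from LiDAR-like predictors
X = rng.normal(size=(150, 4))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.3 * rng.normal(size=150)
forest = random_forest(X[:120], y[:120])
pred = rf_predict(forest, X[120:])
print(round(float(np.corrcoef(pred, y[120:])[0, 1]), 2))
```

In practice the `randomForest` R package or scikit-learn's `RandomForestRegressor` would be used; this sketch only mirrors the algorithmic steps named in the text.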

In addition to those stated above, nonparametric methods have also been tested for predicting further structural characteristics of forest stands, e.g. diameter distributions, from the sole use of laser scanner data (e.g. (Maltamo et al., 2009)), indicating some potential for the further application of 3D topographic remote sensing in forest monitoring.

### **2.3 Combining LiDAR and optical data for modelling**

As explained earlier, the application of ALS-extracted metrics (height and intensity features) has been validated as a being helpful and thus required for most practices regarding forest

of aerial photography and ALS data. The procedure consisted of two methods including 1) simultaneous k-MSN estimation and 2) a two- phase prediction (prediction of the responses using regression analysis of ALS data and then allocation of the variables using a fuzzy classification approach). The k-MSN achieved better results than the fuzzy classifications. Although the study still proposed some further developments of the predictor variables from both datasets, the results were assessed satisfactory in cases of Norway spruce (*Picea abies* L.) and Scots pine (*Pinus sylvestris* L.). Soon after, (Packalén & Maltamo, 2007) made stand level models of volume and height using the similar dataset as before. A set of Haralick textural features(Haralick, 1979) from the optical data were additionally combined with the calculated ALS height features to produce predictive models. Accuracy of the predicted responses was finally found to be comparable to stand-level field assessments, though the attributes of conifers were estimated more accurately than those from the deciduous stands. In a further study by those authors, (Packalén & Maltamo, 2008) made use of the similar data to develop k-MSN models of diameter distribution by tree species. Based on the results of growing stock estimation in the previous research work(s), two approaches were compared including 1) field-based modelling using the Weibull distribution and 2) k-MSN prediction, in which the latter was assessed to outperform the former method. Nevertheless, the need to have more comprehensive reference field data (i.e. a common small-scale problem) to cover the spectral variations of the remote sensing data was highlighted as a major concern which supports those already acknowledged by precedent studies. (Nothdurft et al., 2009) represents an attempt towards solving this, in which bootstrap-simulated prediction errors of MSN inferences of volume based on sole use of LiDAR height metrics were smaller than those

Characterizing Forest Structure by Means of Remote Sensing: A Review 19

Few studies e.g. (Straub et al., 2010) and (Latifi et al., 2011) compared parametric and nonparametric methods for forest attribute estimation in presence of both LiDAR and multispectral datasets. Whereas the former study compared Ordinary Least Squares (OLS) regression and a yield table-estimated stem volume with that from Euclidean distance-based k-NN method, the latter made a comparison between RF and OLS outputs. Nevertheless, both studies made relatively similar conclusions, in that they stated that using nonparametric methods cannot b expected to remarkably contribute to the improvement of forest attribute estimates. Besides, it supports (Yu et al., 2011)who also tested pure LiDAR metrics and achieved a similar performance of RF and OLS in a single tree scale. The rationale behind this is that non-parametric imputations do not share the same mix of error components as regression predictions. Imputation errors are often greater than regression errors because the errors do not result from a least-squares minimisation, but from selection of a most similar element in a pool of neighbouring observations (Stage & Crookston, 2007). However, K-NN methods (especially in single- neighbour setting) yield predictions with similar variance structure to that of the observations (Moeur & Stage, 1995), and are thus advantageous over

The selection of proper predictor variables for a k-NN model (i.e. an absent element of conventional k-NN approaches) is a time-consuming task which and needs to be automated. (Packalén & Maltamo, 2007) used an iterative cost- minimizing variable selection algorithm which aimed at minimizing the weighted average of the relative *RMSE*. In contrast, studies like (Hudak et al., 2008) and (Straub et al., 2009)applied stepwise selection methods, where the former study based its stepwise iteration on the *Gini* index of variable importance used by (Breiman, 2001) as a built-in feature in RF. As such, other variable screening methods such as

the higher accuracies achievable by the use of OLS (Hudak et al., 2008).

of design-based sampling.

inventory. This is because the data has previously been proved to be potentially applicable in several environmental and natural resource planning tasks, particularly where the vertical structure of the respective phenomena is dealt with. Nevertheless, the use of multi-sensorial data may enable one to make use of advanced methods of data analysis and thus overcome some problems faced by using single datasets (Koch, 2010). The use of multispectral data can contribute to the analysis of vegetation cover by adding spectral information from visible and infrared domains. In this way, the information required for species-specific tasks will be provided by the spectral data, while the LiDAR data contributes an enormous amount of information in terms of 3D structural attributes (see e.g. (Packalén & Maltamo, 2007), (Heinzel et al., 2008), (Straub et al., 2009)).

When combining spectral and LiDAR data, the parametric models have been quite rarely used for predicting forest attributes. In contrast, relatively more studies were carried out using combined data made use of nonparametric methods (especially MSN and RF), probably as the models are generally assumed as rather 'distribution-free methods' which can potentially be applied regardless of the underlying distribution of the population. A further reason could be the ability of more advanced methods such as MSN and RF to handle high-dimensional feature spaces. However, examples of the joint use of spectral and laser scanner data for parametric modelling can be e.g. (Fransson et al., 2004) and (Hudak et al., 2006), in both of which the magnitude of candidate predictors were notably less than those making use of nonparametric methods. (Fransson et al., 2004) built regression models to predict stem volume using SPOT5 data aided by TopEye laser scanner data in Swedish coniferous landscapes. The SPOT5 data was used to develop features including multi-spectral bands, ditto squared, and the band ratios. LiDAR- derived features included height and forest density measures at stand level. The single as well as combined datasets were tested, from which the combined use of laser height data with the spectral features surpassed the individual use of the datasets. Later on,(Hudak et al., 2006) linearly regressed basal area and tree density on 26 predictors derived from height/intensity of LiDAR and Advanced Land Imager (ALI) multispectral data. They found laser height (to a higher extent) added by laser intensity metrics as most relevant predictors of both responses (The LiDAR-dominated models explained around 90 % of variance for both response variables).

In terms of applying conventional distance-based k-NN methods, (McInerney et al., 2010) can be referred who combined airborne laser scanner and spaceborne Indian Remote Sensing (IRS) multispectral data to model stand canopy height using k-NN method. They apparently reported laser height data as the major means of canopy height retrieval, and achieved a relative *RMSE* between 28 and 31 %. (Maltamo, Malinen, Packalén, Suvanto & Kangas, 2006) applied a k-MSN (MSN using multiple *k*) method to combine the LiDAR data with aerial images and terrestrial stand information in Finland. The laser-based models were reported to outperform aerial photography in stand volume estimation, and the combination improved the models at plot and stand levels. (Wallerman & Holmgren, 2007) have also highlighted the combined application of predictive features derived from optical (SPOT) and laser (TopEye) data, according to which the combined dataset yielded the mean standing volume and stem density models with *RMSE* = 20% and *RMSE* = 22%, respectively. Combining satellite-based (TM) spectral features with laser metrics was also carried out by (Latifi et al., 2010) who reported that TM-extracted metrics can be used as alternatives to those derived from aerial photography for area-based models. Using k-MSN approach, (Packalén & Maltamo, 2006) conducted a survey to achieve species-specific stand information using sets 16 Will-be-set-by-IN-TECH

inventory. This is because the data has previously been proved to be potentially applicable in several environmental and natural resource planning tasks, particularly where the vertical structure of the respective phenomena is dealt with. Nevertheless, the use of multi-sensorial data may enable one to make use of advanced methods of data analysis and thus overcome some problems faced by using single datasets (Koch, 2010). The use of multispectral data can contribute to the analysis of vegetation cover by adding spectral information from visible and infrared domains. In this way, the information required for species-specific tasks will be provided by the spectral data, while the LiDAR data contributes an enormous amount of information in terms of 3D structural attributes (see e.g. (Packalén & Maltamo, 2007), (Heinzel

When spectral and LiDAR data are combined, parametric models have rarely been used to predict forest attributes. In contrast, relatively more of the combined-data studies have relied on nonparametric methods (especially MSN and RF), probably because these models are generally regarded as 'distribution-free' and can be applied regardless of the underlying distribution of the population. A further reason could be the ability of the more advanced methods such as MSN and RF to handle high-dimensional feature spaces. Examples of the joint use of spectral and laser scanner data for parametric modelling nevertheless exist, e.g. (Fransson et al., 2004) and (Hudak et al., 2006), in both of which the number of candidate predictors was notably smaller than in the studies using nonparametric methods. (Fransson et al., 2004) built regression models to predict stem volume using SPOT5 data aided by TopEye laser scanner data in Swedish coniferous landscapes. The SPOT5 data were used to derive features including the multispectral bands, their squared values, and band ratios, while the LiDAR-derived features included height and forest density measures at stand level. Both the single and the combined datasets were tested, and the combined use of laser height data with the spectral features surpassed the individual use of either dataset. Later, (Hudak et al., 2006) linearly regressed basal area and tree density on 26 predictors derived from LiDAR height/intensity and Advanced Land Imager (ALI) multispectral data. They found laser height (to a larger extent), complemented by laser intensity metrics, to be the most relevant predictors of both responses (the LiDAR-dominated models explained around 90 % of the variance of both response variables).

Regarding conventional distance-based k-NN methods, (McInerney et al., 2010) combined airborne laser scanner and spaceborne Indian Remote Sensing (IRS) multispectral data to model stand canopy height with a k-NN method. They reported the laser height data as the major means of canopy height retrieval and achieved a relative *RMSE* between 28 and 31 %. (Maltamo, Malinen, Packalén, Suvanto & Kangas, 2006) applied a k-MSN method (MSN with multiple neighbours *k*) to combine LiDAR data with aerial images and terrestrial stand information in Finland. The laser-based models were reported to outperform aerial photography in stand volume estimation, and the combination improved the models at both plot and stand levels. (Wallerman & Holmgren, 2007) also highlighted the combined application of predictive features derived from optical (SPOT) and laser (TopEye) data, with which the combined dataset yielded mean standing volume and stem density models with *RMSE* = 20 % and *RMSE* = 22 %, respectively. Combining satellite-based (TM) spectral features with laser metrics was also carried out by (Latifi et al., 2010), who reported that TM-extracted metrics can serve as alternatives to those derived from aerial photography in area-based models. Using the k-MSN approach, (Packalén & Maltamo, 2006) conducted a survey to obtain species-specific stand information from sets of aerial photography and ALS data. The procedure comprised two methods: 1) simultaneous k-MSN estimation and 2) a two-phase prediction (prediction of the responses by regression analysis of the ALS data, followed by allocation of the variables using a fuzzy classification approach). The k-MSN achieved better results than the fuzzy classification. Although the study proposed further development of the predictor variables from both datasets, the results were assessed as satisfactory for Norway spruce (*Picea abies* L.) and Scots pine (*Pinus sylvestris* L.). Soon after, (Packalén & Maltamo, 2007) built stand-level models of volume and height using a similar dataset. A set of Haralick textural features (Haralick, 1979) from the optical data was additionally combined with the calculated ALS height features to produce predictive models. The accuracy of the predicted responses was found to be comparable to stand-level field assessments, though the attributes of conifers were estimated more accurately than those of the deciduous stands. In a further study, (Packalén & Maltamo, 2008) used similar data to develop k-MSN models of diameter distribution by tree species. Building on the growing stock estimation of the previous work, two approaches were compared: 1) field-based modelling using the Weibull distribution and 2) k-MSN prediction, of which the latter was assessed to outperform the former. Nevertheless, the need for more comprehensive reference field data (a common small-scale problem) to cover the spectral variation of the remote sensing data was highlighted as a major concern, supporting those already raised by precedent studies.
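The distance-based estimation discussed above can be sketched in a few lines. This is a minimal illustration on synthetic plot data, using a plain Euclidean k-NN rather than the MSN variants of the cited studies; the feature set (two ALS height metrics, one spectral band ratio) and all coefficients are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic reference plots: two ALS height metrics plus one spectral band
# ratio, with stand volume (m^3/ha) driven mostly by the height metrics.
n = 200
X = rng.normal(size=(n, 3))
volume = 250 + 80 * X[:, 0] + 30 * X[:, 1] + 10 * X[:, 2] + rng.normal(0, 20, n)

def knn_predict(X_ref, y_ref, x_new, k=5):
    """Distance-based k-NN: average the response of the k most similar plots."""
    d = np.linalg.norm(X_ref - x_new, axis=1)   # Euclidean distance in feature space
    return y_ref[np.argsort(d)[:k]].mean()

# Impute the volume of an unvisited target plot from its feature vector.
print(knn_predict(X, volume, np.array([0.5, -0.2, 0.1])))
```

In an area-based inventory the same lookup would be run for every raster cell of the wall-to-wall remote sensing coverage.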
(Nothdurft et al., 2009) represents an attempt to address this issue: bootstrap-simulated prediction errors of MSN inferences of volume, based on the sole use of LiDAR height metrics, were smaller than those of design-based sampling.

A few studies, e.g. (Straub et al., 2010) and (Latifi et al., 2011), compared parametric and nonparametric methods for forest attribute estimation in the presence of both LiDAR and multispectral datasets. Whereas the former study compared Ordinary Least Squares (OLS) regression and yield table-estimated stem volume with that from a Euclidean distance-based k-NN method, the latter compared RF and OLS outputs. Both studies reached similar conclusions: using nonparametric methods cannot be expected to contribute markedly to the improvement of forest attribute estimates. This supports (Yu et al., 2011), who also tested pure LiDAR metrics and found a similar performance of RF and OLS at the single-tree scale. The rationale is that nonparametric imputations do not share the same mix of error components as regression predictions. Imputation errors are often greater than regression errors because they do not result from a least-squares minimisation but from the selection of a most similar element in a pool of neighbouring observations (Stage & Crookston, 2007). However, k-NN methods (especially in the single-neighbour setting) yield predictions with a variance structure similar to that of the observations (Moeur & Stage, 1995), an advantage that can outweigh the higher accuracies achievable with OLS (Hudak et al., 2008).
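The error/variance trade-off described here can be illustrated numerically. The sketch below uses synthetic data (all numbers invented): OLS attains the lower RMSE, while 1-NN imputation, which copies an observed response, better reproduces the spread of the observations.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 300
x = rng.normal(size=(n, 2))                       # e.g. a LiDAR height metric and a spectral feature
y = 100 + 40 * x[:, 0] + 15 * x[:, 1] + rng.normal(0, 25, n)

# Split into reference (field-visited) and target (unvisited) plots.
x_ref, y_ref, x_tgt, y_tgt = x[:200], y[:200], x[200:], y[200:]

# OLS: least-squares fit on the reference plots.
A = np.column_stack([np.ones(len(x_ref)), x_ref])
beta, *_ = np.linalg.lstsq(A, y_ref, rcond=None)
y_ols = np.column_stack([np.ones(len(x_tgt)), x_tgt]) @ beta

# 1-NN imputation: copy the response of the most similar reference plot.
d = np.linalg.norm(x_tgt[:, None, :] - x_ref[None, :, :], axis=2)
y_nn = y_ref[d.argmin(axis=1)]

rmse = lambda p: np.sqrt(np.mean((p - y_tgt) ** 2))
print(f"OLS : rmse={rmse(y_ols):5.1f}, sd of predictions={y_ols.std():5.1f}")
print(f"1-NN: rmse={rmse(y_nn):5.1f}, sd of predictions={y_nn.std():5.1f}")
print(f"sd of observed responses: {y_tgt.std():5.1f}")
```

The regression predictions are smoother than the observations (their standard deviation omits the residual noise), whereas the imputed values inherit it, which is exactly the (Moeur & Stage, 1995) argument.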

The selection of proper predictor variables for a k-NN model (an element absent from conventional k-NN approaches) is a time-consuming task that needs to be automated. (Packalén & Maltamo, 2007) used an iterative cost-minimising variable selection algorithm aimed at minimising the weighted average of the relative *RMSE*. In contrast, studies such as (Hudak et al., 2008) and (Straub et al., 2009) applied stepwise selection methods, the former basing its stepwise iteration on the *Gini* index of variable importance used by (Breiman, 2001) as a built-in feature of RF. Other variable screening methods, such as parametric univariate correlation analysis (Breidenbach, Næsset, Lien, Gobakken & Solberg, 2010), built-in RF schemes such as the stepwise iterative method (Vauhkonen et al., 2010), and forward selection (Breidenbach, Nothdurft & Kändler, 2010), were also used for this task in the recent literature. Each of these screening methods has been reported to be satisfactory in reducing the dimensionality of the feature space, though no rationale (e.g. a comparison with other methods) has been presented. (Latifi et al., 2010) used a GA on categorised response variables to optimise the high-dimensional feature space formed by numerous correlated predictors. Even though this GA prototype efficiently reduced the relative RMSE of standing volume and AGB compared with stepwise selection of predictors, the method was reported to produce unstable subsets, attributed to strong correlations amongst the predictors. Using a Tau-squared index on continuous responses, the GA was later shown to yield stable, parsimonious variable subsets (Latifi et al., 2011). A GA is a search algorithm which works through numerous solutions and generations and thus explores the possible combinations of candidate predictor variables. It provides the subsequent NN models with a refined, pre-processed feature space formed of relevant (and uncorrelated) remote sensing descriptors, and it has been shown to be adaptable to k-NN modelling approaches (e.g. (Tomppo & Halme, 2004)). In this context, fitness functions that optimise continuous responses are preferable for regression scenarios. Those functions can even be linear as long as no strongly non-linear trend is observed in the underlying dataset.
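As a toy sketch of GA-based predictor screening for a k-NN model: the code below evolves binary predictor masks against a leave-one-out 3-NN RMSE fitness. All data are synthetic, and the population size, mutation rate, and fitness choice are illustrative assumptions, not the settings of the cited studies.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic plots: 10 candidate predictors, only the first 3 carry signal.
n, p = 150, 10
X = rng.normal(size=(n, p))
y = 5 * X[:, 0] + 3 * X[:, 1] + 2 * X[:, 2] + rng.normal(0, 1, n)

def knn_rmse(mask):
    """Fitness: leave-one-out RMSE of 3-NN predictions on the selected columns."""
    if not mask.any():
        return np.inf
    Z = X[:, mask]
    d = np.linalg.norm(Z[:, None] - Z[None, :], axis=2)
    np.fill_diagonal(d, np.inf)                 # a plot may not be its own neighbour
    pred = y[np.argsort(d, axis=1)[:, :3]].mean(axis=1)
    return np.sqrt(np.mean((pred - y) ** 2))

pop = rng.random((30, p)) < 0.5                 # initial population of predictor subsets
for gen in range(25):
    fit = np.array([knn_rmse(m) for m in pop])
    pop = pop[np.argsort(fit)]                  # fittest masks first (elitism)
    children = []
    while len(children) < len(pop) // 2:
        a, b = pop[rng.integers(0, 10, 2)]      # parents drawn from the fittest third
        cut = rng.integers(1, p)                # one-point crossover
        child = np.concatenate([a[:cut], b[cut:]])
        children.append(child ^ (rng.random(p) < 0.05))  # bit-flip mutation
    pop = np.vstack([pop[: len(pop) - len(children)], children])

best = pop[np.argmin([knn_rmse(m) for m in pop])]
print("selected predictor columns:", np.flatnonzero(best))
```

With the dominant signal in column 0, the surviving masks tend to keep the informative columns and drop the pure-noise ones, which is the stability/parsimony behaviour the text describes.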
In a review, (Koch, 2010) highlighted the importance of the combined use of laser and optical data for such purposes. She stated that combining altimetric height information with physical values derived from laser intensity is appropriate for modelling forest structure. As 3D data have already proved plausible for AGB modelling, and given the expected technical innovations in those data for biomass assessment, they are expected to keep playing a prominent role in major forest monitoring tasks, e.g. those related to AGB modelling.

## **3. Conclusion**

Amongst the available active/passive remote sensing instruments, information derived from laser scanning (especially height information) is of major importance for studies of forest structure. According to (Koch, 2010), the significance of LiDAR data for biomass assessment has been confirmed by a variety of investigations which repeatedly showed the comparatively higher performance of those data. The use of LiDAR intensity data, however, is still limited. Intensity data have been shown to add useful complementary information to LiDAR height data for forest attribute modelling (e.g. (Hudak et al., 2006)). Yet a direct physical connection between intensity metrics and forest structure still cannot be drawn. The stated reason for this complication is the dependency of intensity on a range of factors affecting the reflected laser signal, including range, incidence angle, bidirectional reflectance function effects, and atmospheric transmission (Hyyppä et al., 2008).
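Of the factors listed, the range dependence is the one most routinely compensated before intensity metrics are derived. A minimal sketch of the common range-squared normalisation follows; the values are invented, and the simple 1/R² model is an assumption — real campaigns may also need incidence-angle and atmospheric terms.

```python
import numpy as np

# Raw return intensities and sensor-to-target ranges (m) for three returns.
intensity = np.array([180.0, 95.0, 140.0])
dist = np.array([820.0, 1150.0, 990.0])
r_ref = 1000.0  # reference range chosen for the campaign

# Range normalisation: received power falls off as 1/R^2, so intensities
# recorded at different ranges are rescaled to the common reference range.
i_norm = intensity * (dist / r_ref) ** 2
print(i_norm.round(1))   # normalised values: 121.0, 125.6, 137.2
```

After such a correction, intensity metrics aggregated per plot or stand become comparable across flight lines acquired at different flying heights.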

Apart from a few exceptional studies which reported the incapability of spectral data to explain variation beyond that already explained by laser metrics (Hudak et al., 2008), adding spectral information to pure LiDAR-based models has been confirmed to be useful, since spectral data provide continuous information over long time series and are spectrally sensitive enough to differentiate tree species. The ability of multispectral data, even at regional-scale spatial resolution such as that of Landsat images, has repeatedly been shown to add practical value when combined with laser scanner data ((Fransson et al., 2004), (McInerney et al., 2010)), and even to serve as an alternative to aerial photography for area-based applications (Latifi et al., 2010). Furthermore, imaging spectroscopy data have shown positive potential for forest modelling ((Foster et al., 2002), (Schlerf et al., 2005)) and could complement LiDAR-based models. However, one should bear in mind that such experimental results are by no means a final justification for small-scale end users to take the acquisition of (relatively) expensive airborne hyperspectral data for granted.

In terms of the modelling methods used, both parametric and nonparametric categories have frequently been employed to describe forest structural attributes. The latter, however, have received more attention in recent years, being suited to high-dimensional predictor datasets as well as to simultaneous predictions. The k-NN methods (especially MSN and RF) have been successfully coupled with LiDAR information, causing a rapid increase in the number of research projects in recent years. As shown here, much work has been done with area-based methods at stand and plot levels, whereas single-tree approaches still lack research, mainly due to high computational requirements and the need for high resolution data.

In terms of handling the predictor feature space induced by remote sensing features, some examples were referred to above. Whereas studies such as (Breidenbach, Nothdurft & Kändler, 2010) questioned the general necessity of variable screening in the k-NN context, other studies acknowledged the need for an effective strategy of pruning the predictor dataset (e.g. (Hudak et al., 2008), (Latifi et al., 2010)) and showed its decisive influence on the outcomes of forest attribute models. Proper pruning of the predictor feature space has been shown to help produce robust models (Latifi & Koch, 2011). Reducing the sensitivity of the models has likewise been shown to contribute greatly to their robustness. Using resampling methods such as bootstrapping to reproduce the underlying population (e.g. (Nothdurft et al., 2009), (Breidenbach, Nothdurft & Kändler, 2010), and (Latifi et al., 2011)) increases the potential and robustness of nonparametric models in small-scale forest inventory, where the shortage of reference data for validating the models is a major constraint. Robust models can be applied under natural growing conditions other than those of the underlying test site, and can thus open up new operational applications for the yielded models (e.g. (Koch, 2010)).
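A bootstrap check of model error of the kind cited here can be sketched in a few lines. Synthetic plots and a simple linear volume model stand in for the cited MSN/k-NN setups; the sample size, noise level, and number of resamples are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(7)

# Small reference sample, as in a small-scale forest inventory.
n = 60
x = rng.normal(size=n)                       # e.g. a LiDAR height metric
y = 200 + 50 * x + rng.normal(0, 15, n)      # plot volume (m^3/ha)

def model_rmse(xs, ys):
    """Fit a simple linear volume model and return its in-sample RMSE."""
    beta = np.polyfit(xs, ys, 1)
    return np.sqrt(np.mean((np.polyval(beta, xs) - ys) ** 2))

# Bootstrap: resample plots with replacement, refit, collect the error statistic.
boot = []
for _ in range(500):
    idx = rng.integers(0, n, n)              # one bootstrap sample of plot indices
    boot.append(model_rmse(x[idx], y[idx]))

lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"bootstrap 95% interval for model RMSE: [{lo:.1f}, {hi:.1f}]")
```

The width of the resulting interval is one way to judge how robust a model is when, as the text notes, independent reference data for validation are scarce.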

Along with the rapid advances in active/passive remote sensing instruments, general access to high resolution products (particularly laser scanner data) at reasonable cost is increasing. Consequently, efforts towards a thorough description of tree and forest stand structure are currently booming all over the world. It is nevertheless necessary to emphasise, again, that much care should be taken to produce valid and robust results and to get the best out of the available data and modelling facilities. Whereas the rapid and accurate modelling of standing volume, biomass and tree density remains important, some open questions require further research. These include, for example, advanced classification tasks (especially at the single-tree level or in complicated mixed stands), modelling understory and regeneration (important, e.g., for intermediate silvicultural practices), and modelling rare and ecologically valuable populations.

## **4. References**

Acker, S., Sabin, T., Ganio, L. & McKee, W. (1998). Development of old-growth structure and timber volume growth trends in maturing douglas-fir stands, *Forest Ecology and Management* 104: 265–280.

BMU (2009). National biomass action plan for germany, *Technical report*, Bundesministerium für Umwelt, Naturschutz und Reaktorsicherheit (BMU), 11055 Berlin, Germany.

Boyd, D. S. & Hill, R. A. (2007). Validation of airborne lidar intensity values from a forested landscape using hymap data: preliminary analysis, *Proceedings of the ISPRS Workshop Laser Scanning 2007 and SilviLaser 2007, Part 3/W52*, Espoo, Finland.

Breidenbach, J., Kublin, E., McGaughey, R., Andersen, H. & Reutebuch, S. (2008). Mixed-effects models for estimating stand volume by means of small footprint airborne laser scanner data, *Photogrammetric Journal of Finland* 21(1): 4–15.

Breidenbach, J., Nothdurft, A. & Kändler, G. (2010). Comparison of nearest neighbour approaches for small area estimation of tree species-specific forest inventory attributes in central europe using airborne laser scanner data, *European Journal of Forest Research* 129(5): 833–846.

Breidenbach, J., Næsset, E., Lien, V., Gobakken, T. & Solberg, S. (2010). Prediction of species specific forest inventory attributes using a nonparametric semi-individual tree crown approach based on fused airborne laser scanning and multispectral data, *Remote Sensing of Environment* 114: 911–924.

Breiman, L. (2001). Random forests, *Machine Learning* 45: 5–32.

Cabaravdic, A. (2007). *Efficient Estimation of Forest Attributes with kNN*, PhD thesis, Faculty of Forest and Environmental Studies, University of Freiburg.

Crookston, N. L., Moeur, M. & Renner, D. (2002). *Users guide to the most similar neighbor imputation program version 2.00*, RMRS-GTR-96, Ogden, UT: USDA Forest Service, Rocky Mountain Research Station.

Dalponte, M., Coops, N. C., Bruzzone, L. & Gianelle, D. (2009). Analysis on the use of multiple returns lidar data for the estimation of tree stems volume, *IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing* 2(4): 310–318.

Davey, S. (1984). *Possums and Gliders*, Australian Mammal Society, Sydney, chapter Habitat preferences of arboreal marsupials within a coastal forest in southern New South Wales, pp. 509–516.

Efron, B. & Tibshirani, R. J. (1993). *An introduction to the bootstrap*, New York: Chapman & Hall.

Evans, J. S., Hudak, A. T., Faux, R. & Smith, M. (2009). Discrete return lidar in natural resources: Recommendations for project planning, data processing, and deliverables, *Remote Sensing* 1: 776–794.

Falkowski, M. J., Hudak, A. J., Crookston, N. L., Gessler, P. E., Uebler, E. H. & Smith, A. M. S. (2010). Landscape-scale parameterization of a tree-level forest growth model: a k-nearest neighbor imputation approach incorporating lidar data, *Canadian Journal of Forest Research* 40: 184–199.

Finley, A., Ek, A. R., Bai, Y. & Bauer, M. E. K. (2003). Nearest neighbour estimation of forest attributes: Improving mapping efficiency, *Proceedings of the fifth Annual Forest Inventory and Analysis Symposium*, pp. 61–68.

Finley, A. O. & McRoberts, R. E. (2008). Efficient k-nearest neighbour searches for multi-source forest attribute mapping, *Remote Sensing of Environment* 112: 2203–2211.

Foster, J., Kingdon, C. & Townsend, P. (2002). Predicting tropical forest carbon from eo-1 hyperspectral imagery in noel kempff mercado national park, bolivia, *IEEE International Geoscience and Remote Sensing Symposium, 2002. IGARSS '02*, Vol. 6, pp. 2318–2322.

Franco-Lopez, H., Ek, A. R. & Bauer, M. E. (2001). Estimation and mapping of forest stand density, volume, and cover type using the k-nearest neighbours method, *Remote Sensing of Environment* 77: 251–274.

Franklin, J. (1986). Thematic mapper analysis of coniferous forest structure and composition, *International Journal of Remote Sensing* 7: 1287–1301.

Fransson, J., Gustavsson, A., Ulander, L. & Walter, F. (2000). Towards an operational use of vhf sar data for forest mapping and forest management, *in* T. Stein (ed.), *Proceedings of IGARSS 2000*, IEEE, Piscataway, NJ, pp. 399–401.

Fransson, J., Magnusson, M. & Holmgren, J. (2004). Estimation of forest stem volume using optical spot-5 satellite and laser data in combination, *Proceedings of IGARSS 2004*, pp. 3108–3110.

Gebreslasie, M. T., Ahmed, F. B. & Van Aardt, J. (2010). Predicting forest structural attributes using ancillary data and aster satellite data, *International Journal of Applied Earth Observation and Geoinformation* 125: 523–526.

Ghosh, M. & Rao, J. N. K. (1994). Small area estimation: An appraisal, *Statistical Science* 9(1): 55–76.

Gonzalez-Alonso, F., Marino-De-Miguel, S., Roldan-Zamarron, A., Garcia-Gigorro, S. & Cuevas, J. M. (2006). Forest biomass estimation through ndvi composites: the role of remotely sensed data to assess spanish forests as carbon sinks, *International Journal of Remote Sensing* 27(24): 5409–5415.

Guo, X. J. A. (2005). *Climate-sensitive analysis of lodgepole pine site index in alberta*, Master's thesis, Dept. of Mathematics and Statistics, Concordia University, Montreal, Canada.

Haapanen, R., Ek, A. R., Bauer, M. E. & Finley, A. O. (2004). Delineation of forest/nonforest land use classes using nearest neighbour methods, *Remote Sensing of Environment* 89: 265–271.

Haralick, R. M. (1979). Statistical and structural approaches to texture, *Proceedings of the IEEE* 67(5): 786–804.

Härdle, W. (1990). *Econometric society monographs*, Cambridge University Press, chapter Applied nonparametric regression.

Härdle, W., Müller, M., Sperlich, S. & Werwatz, A. (2004). *Non-parametric and semiparametric models*, Springer, New York.

Heinzel, J., Weinacker, H. & Koch, B. (2008). Full automatic detection of tree species based on delineated single tree crowns – a data fusion approach for airborne laser scanning data and aerial photographs, *Proceedings of SilviLaser 2008*, Edinburgh, UK, pp. 76–85.

Heurich, M. & Thoma, F. (2008). Estimation of forestry stand parameters using laser scanning data in temperate, structurally rich natural european beech (fagus sylvatica) and norway spruce (picea abies) forests, *Forestry* 81(5): 645–661.

Holmström, H. (2002). Estimation of single tree characteristics using the knn method and plotwise aerial photograph interpretations, *Forest Ecology and Management* 167: 303–314.

Holmström, H. & Fransson, E. S. (2003). Combining remotely sensed optical and radar data in knn estimation of forest variables, *Forest Science* 49(3): 409–418.

Hudak, A., Crookston, N., Evans, J., Hall, D. & Falkowski, M. (2008). Nearest neighbour imputation of species-level, plot-scale forest structure attributes from lidar data, *Remote Sensing of Environment* 112: 2232–2245.

Hudak, A. T., Crookston, N. L., Evans, J. S., Falkowski, M. J., Smith, A. M. S. & Gessler, P. (2006). Regression modeling and mapping of coniferous forest basal area and tree density from discrete-return lidar and multispectral satellite data, *Canadian Journal of Remote Sensing* 32: 126–138.

Hyyppä, J., Hyyppä, H., Leckie, D., Gougeon, F., Yu, X. & Maltamo, M. (2008). Review of methods of small-footprint airborne laser scanning for extracting forest inventory data in boreal forests, *International Journal of Remote Sensing* 29(5): 1339–1366.

Imhoff, M. (1995). Radar backscatter and biomass saturation: ramifications for global biomass inventory, *IEEE Transactions on Geoscience and Remote Sensing* 33(2): 510–518.

Iverson, L. R., Cook, E. A. & Graham, R. L. (1994). Regional forest cover estimation via remote sensing: the calibration center concept, *Landscape Ecology* 9(3): 159–174.

Jochem, A., Hollaus, M., Rutzinger, M. & Höfle, B. (2011). Estimation of aboveground biomass in alpine forests: A semi-empirical approach considering canopy transparency derived from airborne lidar data, *Sensors* 11: 278–295.

Katila, M. & Tomppo, E. (2002). Stratification by ancillary data in multisource forest inventories employing k-nearest neighbour estimation, *Canadian Journal of Forest Research* 32: 1548–1561.

Kilkki, P. & Päivinen, R. (1987). Reference sample plots to combine field measurements and satellite data in forest inventory, *Remote Sensing-Aided Forest Inventory. Proceedings of Seminars organised by SNS, 10–12 Dec. 1986, Hyytiälä, Finland. Research Notes No 19*, Department of Forest Mensuration and Management, University of Helsinki.

Kimmins, J. (1996). *Forest ecology*, Macmillan Inc., New York.

Koch, B. (2010). Status and future of laser scanning, synthetic aperture radar and hyperspectral remote sensing data for forest biomass assessment, *ISPRS Journal of Photogrammetry and Remote Sensing* 65: 581–590.

Koch, B., Straub, C., Dees, M., Wang, Y. & Weinacker, H. (2009). Airborne laser data for stand delineation and information extraction, *International Journal of Remote Sensing* 30(4): 935–963.

Korhonen, L., Peuhkurinen, J., Malinen, J., Suvanto, A., Maltamo, M., Packalén, P. & Kangas, J. (2008). The use of airborne laser scanning to estimate sawlog volumes, *Forestry* 81(4): 499–510.

Kutzer, C. (2008). *Potential of the kNN Method for Estimation and Monitoring of off-Reserve Forest Resources in Ghana*, PhD thesis, Faculty of Forest and Environmental Studies, University of Freiburg.

Latifi, H. & Koch, B. (2011). Generalized spatial models of forest structure using airborne multispectral and laser scanner data, *Proceedings of ISPRS Workshop: High Resolution Earth Imaging for Geospatial Information*, Vol. XXXVIII-4/W19 of *International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences*, Hannover, Germany.

Latifi, H., Nothdurft, A. & Koch, B. (2010). Non-parametric prediction and mapping of standing timber volume and biomass in a temperate forest: application of multiple optical/lidar-derived predictors, *Forestry* 83(4): 395–407.

Latifi, H., Nothdurft, A., Straub, C. & Koch, B. (2011). Modelling stratified forest attributes using optical/lidar features in a central european landscape, *International Journal of Digital Earth*, DOI:10.1080/17538947.2011.583992.

LeMay, V. & Temesgen, H. (2005). Comparison of nearest neighbour methods for estimating basal area and stems per hectare using aerial auxiliary variables, *Forest Science* 51(2): 109–119.

Liaw, A. & Wiener, M. (2002). Classification and regression by randomforest, *R News* 2: 18–22.

Lim, K., Treitz, P., Wulder, M., St-Onge, B. & Flood, M. (2003). Lidar remote sensing of forest structure, *Progress in Physical Geography* 27(1): 88–106.

MacLean, G. & Krabill, W. (1986). Gross merchantable timber volume estimation using an airborne lidar system, *Canadian Journal of Remote Sensing* 12: 7–18.

Maltamo, M., Bollandsås, O. M., Vauhkonen, J., Breidenbach, J., Gobakken, T. & Næsset, E. (2010). Comparing different methods for prediction of mean crown height in norway spruce stands using airborne laser scanner data, *Forestry* 83(3): 257–268.

Maltamo, M. & Eerikäinen, K. (2001). The most similar neighbour reference in the yield prediction of pinus kesiya stands in zambia, *Silva Fennica* 35(4): 437–451.

Maltamo, M., Eerikäinen, K., Packalén, P. & Hyyppä, J. (2006). Estimation of stem volume using laser scanning-based canopy height metrics, *Forestry* 79(2): 217–229.

Maltamo, M., Hyyppä, J. & Malinen, J. (2006). A comparative study of the use of laser scanner data and field measurements in the prediction of crown height in boreal forests, *Scandinavian Journal of Forest Research* 21: 231–238.

Maltamo, M., Malinen, J., Packalén, P., Suvanto, A. & Kangas, J. (2006). Non-parametric estimation of stem volume using airborne laser scanning, aerial photography, and stand-register data, *Canadian Journal of Forest Research* 36: 426–436.

Maltamo, M., Næsset, E., Bollandsås, O., Gobakken, T. & Packalén, P. (2009). Non-parametric prediction of diameter distribution using airborne laser scanner data, *Scandinavian Journal of Forest Research* 24: 541–553.

McElhinny, C., Gibbons, P., Brack, C. & Bauhus, J. (2005). Forest and woodland stand structural complexity: Its definition and measurement, *Forest Ecology and Management* 218: 1–24.

McInerney, D. O., Suarez-Minguez, J., Valbuena, R. & Nieuwenhuis, M. (2010). Forest canopy height retrieval using lidar data, medium resolution satellite imagery and knn estimation in aberfoyle, scotland, *Forestry* 83(2): 195–206.

McRoberts, R. E. & Tomppo, E. O. (2007). Remote sensing support for national forest inventories, *Remote Sensing of Environment* 110: 412–419.

Mäkelä, H. & Pekkarinen, A. (2004). Estimation of forest stand volumes by landsat tm imagery and stand-level field-inventory data, *Forest Ecology and Management* 196: 245–255.

Moeur, M. & Stage, A. R. (1995). Most similar neighbour: An improved sampling inference procedure for natural resource planning, *Forest Science* 41: 337–359.

Mohammadi, J. & Shataee, S. (2010). Possibility investigation of tree diversity mapping using landsat etm+ data in the hyrcanian forests of iran, *Remote Sensing of Environment* 104(7): 1504–1512.

Muukkonen, P. & Heiskanen, J. (2007). Biomass estimation over a large area based on standwise forest inventory data and aster and modis satellite data: A possibility to verify carbon inventories, *Remote Sensing of Environment* 107: 607–624.

Nelson, R., Krabill, W. & Tonelli, J. (1988). Estimating forest biomass and volume using airborne laser scanner data, *Remote Sensing of Environment* 24(2): 247–267.

Straub, C., Weinacker, H. & Koch, B. (2010). A comparison of different methods for

Thessler, S., Sesnie, S., Bendana, Z., Ruokolainen, K., Tomppo, E. & Finegan, B. (2008). Using

Tomppo, E. (1991). Satellite image-based national forest inventory of finland, *International*

Tomppo, E., Gagliano, C., De Natale, F., Katila, M. & McRoberts, R. E. (2009). Predicting

Tomppo, E. & Halme, M. (2004). Using coarse scale forest variables as ancillary information

Tomppo, E., Korhonen, K. T., Heikkinen, J. & Yli-Kojola, H. (2001). Multi-source inventory

Treuhaft, R. N., Asner, G. P. & Law, B. E. (2003). Structure-based forest biomass from fusion of radar and hyperspectral observations, *Geophysical Research Letters* 30(9): 1472. Tyrrell, L. & Crow, T. (1994). Structural characteristics of old-growth hemlock-hardwood

Uuttera, J., Maltamo, M. & Hotanen, J. (1997). The structure of forest stands in virgin

Van Den Meersschaut, D. & Vandekerkhove, K. (1998). Development of a standscale forest

Vauhkonen, J., Korpela, I., Maltamo, M. & Tokola, T. (2010). Imputation of single-tree

Vohland, M., Stoffels, J., Hau, C. & Schüler, G. (2007). Remote sensing techniques for

Wallerman, J. & Holmgren, J. (2007). Estimating field-plot data of forest stands using airborne laser scanning and spot hrg data, *Remote Sensing of Environment* 110: 501–508. Wehr, A. & Lohr, O. (1999). Airborne laser scanningUan introduction and overview, ˚ *ISPRS*

Wood, S. (2006). *Generalized additive models: an introduction with R*, Chapman & Hall/CRC,

over northern costa rica, *Remote Sensing of Environment* 112: 2485– 2494. Tommola, M., Tynkkynen, M., Lemmetty, J., Herstela, P. & Sikanen, L. (1999). Estimating the

orthophotos, *European Journal of Forest Research* 129: 1069–1080.

*Archives of Photogrammetry and Remote Sensing* 28 (7-1): 419U 424. ˝ Tomppo, E. (1993). Multi-source national forest inventory of finland, *in* J. R. A. Nyysso´

landsat imagery, *Remote Sensing of Environment* 113(3): 500–517.

*Forest Engineering* pp. 75–81.

*Sensing of Environment* 92: 1–20.

forests in relation to age, *Ecology* 75(2): 370–386.

metrics, *Remote Sensing of Environment* 114: 1263–1276.

*Journal of Photogrammetry and Remote Sensing* 54: 68–82.

*Ecology and Management* 96: 125–138.

analysis, *Silva Fennica* 41(3): 441–456.

Idaho, USA, pp. 340–34.

Boca Raton, Florida.

35(3): 309U328. ˝

forest resource estimation using information from airborne laser scanning and cir

k-nn and discriminant analyses to classify rain forest types in a landsat tm image

characteristics of a marked stand using k-nearest- neighbour regression, *Journal of*

S. Poso (ed.), *Proceedings of Ilvessalo symposium on national forest inventories*, p. 53 U 61. ˝

categorical forest variables using an improved k-nearest neighbour estimator and

and weighting of variables in k-nn estimation: a genetic algorithm approach, *Remote*

of the forests of the hebei forestry bureau, heilongjiang, china, *Silva Fennica*

and managed peat-lands: a comparison between finnish and russian keralia, *Forest*

biodiversity index based on the state forest inventory, *in* M. Hansen & T. Burk (eds), *Integrated Tools for Natural Resources Inventories in the 21st Century*, USDA, Boise,

attributes using airborne laser scanning-based height, intensity, and alpha shape

forest parameter assessment: Multispectral classification and linear spectral mixture

lnen,


24 Will-be-set-by-IN-TECH

Nothdurft, A., Soborowski, J. & Breidenbach, J. (2009). Spatial prediction of forest stand

Næsset, E. (2002). Predicting forest stand characteristics with airborne scanning laser

Næsset, E. (2004). Practical large-scale forest stand inventory using a small airborne scanning

Packalén, P. & Maltamo, M. (2006). Predicting the plot volume by tree species using airborne

Packalén, P. & Maltamo, M. (2007). The k-msn method for the prediction of species-specific

Packalén, P. & Maltamo, M. (2008). Estimation of species-specific diameter distributions using

Pesonen, A., Maltamo, M., Packalén, P. & Eerikäinen, K. (2008). Airborne laser scanning-based

Päivinen, R., Van Brusselen, J. & Schuck (2009). A the growing stock of european forests using

Rahman, M., Csaplovics, E. & Koch, B. (2007). An efficient regression strategy for extracting

Schlerf, M., Atzberger, C. & Hill, J. (2005). Remote sensing of forest biophysical variables using hymap imaging spectrometer data, *Remote Sensing of Environment* 95(2): 177–194. Sexton, J. O., Bax, T., Siquiera, P., Swenson, J. J. & Hensley, S. (2009). comparison of lidar,

southeastern north america, *Forest Ecology and Management* 257: 1136U1147.

Stage, A. R. & Crookston, N. L. (2007). Partitioning error components for accuracy-assessment of near- neighbor methods of imputation, *Forest Science* 53(1): 62?72. Stümer, W. . D. (2004). *Kombination vor terrestischen Aufnahmen und Fernerkundungsdaten mit*

Stoffels, J. (2009). *Einsatz einer lokal adaptiven Klassifikationsstrategie zur satellitengestützten*

Stone, J. & Porter, J. (1998). What is forest stand structure and how to measure it?, *Northwest*

Straub, C., Dees, M., Weinacker, H. & Koch, B. (2009). Using airborne laser scanner data

Straub, C. & Koch, B. (2011). Estimating single tree stem volume of pinus sylvestris

Geography/Geesciences, University of Trier.

*Fernerkundung, GeoInformation* 3/2009: 277–287.

remote sensing and forest inventory data, *Forestry* 82(5): 479–490.

using a practical two-stage procedure and field data, *Remote Sensing of Environment*

stand attributes using airborne laser scanning and aerial photographs, *Remote Sensing*

airborne laser scanning and aerial photographs, *Canadian Journal of Forest Research*

prediction of coarse woody debris volumes in a conservation area, *Forest Ecology and*

forest biomass information from satellite sensor data, *International Journal of Remote*

radar, and field measurements of canopy height in pine and hardwood forests of

*Hilfe der kNN-Methode zur Klassifizierung und Kartierung von Wäldern*, PhD thesis, Fakultät für Forst-, Geo- und Hydrowissenschaften der Technischen Universität

*Waldinventur in einem heterogenen Mittelgebirgsraum.*, PhD thesis, Faculty of

and cir orthophotos to estimate the stem volume of forest stands, *Photogrammetrie,*

using airborne laser scanner and multispectral line scanner data, *Remote Sensing*

˝

variables, *European Journal of Forest Research* 128(3): 241–251.

laser, *Scandinavian Journal of Forest Research* 19: 164–179.

Oliver, C. & Larson, B. (1996). *Forest Stand Dynamics*, McGraw-Hill Inc., New York.

laser scanning and aerial photographs, *Forest Science* 52(6): 611–622.

80(1): 88–99.

38: 1750–1760.

Dresden.

*Science* 72(2): 25–26.

3(5): 929–944.

*of Environment* 109: 328–341.

*Management* 255: 3288–3296.

*Sensing* 26(7): 1511–1519.




## **Fusion of Optical and Thermal Imagery and LiDAR Data for Application to 3-D Urban Environment and Structure Monitoring**

Anna Brook1, Marijke Vandewal1 and Eyal Ben-Dor2

*1Royal Military Academy, CISS Department, Brussels, Belgium*
*2Remote Sensing Laboratory, Department of Geography and Environment, Tel-Aviv University, Tel-Aviv, Israel*

## **1. Introduction**


For many years, panchromatic aerial photographs have been the main source of remote sensing data for detailed inventories of urban areas. Traditionally, building extraction has relied mainly on manual photo-interpretation, which is an expensive process, especially when a large amount of data must be processed (Ameri, 2000). The characterization of a given object is based on its visible information, such as: shape (external form, outline, or configuration), size, pattern (spatial arrangement of an object into distinctive forms), shadow (which indicates outlines and length and is useful for measuring heights or terrain slopes), and tone (color or brightness of an object, smoothness of the surface, etc.) (Ridd 1995). Automated assessment of urban surface characteristics has been investigated because of the high cost of visual interpretation. Most of those studies used multispectral satellite imagery of medium to low spatial resolution (Landsat-TM, SPOT-HRV, IRS-LISS, ALI and CHRIS-PROBA) and were based on common image-analysis techniques (e.g. maximum likelihood (ML) classification, principal components analysis (PCA) or spectral indices (Richards and Jia 1999)). The problems of limited spatial resolution over urban areas have been overcome by the wider availability of space-borne systems characterized by a large swath and high spatial and temporal resolution (e.g. WorldView-2). However, the limited spectral information on non-vegetative materials renders their exact identification difficult. In this regard, hyperspectral remote sensing (HRS) technology, using data from airborne sensors (e.g. AVIRIS, GER, DAIS, HyMap, AISA-Dual), has opened up a new frontier for surface differentiation of homogeneous materials based on spectral characteristics (Heiden et al. 2007). This capability also offers the potential to extract quantitative information on biochemical, geochemical and chemical parameters of the targets in question (Roessner et al. 1998).
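Maximum-likelihood (ML) classification, mentioned above as a common image-analysis technique, assigns each pixel to the class whose multivariate Gaussian model gives it the highest likelihood. A minimal sketch with synthetic two-band class statistics (all means, covariances and priors here are illustrative, not taken from the chapter's data):

```python
import numpy as np

def ml_classify(pixels, means, covs, priors):
    """Assign each pixel to the class with the highest Gaussian log-likelihood."""
    scores = np.empty((pixels.shape[0], len(means)))
    for k, (mu, cov, prior) in enumerate(zip(means, covs, priors)):
        diff = pixels - mu                                     # (N, bands)
        maha = np.einsum('ij,jk,ik->i', diff, np.linalg.inv(cov), diff)
        log_det = np.linalg.slogdet(cov)[1]
        scores[:, k] = np.log(prior) - 0.5 * (maha + log_det)
    return np.argmax(scores, axis=1)

# Illustrative two-band class statistics (hypothetical values)
means = [np.array([0.2, 0.4]), np.array([0.6, 0.1])]   # e.g. "vegetation", "asphalt"
covs = [0.01 * np.eye(2), 0.01 * np.eye(2)]
priors = [0.5, 0.5]

# A pixel located at each class mean is assigned to that class
labels = ml_classify(np.vstack(means), means, covs, priors)   # -> [0, 1]
```

The per-class score is the log of the prior times the Gaussian density, with constant terms dropped; equal priors reduce this to minimum Mahalanobis distance.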

The most common approach to characterizing urban environments from remote sensing imagery is land-use classification, i.e. assigning all pixels in the image to mutually exclusive classes such as residential, industrial, recreational, etc. (Ridd 1995, Price 1998). In contrast, mapping the urban environment in terms of its physical components preserves the heterogeneity of urban land cover better than traditional land-use classification (Jensen & Cowen 1999), characterizes urban land cover independently of analyst-imposed definitions, and more accurately captures changes over time (Rashed et al. 2001).

Hyperspectral thermal infrared (TIR) remote sensing has advanced rapidly with the development of airborne systems, following years of laboratory studies (Hunt & Vincent 1968, Conel 1969, Vincent & Thomson 1972, Logan et al. 1975, Salisbury et al. 1987). The radiance emitted from a surface in the thermal infrared (4–13 µm) is a function of its temperature and emissivity. Emittance and reflectance are complex processes that depend not only on the absorption coefficient of materials but also on their refractive index, physical state and temperature. Most urban built-environment studies take both temperature and emissivity variations into account, since these relate to target identification, mapping and monitoring, and provide a means for practical application.
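The stated dependence of emitted TIR radiance on temperature and emissivity can be made concrete with Planck's law scaled by emissivity, L(λ, T) = ε · 2hc²/λ⁵ · 1/(exp(hc/λk_BT) − 1). A small sketch (the emissivity values below are illustrative assumptions, not measurements):

```python
import math

H = 6.62607015e-34   # Planck constant [J s]
C = 2.99792458e8     # speed of light [m/s]
KB = 1.380649e-23    # Boltzmann constant [J/K]

def emitted_radiance(wavelength_um, temp_k, emissivity):
    """Emitted spectral radiance eps * B(lambda, T) in W m^-2 sr^-1 um^-1."""
    lam = wavelength_um * 1e-6                   # micrometres -> metres
    planck = (2.0 * H * C**2) / (lam**5 * (math.exp(H * C / (lam * KB * temp_k)) - 1.0))
    return emissivity * planck * 1e-6            # per metre -> per micrometre

# Same temperature (300 K, 10 um), different emissivities -> different radiance,
# which is why temperature cannot be retrieved without an emissivity estimate.
r_high_eps = emitted_radiance(10.0, 300.0, 0.95)   # e.g. an asphalt-like surface
r_low_eps = emitted_radiance(10.0, 300.0, 0.30)    # e.g. a bare metal roof
```

Two surfaces at identical kinetic temperature thus emit very different radiance, which is the core of the temperature–emissivity separation problem mentioned above.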

Hyperspectral thermal imagery provides the ability to map and monitor the temperatures of man-made materials. The urban heat island (UHI) is one of the most studied and best-known phenomena of urban climate investigated by thermal imagery (Carlson et al., 1981; Vukovich, 1983; Kidder & Wu, 1987; Roth et al., 1989; Nichol, 1996). Early studies reported similarities between spatial patterns of air temperature and remotely sensed surface temperature (Henry et al., 1989; Nichol 1994), whereas later studies suggest significant differences, including in the time of day and season of maximum UHI development and in the relationship between land use and UHI intensity (Roth et al., 1989). Recent high-resolution airborne systems determine the thermal performance of buildings, which can be used to identify heating and cooling losses due to poor construction, missing or inadequate insulation, and moisture intrusion.

The spectral (reflective and thermal) characteristics of urban surfaces are known to be rather complex, as these surfaces are composed of many materials. Given the high degree of spatial and spectral heterogeneity within the various artificial and natural land-cover categories, the application of remote sensing technology to mapping built urban environments requires specific attention to both the 3-D and spectral domains (Segl et al. 2003). Segl et al. confirm that profiling hyperspectral TIR can successfully identify and discriminate a variety of silicates and carbonates, as well as variations in the chemistry of some silicates. The integration of VNIR-SWIR and TIR results can provide useful information to remove possible ambiguities in the interpretation of unmixed sub-pixel surfaces and materials. Image interpretation is based on thematic categories (Roessner et al. 2001), which are defined by the rules of urban mapping and land use.

The ultimate aim of photogrammetry in generating an urban landscape model is to show the objects in an urban area in 3-D (Juan et al. 2007). As the most permanent features in the urban environment, the accurate extraction of buildings and roads is significant for urban planning and cartographic mapping. Acquisition and integration of data for the built urban environment has always been a challenge due to the high cost and heterogeneous nature of the data sets (Wang 2008). Thus, over the last few years, LiDAR (LIght Detection And Ranging) has been widely applied in the fields of photogrammetry and urban 3-D analysis (Tao 2001, Zhou 2004). The airborne LiDAR technique provides a geo-referenced dense 3-D point "cloud" measured roughly perpendicular to the direction of flight over a reflective surface on the ground. The system integrates three basic data-collection tools: a laser scanner, a global positioning system (GPS) and an inertial measuring unit (IMU). The position and attitude of the system are determined by GPS/INS; the raw data are therefore collected in the GPS reference system WGS 84.

Generally, 3-D urban built environment models are created using CAD (computer-aided design) tools. There have been many successful projects that have produced detailed and realistic 3-D models for a diverse range of cities (Dodge et al. 1998, Bulmer 2001, Jepson et al. 2001). These city models were created with accurate building models compiled with orthophotographs and exhibited an impressive, realistic urban environment (Chan et al. 1998). However, the creation of 3-D city models using CAD tools and orthophotographs faces some challenges: it is time-consuming and expensive.

The analysis of InSAR (Interferometric Synthetic Aperture Radar) and SAR (Synthetic Aperture Radar) data for urban built targets has several important benefits, such as the ability to adopt numerical tools and the ability to provide results resembling the real-world situation. In addition, a relation can be found between target geometry and the measured scattering, and height-retrieval algorithms can be developed according to target-scattering properties. The limitation of this method is that the targets in urban models have to be as detailed as possible; otherwise the results obtained in the modeled environment will not be reliable (Margarit et al. 2007).

The use of 3-D high-spatial-resolution applications in urban built environments is a mainstay of architecture and engineering practice. However, engineering practices are increasingly incorporating different data sets and alternative dissemination systems. Understanding, modeling and forecasting the trends in urban environments are important for recognizing and assessing the impact of urbanization for resource managers and urban planners. Many applications are suitable sources of reliable information on the multiple facets of the urban environment (Jensen & Cowen 1999, Donnay et al. 2001, Herold et al. 2003). These models have provided simulations of urban dynamics and an understanding of the patterns and processes associated with urbanization (Herold et al. 2005). However, the complexity of urban systems makes it difficult to adequately address changes using a single data type or analysis approach (Allen & Lu 2003).

This chapter presents techniques for data fusion and data registration. The ability to include an accurate and realistic 3-D position, quantitative spectral information, thermal properties and temporal changes provides a near-real-time monitoring system for photogrammetric and urban-planning purposes. The method focuses on the registration of multi-sensor and multi-temporal information for 3-D urban environment monitoring applications. Generally, data registration is a critical pre-processing procedure in all remote-sensing applications that utilize multiple sensor inputs, including multi-sensor data fusion, temporal change detection, and data mosaicking. The main objective of this research is a fully controlled, near-real-time, natural and realistic monitoring system for an urban environment. This task led us first to combine the image-processing and map-matching procedures, and then to incorporate remote sensing and GIS tools into an integrative method for data fusion and registration. To support this new data model, traditional spatial databases were extended to support 5-D data.

This chapter is organized as follows. Section 2 describes the materials and methods, which are implemented in the 3-D urban environment model presented in Section 3. Section 4 addresses the generic 3-D urban application, which involves data fusion and contextual information of the environment.
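Data registration of the kind described above is commonly reduced to estimating a geometric transform from corresponding control (tie) points in two images. A minimal, hypothetical sketch (not the authors' pipeline) fitting a 2-D affine transform by least squares:

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares 2-D affine transform mapping src tie points onto dst.

    src, dst: (N, 2) arrays of corresponding control points, N >= 3.
    Returns A of shape (2, 3) such that dst ~ A @ [x, y, 1].
    """
    design = np.hstack([src, np.ones((src.shape[0], 1))])   # (N, 3) homogeneous coords
    coeffs, *_ = np.linalg.lstsq(design, dst, rcond=None)   # (3, 2) solution
    return coeffs.T

# Synthetic tie points under a known scale-and-shift transform
src = np.array([[0, 0], [1, 0], [0, 1], [1, 1]], dtype=float)
true_a = np.array([[2.0, 0.0, 5.0],
                   [0.0, 2.0, -3.0]])
dst = np.hstack([src, np.ones((4, 1))]) @ true_a.T
est = fit_affine(src, dst)   # recovers true_a up to floating-point error
```

With more than three tie points the system is overdetermined, and the least-squares fit averages out small localization errors in the individual control points.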

Fusion of Optical and Thermal Imagery and LiDAR Data for

the frame rate is 33 fps with adjustable spectral sampling.

are processed by applying the surface-based clustering methods.

**2.3.1 Hyperspectral airborne and ground imagery** 

assessed and quantified for each mission.

(Brook & Ben-Dor 2011b).

infrared camera (FLIR Systems, Inc.).

**2.3 Data processing** 


## **2. Materials and methods**

## **2.1 Study area**

Two separate datasets were utilized in this study. The first dataset was acquired over a suburban Mediterranean area on 10 Oct 2006, at 03h37 UTC and at 11h20 UTC. This area combines natural and engineered terrain (average elevation of 560 m above sea level), with a hill in the north of the studied polygon and a valley in the center. The entire scene consists of rows of terraced houses located at the center of the image. The neighborhood consists of cottage houses (two and three floors) with tile roofs, flat white-colored concrete roofs and balconies, asphalt roads and parking lots, planted and natural vegetation, gravel paths and bare brown forest soil. The height of large buildings ranges from 8 to 16 m. A group of tall pine trees of various heights and shapes lines the streets, and Mediterranean forest can be found in the corner of the scene.

The second dataset was acquired over an urban settlement on 15 Aug 2007, at 02h54 UTC and at 12h30 UTC. This area combines natural, agricultural and engineered terrain (average elevation of 30 m above sea level). The settlement consists of houses (two and three floors) and public buildings (schools and municipal buildings) with flat concrete, asphalt or whitewash roofing, asphalt roads and parking lots, planted and natural vegetation, gravel paths, bare brown-reddish Mediterranean and agricultural soils, greenhouses and whitewashed henhouse roofing. The height of large buildings ranges from 3 to 21 m.

## **2.2 Data-acquisition systems**

The research combines airborne and ground data collected from different platforms and different operating systems. The collected imagery data were validated and compared against ground truth in situ measurements collected during the campaigns.

The first airborne platform carries the AISA-Dual hyperspectral system. The airborne imaging spectrometer AISA-Dual (Specim Ltd.) is a dual hyperspectral pushbroom system that combines the Aisa EAGLE (VNIR region) and Aisa HAWK (SWIR region) sensors. For the selected campaigns, the sensor simultaneously acquired images in 198 contiguous spectral bands, covering the 0.4 to 2.5 µm spectral region with bandwidths of ~10 nm for the Aisa EAGLE and ~5 nm for the Aisa HAWK. The sensor altitude was 10,000 ft, providing a 1.6 m spatial resolution for 286 pixels in the cross-track direction. A standard AISA-Dual data set is a 3-D data cube in a non-earth coordinate system (raw matrix geometry).
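As a quick sanity check on these figures, the implied swath width and instantaneous field of view follow from simple geometry (the IFOV is derived here, not stated in the text):

```python
# Back-of-envelope geometry for the AISA-Dual configuration quoted above;
# only altitude, GSD and pixel count come from the text.
altitude_m = 10_000 * 0.3048          # 10,000 ft in metres
gsd_m = 1.6                           # quoted cross-track ground sample distance
n_pixels = 286                        # cross-track pixels

swath_m = n_pixels * gsd_m            # ground swath of one scan line
ifov_mrad = gsd_m / altitude_m * 1e3  # implied instantaneous field of view

print(round(swath_m, 1), round(ifov_mrad, 3))  # 457.6 0.525
```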

The second airborne platform carries a hyperspectral TIR system, a line-scanner with 28 spectral bands in the thermal ranges of 3-5 μm and 8-13 μm. It has 328 pixels in the cross-track direction and hundreds of pixels in the along-track direction, with a spatial resolution of 1.4 m.

The third airborne platform carries the LiDAR system. This system operates at a 1500 nm wavelength with a 165 kHz laser repetition rate and a 100 Hz scanning rate, and provides a spatial/footprint resolution of 0.5 m and an accuracy of 0.1 m. The scanner has a multi-pulse system that can record up to five different returns, but in this study, only the first return was recorded and analyzed.

The ground spectral camera HS (Specim Ltd.) is a pushbroom scan camera that integrates an ImSpector imaging spectrograph and an area monochrome camera. The camera's sensitive, high-speed interlaced CCD (charge-coupled device) detector simultaneously acquires images in 850 contiguous spectral bands, covering the 0.4 to 1 µm spectral region with bandwidths of 2.8 nm. The spatial resolution is 1600 pixels in the cross-track direction, and the frame rate is 33 fps with adjustable spectral sampling.

The ground truth reflectance data were measured for the calibration/validation targets by the ASD "FieldSpec Pro" (ASD Inc., Boulder, CO) VNIR-SWIR spectrometer. Internally averaged scans were 100 ms each. The wavelength-dependent signal-to-noise ratio (S/N) was estimated by taking repeated measurements of a Spectralon white-reference panel over a 10 min interval and analyzing the spectral variation across this period. For each sample, three spectral replicates were acquired, and their average was used as the representative spectrum. The ground truth thermal data were collected by a thermometer and thermocouples installed within the calibration/validation targets (water bodies), and by a thermal radiometer infrared camera (FLIR Systems, Inc.).
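That repeat-measurement S/N estimate amounts to dividing the mean spectrum by its standard deviation over the repeats; a minimal numpy sketch (the function name and the synthetic panel scans are illustrative, not from the chapter):

```python
import numpy as np

def estimate_snr(spectra):
    """Per-wavelength S/N from repeated white-reference spectra.

    spectra: (n_repeats, n_bands) array of scans taken over the measurement
    interval (e.g. 10 min of Spectralon scans).  Returns an (n_bands,) array:
    mean signal divided by its sample std over the repeat dimension.
    """
    spectra = np.asarray(spectra, dtype=float)
    return spectra.mean(axis=0) / spectra.std(axis=0, ddof=1)

# Example: 20 repeats of a 2151-band spectrum with ~1% multiplicative noise
rng = np.random.default_rng(0)
true_spectrum = np.linspace(0.9, 1.0, 2151)   # idealized white-reference panel
scans = true_spectrum * (1 + 0.01 * rng.standard_normal((20, 2151)))
snr = estimate_snr(scans)
print(snr.shape)  # (2151,)
```

With 1% noise the per-band S/N clusters around 100, as expected.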

## **2.3 Data processing**


This research integrates multi-sensor (airborne sensor, ground camera and field devices) and multi-temporal information into a fully operational monitoring application. The aim of this subsection is to present several techniques for imagery and LiDAR data processing.

The classification approaches for airborne and ground hyperspectral imagery are presented first. The radiance measured by these sensors strongly depends on atmospheric conditions, which might bias the results of material identification/classification algorithms that rely on hyperspectral image data. The desire to relate imagery data to intrinsic surface properties has led to the development of atmospheric correction algorithms that attempt to recover surface reflectance or emission from at-sensor radiance. Secondly, the LiDAR data are processed by applying surface-based clustering methods.

## **2.3.1 Hyperspectral airborne and ground imagery**

Accurate spectral reflectance information is a key factor in retrieving correct thematic results. In general, the quality of hyperspectral remote sensing (HRS) sensors varies from very high to moderate (and even very poor) in terms of signal-to-noise ratio, radiometric accuracy and sensor stability. Instability of the sensors' radiometric performance (stripes, saturation, etc.) might be caused by either known or unknown factors encountered during sensor transport, installation and/or even data acquisition. As part of data pre-processing, these distortions have to be assessed and quantified for each mission.
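The chapter does not specify how striping is corrected; one common remedy is column-wise moment matching, sketched below for a single pushbroom band (an illustrative stand-in, not the authors' method):

```python
import numpy as np

def destripe_moment_match(band, window=31):
    """Column-wise moment matching for pushbroom striping.

    Stripes show up as whole cross-track columns with drifted gain/offset.
    Each column is rescaled so its mean and std match a local reference
    taken from a moving window of neighbouring columns.
    band: 2-D array shaped (along_track, cross_track).
    """
    band = np.asarray(band, dtype=float)
    col_mean = band.mean(axis=0)
    col_std = band.std(axis=0)

    # edge-normalized moving averages of the per-column statistics
    ones = np.ones(window)
    norm = np.convolve(np.ones_like(col_mean), ones, mode="same")
    ref_mean = np.convolve(col_mean, ones, mode="same") / norm
    ref_std = np.convolve(col_std, ones, mode="same") / norm

    gain = np.where(col_std > 0, ref_std / col_std, 1.0)
    return (band - col_mean) * gain + ref_mean

# Synthetic check: every 8th cross-track column has a 20% gain error
rng = np.random.default_rng(0)
scene = rng.normal(100, 5, (200, 64))
gains = np.ones(64)
gains[::8] = 1.2
striped = scene * gains
fixed = destripe_moment_match(striped)
print(np.std(striped.mean(axis=0)) > np.std(fixed.mean(axis=0)))  # True
```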

A full-chain atmospheric calibration SVC (supervised vicarious calibration) method (Brook & Ben-Dor 2011a) is applied to extract reflectance information from hyperspectral imagery. This method is based on a mission-by-mission approach, followed by a unique vicarious calibration site. In this study, the acquired AISA-Dual and HS images were subjected to the SVC method, which includes two radiometric recalibration techniques (F1 and F2) and two atmospheric correction approaches (F3 and F4). The atmospheric correction incorporates a deshadow algorithm, which is applied to the map provided by the boresight ratio band (Brook & Ben-Dor 2011b).
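The SVC recalibration itself is detailed in Brook & Ben-Dor (2011a); as a generic illustration of the underlying idea, fitting a per-band gain and offset from targets of known reflectance (the empirical-line idea) can be sketched as follows (function and numbers are hypothetical):

```python
import numpy as np

def empirical_line(radiance, reflectance):
    """Fit per-band gain/offset mapping at-sensor values to reflectance.

    radiance:    (n_targets, n_bands) image values over calibration targets
    reflectance: (n_targets, n_bands) ground-truth reflectance of those targets
    Returns (gain, offset), each (n_bands,), so reflectance ~ gain*radiance + offset.
    """
    n_bands = radiance.shape[1]
    gain = np.empty(n_bands)
    offset = np.empty(n_bands)
    for b in range(n_bands):
        gain[b], offset[b] = np.polyfit(radiance[:, b], reflectance[:, b], 1)
    return gain, offset

# Synthetic check: 10 targets, 5 bands, known linear sensor response
rng = np.random.default_rng(2)
true_refl = rng.uniform(0.05, 0.9, (10, 5))
rad = (true_refl - 0.02) / 0.004          # inverse of gain=0.004, offset=0.02
g, o = empirical_line(rad, true_refl)
print(np.allclose(g, 0.004), np.allclose(o, 0.02))  # True True
```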

The hyperspectral reflectance images are subjected to the data processing stage, which operates in four steps (Figure 1). The first step is a general coarse classification: each "pure" pixel is assigned to a class in order to predefine the threshold of the probabilistic output of a support vector machine (SVM) algorithm, or remains unclassified (Villa et al., 2011). The unclassified pixels might be associated with mixed-spectra pixels; their classification is therefore addressed at the third stage by the unmixing method, in order to obtain the abundance fraction of each endmember class. Prior to this, a second step is applied, in which the spectral data are reduced by the selected algorithm. The input variables, in terms of absorption features, can be reduced through a sequential forward selection (SFS) algorithm (Whitney, 1971). This method starts with the inclusion of feature sets one by one to minimize the prediction error of a linear regression model and focuses on conditional exclusion based on feature significance (Pudil et al., 1994). This step is proven to enhance the overall performance of spectral models.

Fig. 1. Flow chart scheme of the classification approach for hyperspectral airborne and ground data

The nonnegative matrix factorization (NMF) was offered as an alternative method for linear unmixing (Lee et al., 2000). This algorithm searches for the source and the transform by factorizing a matrix subject to positivity constraints, based on gradient optimization and the Euclidean norm (Pauca et al., 2006; Robila & Maciak, 2006). We generated an algorithm that starts with a random linear transform of the nonnegative source data. The algorithm continuously computes scalar factors that are chosen to produce the "best" intermediate source and transform. At each step of the algorithm, the source and transform must remain positive. The final stage is a method for image segmentation combined with a Markov random field (MRF) model under a Bayesian framework (Yang & Jiang, 2003).

The validation of the thematic map is performed by comparing ground truth and image reflectance data of the selected targets. Ten well-known targets (areas of approximately 30-40 pixels) were spectrally measured (using the ASD FieldSpec Pro) and documented. The exact location of each target within the scenes was captured using an aerial orthophoto and a ground truth field survey. The confusion matrices (Tables 1 and 2) and ROC (receiver operating characteristic) curves (Table 3) were calculated by comparing the number of pixels in each class (concrete, asphalt, scuffed asphalt) against the ground truth maps. The overall accuracy was 96.8% for the Ma'alot Tarshiha images and 97.4% for the Qalansawe images. The overall accuracies of both images stand in good agreement, so it can be concluded that the performance of the suggested classification algorithm (Figure 1) is stable and accurate.

| Class | Concrete | Asphalt | Scuffed asphalt |
|---|---|---|---|
| Unclassified | 0 | 0 | 0 |
| Concrete | **96.2** | 1.7 | 2.1 |
| Asphalt | 1.1 | **98** | 0.9 |
| Scuffed asphalt | 2.8 | 0.2 | **97** |

Table 1. Confusion matrix of the Ma'alot Tarshiha image for selected classes; values are ground truth percentages. (Correspondence accuracies are in bold.)

| Class | Concrete | Asphalt | Scuffed asphalt |
|---|---|---|---|
| Unclassified | 0 | 0 | 0 |
| Concrete | **96.3** | 0.8 | 2.9 |
| Asphalt | 0 | **98.4** | 1.6 |
| Scuffed asphalt | 4.5 | 0 | **95.5** |

Table 2. Confusion matrix of the Qalansawe image for selected classes; values are ground truth percentages. (Correspondence accuracies are in bold.)

|  | Ma'alot image: Concrete | Asphalt | Scuffed asphalt | Qalansawe image: Concrete | Asphalt | Scuffed asphalt |
|---|---|---|---|---|---|---|
| DR | 0.97 | 0.97 | 0.94 | 0.98 | 0.96 | 0.93 |
| Area | 0.99 | 0.99 | 0.96 | 0.99 | 0.98 | 0.95 |

Table 3. Detection rates (DR) of concrete, asphalt and scuffed asphalt for a false alarm probability of 0.1 according to the ROC curves, and area under the curve
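The unmixing stage of the pipeline above keeps both factors nonnegative at every update; a minimal Lee-Seung sketch of that idea (synthetic endmembers and a generic implementation, not the authors' exact algorithm):

```python
import numpy as np

def nmf(V, r, n_iter=200, seed=0):
    """Minimal NMF by Lee-Seung multiplicative updates (Euclidean loss).

    V: (bands, pixels) nonnegative data matrix; r: number of endmembers.
    Returns W (bands, r) endmember-like sources and H (r, pixels)
    abundance-like coefficients, both kept nonnegative at every step.
    """
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], r)) + 1e-3
    H = rng.random((r, V.shape[1])) + 1e-3
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + 1e-12)
        W *= (V @ H.T) / (W @ H @ H.T + 1e-12)
    return W, H

# Mix two synthetic "endmember" spectra into 50 pixels, then unmix
rng = np.random.default_rng(1)
E = np.abs(rng.random((30, 2)))          # 30 bands, 2 endmembers
A = rng.dirichlet([1, 1], size=50).T     # 2 x 50 abundance fractions
V = E @ A
W, H = nmf(V, r=2)
err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(round(err, 3))
```

The relative reconstruction error drops to near zero on this exact rank-2 data, and both factors stay nonnegative throughout.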

## **2.3.2 Thermal airborne and ground imagery**

Atmospheric correction is a key processing step for extracting information from thermal infrared imagery. The ground-leaving radiance combined with temperature/emissivity separation (TES) algorithms is generated and supplied to in-scene atmospheric compensation, ISAC<sup>1</sup> (Young et al., 2002). This model requires only the calibrated, at-aperture radiance data to estimate the upwelling radiance and transmittance of the atmosphere. It is an effective atmospheric correction that produces spectra that compare favorably to the Planck function.

<sup>1</sup> The ISAC (in-scene atmospheric compensation) model is implemented in ENVI®.

The ground truth must include several targets, such as water, sand or soil, continuously measured by installed thermocouples. The generated atmospheric data cube may be used as an input to a temperature/emissivity separation algorithm (the normalized emissivity method). The proposed thermal classification method follows the same four stages of data processing (SVM probabilistic map; data reduction; unmixing; classification) applied to the preprocessed emissivity imagery (Figure 2).

Fig. 2. Flow chart scheme of the thermal airborne and ground data preprocessing

From the physical definition, the spectral characteristics of urban materials in the reflective and thermal ranges are related. Segl et al. (2003) showed that materials with high albedos in the reflective range produce low albedos in the thermal range and vice versa, due to better energy absorption in the reflective region. However, it is reported that bitumen roofing and asphalt pavement generate distinct spectral differences in the thermal wavelength range. Thermal measurements remain a compelling focus of climate research in built urban areas. Moreover, thermal airborne and ground imagery permits definition of the UHI (urban heat island, for the ground surface) and resolves streets, roofs and walls. A successful numerical model of the urban areas is acquired during night-time conditions, when solar shading is absent and turbulent interactions are minimal.

The validation of the thematic map is performed by comparing ground truth and image emissivity data. Five targets (concrete, sand lot, bitumen, tile roof and polyethylene) were measured and documented. The resulting emissivity signatures are in good agreement with the ground-truth data (two examples in Figures 3A and 3B). The results presented here confirm the robustness and stability of the suggested algorithm.

Fig. 3. Emissivity calculated from the thermal radiance. A is a tile roof and B is a bitumen roof

## **2.3.3 Airborne LiDAR data**

LiDAR data provide precise information about the geometrical properties of surfaces and can reflect the different shapes and formations in the complex urban environment. The point cloud (irregularly spaced points) was interpolated into the digital surface model (DSM) by applying the Kriging technique (Sacks et al. 1989). The Kriging model has its origins in mining and geostatistical applications involving spatially and temporally correlated data (Cressie 1993).

The surface analysis (Figure 4) is first represented as a DEM (digital elevation model) of the scanned scene, where data are separated into on-terrain and off-terrain points (Masaharu and Ohtsubo 2002). In this study, the Kriging Gaussian correlation function was utilized to visualize and illustrate the edited DEM as a surface-response function. Note that the interpolation converts the irregularly spaced LiDAR data to a self-adaptive DSM.
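Gridding the irregular point cloud into a surface model can be sketched with SciPy; a Gaussian-kernel RBF interpolator is used here as a close stand-in for Kriging with a Gaussian correlation function (the terrain function, point density and kernel scale are invented for illustration):

```python
import numpy as np
from scipy.interpolate import RBFInterpolator

# Synthetic stand-in for the LiDAR point cloud: 500 irregular returns
# over a 50 m x 50 m patch with an invented terrain height function.
rng = np.random.default_rng(0)
pts = rng.uniform(0, 50, (500, 2))                   # scattered (x, y) returns
z = 5 + 0.05 * pts[:, 0] + np.sin(pts[:, 1] / 10)    # heights in metres

# Gaussian-kernel RBF interpolation is closely related to Kriging with a
# Gaussian correlation function; epsilon sets the correlation length.
dsm_model = RBFInterpolator(pts, z, kernel="gaussian", epsilon=0.2,
                            smoothing=1e-6)

gx, gy = np.meshgrid(np.arange(0, 50, 0.5), np.arange(0, 50, 0.5))  # 0.5 m grid
dsm = dsm_model(np.column_stack([gx.ravel(), gy.ravel()])).reshape(gx.shape)
print(dsm.shape)  # (100, 100)
```

The fitted model honours the scattered returns almost exactly (the tiny smoothing term only stabilizes the solve) and evaluates on a regular 0.5 m grid.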

The DTM (digital terrain model) was created by a morphological scale-opening filter, using square structural elements (Rottensteiner et al., 2003). Then, according to the filter, the slope map is estimated. The next stage is to fragment a surface model convolved with highly heterogeneous terrain slopes into subareas with fixed slopes (Zhang et al., 2003; Shan & Sampath 2005). At this stage, the terrain is uniformly normalized and the separation between on- and off-terrain points is applicable.

Fig. 4. Flow chart scheme of the LiDAR data surface analysis

The building boundary is determined by a modified convex hull algorithm (Jarvis 1973), which classifies the cluster data into boundary (contour/edge) and non-boundary (inter-shape) points (Jarvis 1977). Separating points located on buildings from those on trees and bushes is a difficult task (Wang & Shan 2009). The common assumption is that building outlines are separated from trees in terms of size and shape. The dimensionality learning method proposed by Wang and Shan (2009) is an efficient technique for this purpose.

In relatively flat urban areas, the roads, which have the same elevation (height) as the bare surface, can be extracted by arrangement examination. Simple geometric and topological relations between streets might be used to improve the consistency of road extraction. First, the DEM data are used to obtain candidate roads, sidewalks and parking lots. Then the road model is established, based on the continuous network of points, which is used to extract information such as the centerline, edge and width of the road (Akel et al. 2003; Hinz & Baumgartner 2003; Cloude et al., 2004).

## **2.4 Data registration: Automatic and manual approaches**

The optical and thermal imagery and the LiDAR data have fundamentally different characteristics. The LiDAR data (a monochromatic NIR laser pulse) provide terrain characteristics; optical imagery (radiation reflected back from the surface at many wavelengths) provides the ability for in situ, easy, rapid and accurate assessment of many materials in the spatial/spectral/temporal domain; and thermal imagery determines the temperature and radiance signatures of urban materials and land covers. Since all these datasets (Figure 5) are crucial for the assessment and classification of the urban area, a novel method for automatic registration and data fusion is needed.

Fig. 5. Flow chart scheme of the input data and registration techniques

Data fusion techniques combine data from multiple sensors and related information from associated databases. The integrated data set achieves higher accuracy and more specific inferences than might be obtained by the use of a single sensor alone. In general, data registration is a critical preprocessing procedure in all remote-sensing applications that utilize multiple sensor inputs, including multi-sensor data fusion, temporal change detection, and data mosaicking (Moigne et al., 2002). In manual registration, the selection of control points (CPs) is usually performed by a human operator. This has proven to be inaccurate, time-consuming, and unfeasible due to data complexity, which makes it cumbersome or even impossible for the human eye to discern the suitable CPs. Therefore, researchers have focused on automating feature detection to align two or more data sets with no need for human intervention.

The automatic registration of data sets has generated extensive research interest in the fields of computer vision, medical imaging and remote sensing. Comprehensive reviews have been published by Brown (1992) and Zitova and Flusser (2003). Many proposed schemes for automatic registration employ a multi-resolution process (Viola and Wells 1997, Wu & Chung 2004, Fan et al., 2005, Zavorin & Moigne 2005, Xu & Chen 2007).

The existing automatic data-registration techniques based on spatial information fall into two categories: intensity-based and feature-based (Zitova and Flusser 2003). The feature-based technique extracts salient structures from the sensed and reference data sets by accurate feature detection and by the overlap criterion. As the relevant objects of interest (e.g., roofs) and lines (e.g., roads) are expected to be stable in time at a fixed position, the feature-based method is more suitable for multi-sensor and multi-data-set fusion, change detection and mosaicking. The method generally consists of four steps (Jensen, 2004): (1) CP extraction; (2) transformation-model determination; (3) image transformation and resampling; and (4) assessment of registration accuracy. The first step is the most complex, and its success essentially determines registration accuracy. Thus, the detection method should be able to detect the same features in all projections and in different data, regardless of the particular image/sensor/data-type deformation. Despite the achieved performance, the existing methods operate directly on gray intensity values and hence are not suited for handling multi-sensor and multi-type data sets.

The suggested algorithm is an adapted version of the four-stage AIRTop algorithm (Figure 6) (Brook & Ben-Dor 2011c). First, the significant features are extracted from all input data sets and converted to a vector format. Since the studied scene covers a large area, regions of interest (ROIs) with relatively large variations are selected. The idea of addressing the registration problem by applying a global-to-local strategy (the whole image is divided into regions of interest, each of which is treated as an image) proves to be an elegant way

Fig. 4. Flow chart scheme of the LiDAR data surface analysis

between on- and off-terrain points is applicable.

2003; Hinz & Baumgartner 2003; Cloude et al., 2004).

**2.4 Data registration: Automatic and manual approaches** 

method for automatic registration and data fusion is needed.

Fig. 5. Flow chart scheme of the input data and registration techniques

detection, and data mosaicking (Moigne et al., 2002). In manual registration, the selection of control points (CPs) is usually performed by a human operator. This has proven to be inaccurate, time-consuming, and unfeasible due to data complexity, which makes it cumbersome or even impossible for the human eye to discern the suitable CPs. Therefore, researchers have focused on automating feature detection to align two or more data sets with no need for human intervention.

The automatic registration of data sets has generated extensive research interest in the fields of computer vision, medical imaging and remote sensing. Comprehensive reviews have been published by Brown (1992) and Zitova and Flusser (2003). Many proposed schemes for automatic registration employ a multi-resolution process (Viola and Wells 1997, Wu & Chung 2004, Fan et al., 2005, Zavorin & Moigne 2005, Xu & Chen 2007).

The existing automatic data-registration techniques based on spatial information fall into two categories: intensity-based and feature-based (Zitova and Flusser 2003). The feature-based technique extracts salient structures from the sensed and reference data sets by accurate feature detection and an overlap criterion. As the relevant objects of interest (e.g., roofs) and lines (e.g., roads) are expected to be stable in time at a fixed position, the feature-based method is more suitable for multi-sensor and multi-data-set fusion, change detection and mosaicking. The method generally consists of four steps (Jensen, 2004): (1) CP extraction; (2) transformation-model determination; (3) image transformation and resampling; and (4) assessment of registration accuracy. The first step is the most complex, and its success essentially determines registration accuracy. Thus, the detection method should be able to detect the same features in all projections and in different data, regardless of the particular image/sensor/data-type deformation. Despite their achieved performance, the existing methods operate directly on gray intensity values and hence are not suited to handling multi-sensor and multi-type data sets.
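Step (2), determining the transformation model from matched CPs, is commonly a least-squares fit; the sketch below is our own minimal illustration for an affine model (function names are hypothetical, not taken from the cited works), with step (4) expressed as the root-mean-square residual at the CPs.

```python
import numpy as np

def estimate_affine(src, dst):
    """Least-squares affine transform mapping sensed CPs (src) onto
    reference CPs (dst); both are (N, 2) arrays with N >= 3."""
    src = np.asarray(src, float)
    dst = np.asarray(dst, float)
    # Design matrix [x, y, 1]; solve A @ P = dst column-wise.
    A = np.hstack([src, np.ones((len(src), 1))])
    P, *_ = np.linalg.lstsq(A, dst, rcond=None)
    return P  # shape (3, 2)

def apply_affine(P, pts):
    pts = np.asarray(pts, float)
    return np.hstack([pts, np.ones((len(pts), 1))]) @ P

def registration_rmse(P, src, dst):
    """Step (4): residual error of the fitted model at the CPs."""
    res = apply_affine(P, src) - np.asarray(dst, float)
    return float(np.sqrt((res ** 2).sum(axis=1).mean()))
```

With four or more well-distributed CPs the fit is over-determined, so the RMSE directly reports how consistent the CP pairs are with a single affine model.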

The suggested algorithm is an adapted version of the four-stage AIRTop algorithm (Figure 6) (Brook & Ben-Dor 2011c). First, the significant features are extracted from all input data sets and converted to a vector format. Since the studied scene covers a large area, regions of interest (ROIs) with relatively large variations are selected. The idea of addressing the registration problem by applying a global-to-local strategy (the whole image is divided into regions of interest, each treated as an image) proves to be an elegant way of speeding up the whole process while enhancing the accuracy of the registration procedure (Chantous et al. 2009). Thus, we expected this method to greatly reduce false alarms in the subsequent feature-extraction and CP-identification steps (Brook et al., 2011). To select the distinct areas in the vector data sets, a map of extracted features is divided into adjacent small blocks (10% × 10% of the original image pixels, with no overlap between blocks). Then, significant CP extraction is performed by applying the SURF algorithm (Brown & Lowe, 2002). First, the fast Harris corner detector (Lindeberg, 2004), which is based on an integral image, is applied. The Hessian matrix is responsible for primary image rotation using principal points identified as "interesting" potential CPs in the block. The local feature vector is formed by a combination of Haar wavelet responses, whose dominant directions are defined in relation to the principal point. If the number of interesting points tracked within a block exceeds the predefined threshold, the block is selected and considered a suitable candidate for CP detection.

Fig. 6. A flow chart describing the registration algorithm. Blue box: topology map matching. Orange box: matching process. Green box: validation and accuracy.
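The interest-point stage above relies on an integral image (summed-area table) so that box filters can be evaluated in constant time regardless of their size. A minimal sketch of that building block (our own illustration, not the AIRTop implementation):

```python
import numpy as np

def integral_image(img):
    """Summed-area table, padded with a zero row/column so that any
    rectangular sum needs only four lookups."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1))
    ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    return ii

def box_sum(ii, r0, c0, r1, c1):
    """Sum of img[r0:r1, c0:c1] in O(1), independent of box size --
    the property SURF-style box-filter Hessian responses exploit."""
    return ii[r1, c1] - ii[r0, c1] - ii[r1, c0] + ii[r0, c0]
```

SURF approximates the Gaussian second derivatives (Dxx, Dyy, Dxy) as signed combinations of a few such box sums, which is why detection cost stays flat across scales.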

The spatial distribution and relationships of these features are expressed by topology rules (one-to-one), and the features are converted to potential CPs by determining a transformation model between the sensed and reference data sets. The rules defined for a weight-based topological map-matching (tMM) algorithm (Velaga et al. 2009) manage, transform and resample the features of the sensed georeferenced LiDAR data according to a non-georeferenced image, in order to preserve the original raw geometry, dimensionality and imagery matrices (image pixel size and location).

In the proposed 3-D urban application, manual registration is used to register facade imagery and thematic maps acquired by ground sensors to the simplified building models extracted from LiDAR. This method is executed by a human operator, who identifies a set of corresponding CPs in the images and the referenced control building model. Despite the fact that manual registration has proven inaccurate and time-consuming due to data complexity, it is still the most widely used technique. We found that for the current data sets, manual registration is the easiest and most accurate solution.

## **3. 3-D urban environment model**



The urban database-driven 3-D model represents a realistic illustration of the environment that can be regularly updated with attribute details and sensor-based information. The spatial data model is a hierarchical structure (Figure 7), consisting of elements, which make up geometries, which in turn compose layers.

Fig. 7. The 3-D urban environment application's conceptual architecture
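The element → geometry → layer hierarchy of Figure 7 can be sketched with simple container types (illustrative names only, not the application's actual schema):

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Element:
    """Lowest level of the hierarchy: a single coordinate unit."""
    x: float
    y: float
    z: float

@dataclass
class Geometry:
    """A geometry is made up of elements (e.g., a roof polygon)."""
    elements: List[Element] = field(default_factory=list)

@dataclass
class Layer:
    """A layer is composed of geometries (e.g., a 'buildings' layer)."""
    name: str
    geometries: List[Geometry] = field(default_factory=list)
```

For example, a triangular roof patch becomes one `Geometry` of three `Element`s inside a `Layer("buildings")`; updating an attribute touches only its level of the hierarchy.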

A fundamental demand in non-traditional, multi-sensor and multi-type applications is spatial indexing. A spatial index, which is a logical index, provides a mechanism to limit searches based on spatial criteria (such as intersection and containment). Due to the variation in data formats and types, it is difficult to satisfy the frequent updating and extension requirements of developing urban environments.

An R-tree index is implemented on the spatial data by using Oracle's extensible indexing framework (Song et al. 2009). This index approximates each geometry with the single rectangle that minimally encloses it (the minimum bounding rectangle, MBR). A bounding volume is created around the 3-D object, which equals the bounding volume around the solid. The index is helpful in conducting very fast searches and spatial analyses over large 3-D scenes.
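The MBR primary filter at the heart of such an index can be sketched as follows (a generic illustration, not Oracle's implementation); it works unchanged for 2-D rectangles or 3-D bounding volumes:

```python
import numpy as np

def mbr(points):
    """Axis-aligned minimum bounding rectangle (or box, for 3-D input)
    of an (N, d) point set, returned as (lower corner, upper corner)."""
    pts = np.asarray(points, float)
    return pts.min(axis=0), pts.max(axis=0)

def mbrs_intersect(a, b):
    """Primary filter of an R-tree query: True when two MBRs overlap.
    Candidates passing this cheap test go on to the exact geometry test."""
    (amin, amax), (bmin, bmax) = a, b
    return bool(np.all(amin <= bmax) and np.all(bmin <= amax))
```

Because the overlap test is a handful of comparisons, most of the scene is discarded before any exact (and expensive) geometry computation is performed.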

CityGML2 is an application based on OGC's (Open Geospatial Consortium) GML 3.1. This application not only represents the graphical appearance but, in particular, takes care of the semantic properties (Kolbe et al. 2005), such as the spectral/thematic properties and model evaluations. The main advantage is the ability to maintain different levels of detail (Kolbe & Bacharach 2006). The underlying model differentiates three levels of detail, in which objects become more detailed as the level increases.

<sup>2</sup> http://www.citygml.org/




The 3-D urban application is based on an integrated data set: spectral models, ground-camera and airborne images, and LiDAR data. The system requirements are defined to include geo-spatial planning information and one-to-one topology. The conceptual architecture diagram is presented in Figure 7. As the model comprises visualization of, and interactivity with, maps and 3-D scenes, the interface includes 3-D interaction, 2-D vertical and horizontal interactions, and browsers that contain spectral/thematic temporal information. The 3-D urban application provides services such as thematic mapping and a complete quantitative review of a building and its surroundings with respect to temporal monitoring. The design of the application shows the possibilities of delivering integrated information, and thus holistic views of whole urban environments, in a freeze-frame view of the spatiotemporal domain.

The self-sufficient/self-determining levels of the integrated information contribute different parts to this global urban environmental application. The first level (Figure 8), termed "City 3-D", supplies three different products: 1) integrated imagery and LiDAR data, 2) a 3-D thematic map, and 3) a 2-D thematic map (which includes 3-D analysis layers such as terrain properties, spatial analysis, etc.).

Fig. 8. The 3-D urban environment application – Level 1 (detailed architecture)

The second level, termed "Building Model" (Figure 9), focuses on a single building in 3-D and provides two additional products: 1) integrated imagery and the building model extracted from the LiDAR data set, and 2) a 3-D thematic map for general materials classification, with quantitative thematic maps implemented by spectral models.

The most specific and localized level is the third level, termed "Spectral Model" (Figure 10). The area of interest at this level is a particular place (a patch) on the wall of the building in question. The spatial investigation at this level is a continuation of the previous level; yet the data source consists of spectral models that are evaluated against spectral in-situ point measurements. This level does not provide any integrated and rectified information, but provides geo-referencing of the results of the spectral models at a realistic 3-D scale. This level completes the database of the suggested 3-D urban environment application.


Fig. 9. The 3-D urban environment application – Level 2 (detailed architecture)

Fig. 10. The 3-D urban environment application – Level 3 (detailed architecture)

## **4. 3-D urban environment application**

The 3-D built urban environment monitoring application, up to this point, employs single processing algorithms applied to imagery or LiDAR data, without taking contextual information into account. The data fusion application must provide fully integrated information, covering both the classification products and the context within the scene. In the proposed application, a complete classification and identification task consists of subtasks, which have to operate on the material and object characteristic/shape levels provided by an accurately registered database. Moreover, the final fused and integrated application should operate on objects of different sizes and scales, such as a single building detected within the urban area or a selected region on a building facade. Multi-scale and multi-sensor data fusion is possible with the eCognition procedure (user guide eCognition, 2003), in which the substructures are archived in a hierarchical network.


The proposed application for data fusion proved able to integrate several different types of data acquired from different sensors, which additionally differ in rotation, translation, and possibly scaling. The data fusion operated by fuzzy logic is a final product of the application. This approach is an important stage not only for quality assurance and validation, but also for information fusion in current and future remote sensing systems with multi-sensor sources.

The results of the spectral/thermal classification processes are by far not only a spectral/thematic aggregation of classes converted to polygons or polylines (in vector format), but also a spatial and semantic structuring of the scene content (see the example of roof extraction in Figure 11). The resulting network of extracted and identified objects can be seen as a spatial/semantic network of the scene. The local contextual information describes the joint relationships and meaningful interactions between those objects in the built urban environment and the linked multi-scale and multi-sensor products. This hierarchy in the rule-base design allows a well-structured incorporation of knowledge.



Fig. 11. Hierarchical rule-base structure in eCognition

In fact, each object is now identified not only by its spectral, thermal, textural, morphological, topological and shape properties, but also by its unique information linkage and its actual neighbors. The data are fused through mutual dependencies within and between objects, which create a semantic network of the scene. To assure high accuracy and operational efficiency, the input products are inspected by a basic topological rule, which requires that object borders overlay the borders of objects on the next layer. Therefore, the multi-scale information, which is represented concurrently, can be related across layers.

The semantic network of fuzzy logic is an expert system that quantifies the uncertainties and variations of the input data. Fuzzy logic, as an alternative to Boolean statements, avoids arbitrary thresholds and is thus able to approximate a real-world environment (Benz et al., 2004). The implemented rules are guided by the reliability of class assignments, so a solution is always possible, even if there are contradictory assignments (Civanlar & Trussel, 1986). This logic requires a deliberate choice and parameterization of the membership function that establishes the relationship between object features and acceptable characteristics. Since this design is the most crucial step for introducing expert knowledge and information into the logic, the better and more detailed the description of the real-world environment modeled by the membership function, the better the data fusion.
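As a minimal illustration of such membership functions and threshold-free class assignment (our own sketch, not the eCognition rule base):

```python
def trapezoid(x, a, b, c, d):
    """Trapezoidal membership function: 0 outside [a, d], 1 on [b, c],
    linear in between -- no hard Boolean threshold anywhere."""
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    return (x - a) / (b - a) if x < b else (d - x) / (d - c)

def assign_class(memberships):
    """Pick the class with the highest membership; a decision is always
    possible even when several classes claim the same object."""
    return max(memberships.items(), key=lambda kv: kv[1])
```

An object's feature value (e.g., a roof's thermal emissivity) is mapped through one trapezoid per class; even contradictory, partially overlapping memberships still yield a best class, together with a degree of confidence.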

The operational system ensures that a first class hierarchy is loaded and used in the next step for data integration. Based on this preliminary fusion, the first objects of interest are created from object primitives by thematic-based fusion. The same steps are repeated until the final information (the spectral quantitative model) is applied. The registered and integrated information is accompanied by a reliability map, which is established from the primary accuracy and classification confidence of each input data set. The reliability map is important for post-processing inspection and testing routines; objects with low reliability must be assigned manually because no automatic decision is possible. The suggested application involves semi-automatic or even manual stages, which have proven to be time-consuming operations. Yet, thanks to the expert-system support, it is a time-efficient application that produces highly accurate and reliable merged information.
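One simple way to derive such a reliability map is to weight each input's per-cell classification confidence by that input's primary accuracy (a hedged sketch with assumed array shapes, not the application's actual routine):

```python
import numpy as np

def reliability_map(confidence_stack, input_accuracies, threshold=0.5):
    """confidence_stack: (L, H, W) per-input classification confidences;
    input_accuracies: length-L primary accuracies of the input data.
    Reliability is the accuracy-weighted mean confidence per cell;
    cells below the threshold are flagged for manual assignment."""
    acc = np.asarray(input_accuracies, float)
    stack = np.asarray(confidence_stack, float)
    rel = np.tensordot(acc / acc.sum(), stack, axes=1)  # (H, W)
    return rel, rel < threshold
```

The boolean mask directly drives the post-processing inspection step: only flagged cells are queued for manual assignment.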

## **5. Discussion**

44 Remote Sensing – Advanced Techniques and Platforms

The results of the spectral/thermal classification processes are far more than a spectral/thematic aggregation of classes converted to polygons or polylines (in vector format); they are also a spatial and semantic structuring of the scene content (see the roof-extraction example in Figure 11). The resulting network of extracted and identified objects can be seen as a spatial/semantic network of the scene. The local contextual information describes the joint relationships and meaningful interactions between those objects in the built urban environment and links multi-scale and multi-sensor products. This hierarchy in the rule-base design allows a well-structured incorporation of knowledge.

In fact, each object is now identified not only by its spectral, thermal, textural, morphological, topological and shape properties, but also by its unique information linkage and its actual neighbors. The data are fused through mutual dependencies within and between objects, which create a semantic network of the scene. To ensure high accuracy and operational efficiency, the input products are inspected with a basic topological rule, which requires that object borders coincide with the borders of the objects on the next layer. In this way, the multi-scale information, represented concurrently, can be related across layers.

The semantic network of fuzzy logic is an expert system that quantifies the uncertainties and variations of the input data. Fuzzy logic, as an alternative to Boolean statements, avoids arbitrary thresholds and is thus able to represent a real-world environment (Benz et al., 2004). The implemented rules are guided by the reliability of class assignments, so a solution is always possible, even if there are contradictory assignments (Civanlar & Trussel, 1986). This logic calls for a deliberate choice and parameterization of the membership functions that establish the relationship between object features and acceptable characteristics, since their design is the most crucial step.

Fig. 11. Hierarchical rule-base structure in eCognition. At the operation level, the roof object of segmentation level 1 is characterized by geometric features (3-D volume, 2-D area, 2-D diameter, Delta H between ground surface and top, 3-D shape and proportions), by the VNIR-SWIR thematic class and the thermal micro-silicates class, and by the material classes tile, bitumen and concrete.
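The fuzzy membership logic described above can be sketched in a few lines. This is a minimal illustration of trapezoidal membership functions combined with a fuzzy AND (minimum) operator, not the chapter's actual eCognition rule set; the feature names and breakpoints are hypothetical.

```python
# Minimal sketch of fuzzy (vs. Boolean) class assignment for a roof object.
# Feature names and membership breakpoints are hypothetical, for illustration only.

def trapezoid(x, a, b, c, d):
    """Trapezoidal membership: 0 below a, ramps to 1 on [b, c], 0 above d."""
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)

def roof_membership(obj):
    """Combine feature memberships with a fuzzy AND (minimum operator)."""
    mu_height = trapezoid(obj["delta_h"], 2.0, 3.0, 30.0, 40.0)       # ground-to-top height (m)
    mu_area = trapezoid(obj["area"], 20.0, 40.0, 2000.0, 4000.0)      # 2-D area (m^2)
    mu_spectral = obj["vnir_swir_score"]  # reliability of VNIR-SWIR thematic class, in [0, 1]
    return min(mu_height, mu_area, mu_spectral)

candidate = {"delta_h": 6.5, "area": 120.0, "vnir_swir_score": 0.8}
print(round(roof_membership(candidate), 2))  # -> 0.8, a graded membership rather than a hard yes/no
```

Because the result is a membership degree rather than a Boolean, contradictory evidence lowers the class reliability instead of forcing an arbitrary threshold decision.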

In this chapter, we present techniques for data fusion and data registration in several levels. Our study focused on the registration and the integration of multi-sensor and multitemporal information for a 3-D urban environment monitoring application. For that purpose, both registration models and data fusion techniques were used.

The 3-D urban application satisfies a fundamental demand for non-traditional, multi-sensor and multi-type data. The requirement for frequent updating and extension is addressed by integrating the variety of data formats and types available for a developing urban environment. The main benefit of 3-D modeling and simulation over traditional 2-D mapping and analysis is a realistic illustration that can be regularly updated with attribute details and with remote-sensor-based quantitative/thermal information and models.

The proposed application offers an advanced methodology by integrating information into a 5-D data set. The ability to include an accurate and realistic 3-D position, quantitative information, thermal properties and temporal changes provide a near-real-time monitoring system for photogrammetric and urban planning purposes. The main objectives of many studies are linked to, and rely on a historical set of remotely sensed imagery for quantitative assessment and spatial evolution of an urban environment (Jensen and Cowen 1999, Donnay et al. 2001, Herold et al. 2003, 2005). The well-known methodology is pattern observation in the spatiotemporal and spectral domains. The main objective of this research is a fully controlled, near-real-time, natural and realistic monitoring system for an urban environment. This task led us first to combine the image-processing and map-matching procedures, and then incorporate remote sensing and GIS tools into an integrative method for data fusion and registration.

The proposed data-fusion application proved able to integrate several different types of data acquired from different sensors, data that additionally differ in rotation, translation, and possibly scale. The fuzzy-logic data fusion is the final product of the application. This approach is an important stage not only for quality assurance and validation but also for information fusion in current and future multi-sensor remote sensing systems.
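Recovering the rotation, translation and scale that relate two data layers can be sketched with a standard least-squares similarity fit (Umeyama/Procrustes) over matched points. This is a generic illustration, not the chapter's SURF/topology map-matching method; the point sets below are synthetic.

```python
# Sketch: recover the 2-D similarity transform (scale, rotation, translation)
# relating two matched point sets, as needed when fusing layers that differ
# in rotation, translation and scale. Generic least-squares estimate only.
import numpy as np

def estimate_similarity(src, dst):
    """Least-squares fit of dst ~ s * R @ src + t (Umeyama/Procrustes)."""
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    A, B = src - mu_s, dst - mu_d
    # SVD of the cross-covariance gives the optimal rotation.
    U, S, Vt = np.linalg.svd(A.T @ B)
    d = np.sign(np.linalg.det(U @ Vt))     # reflection guard
    D = np.diag([1.0, d])
    R = (U @ D @ Vt).T
    s = np.trace(np.diag(S) @ D) / (A ** 2).sum()
    t = mu_d - s * R @ mu_s
    return s, R, t

rng = np.random.default_rng(0)
src = rng.normal(size=(10, 2))
theta = 0.3
R_true = np.array([[np.cos(theta), -np.sin(theta)],
                   [np.sin(theta),  np.cos(theta)]])
dst = 2.0 * src @ R_true.T + np.array([5.0, -1.0])   # scale 2, rotate, translate
s, R, t = estimate_similarity(src, dst)
print(round(s, 3))  # -> 2.0
```

With noise-free correspondences the transform is recovered exactly; with real matched features the same fit gives the least-squares registration used before resampling one layer onto the other.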

The multi-dimensionality (5-D) of the developed urban environment application provides services such as thematic and thermal mapping, and a complete quantitative review of the building and its surroundings. These services are complemented by the ability to perform accurate temporal monitoring and to observe dynamic changes (change detection). The application design shows the possibility of delivering integrated information, and thus holistic views of whole urban environments, in a freeze-frame view of the spatio-temporal domain.

## **6. Conclusion**

The suggested application may provide urban planners, civil engineers and decision makers with tools to consider quantitative spectral information and temporal investigation in the 3-D urban space. It seamlessly integrates the multi-sensor, multi-dimensional, multi-scale and multi-temporal data into a 5-D operated system. The application provides a general overview of thematic maps and a complete quantitative assessment for any building and its surroundings in a 3-D natural environment, as well as a holistic view of the urban environment.

## **7. Acknowledgment**

This research work is supported by a Discovery Grant (3-8163) from the Ministry of Science of Israel. The authors would like to express their deepest gratitude for this opportunity.

## **8. References**

Akel, N.A.; Zilberstein, O. & Doytsher, Y. (2003). Automatic DTM extraction from dense raw LIDAR data in urban areas. In: Proc. FIG Working Week, Paris, France, April 2003, 1-10.

Allen, J. & Lu, K. (2003). Modeling and prediction of future urban growth in the Charleston region of South Carolina: a GIS-based integrated approach. *Conservation Ecology*, 8(2), 202-211.

Ameri, B. (2000). Automatic recognition and 3-D reconstruction of buildings from digital imagery. Thesis (PhD), University of Stuttgart.

Benz, U.C.; Hofmann, P.; Willhauck, G.; Lingenfelder, I. & Heynen, M. (2004). Multi-resolution, object-oriented fuzzy analysis of remote sensing data for GIS-ready information. *ISPRS Journal of Photogrammetry & Remote Sensing*, 58, 239-258.

Brook, A. & Ben-Dor, E. (2011a). Advantages of boresight effect in the hyperspectral data analysis. *Remote Sensing*, 3(3), 484-502.

Brook, A. & Ben-Dor, E. (2011b). Supervised vicarious calibration of hyperspectral remote sensing data. *Remote Sensing of Environment*, 115, 1543-1555.

Brook, A. & Ben-Dor, E. (2011c). Automatic registration of airborne and space-borne images by topology map-matching with SURF. *Remote Sensing*, 3, 65-82.

Brook, A.; Ben-Dor, E. & Richter, R. (2011). Modeling and monitoring urban built environment via multi-source integrated and fused remote sensing data. *International Journal of Image and Data Fusion*, in press, 1-31.

Brown, L.G. (1992). A survey of image registration techniques. *ACM Computing Surveys*, 24, 325-376.

Brown, H. & Lowe, D. (2002). Invariant features from interest point groups. In: BMVC.

Bulmer, D. (2001). How can computer simulated visualizations of the built environment facilitate better public participation in the planning process? *On Line Planning Journal*, 1-28, http://www.onlineplaning.org

Carlson, T.N.; Dodd, J.K.; Benjamin, S.G. & Cooper, J.N. (1981). Satellite estimation of the surface energy balance, moisture availability and thermal inertia. *Journal of Applied Meteorology*, 20, 67-87.

Chan, R.; Jepson, W. & Friedman, S. (1998). Urban simulation: an innovative tool for interactive planning and consensus building. In: Proceedings of the 1998 American Planning Association National Conference, Boston, MA.

Chantous, M.; Ghosh, S. & Bayoumi, M.A. (2009). Multi-modal automatic image registration technique based on complex wavelets. In: Proceedings of the 16th IEEE International Conference on Image Processing, Cairo, Egypt, 173-176.

Civanlar, R. & Trussel, H. (1986). Constructing membership functions using statistical data. *IEEE Fuzzy Sets and Systems*, 18, 1-14.

Cloude, S.P.; Kootsookos, P.J. & Rottensteiner, F. (2004). The automatic extraction of roads from LIDAR data. In: ISPRS 2004, Istanbul, Turkey.

Conel, J.E. (1969). Infrared emissivities of silicates: experimental results and a cloudy atmosphere model of spectral emission from condensed particulate mediums. *Journal of Geophysical Research*, 74(6), 1614-1634.

Cressie, A.N.C. (1993). Statistics for spatial data, rev. ed. New York: Wiley.

Dodge, M.; Smith, A. & Fleetwood, S. (1998). Towards the virtual city: VR & internet GIS for urban planning. In: Virtual Reality and Geographical Information Systems. London: Birkbeck College.

Donnay, J.P.; Barnsley, M.J. & Longley, P.A. (2001). Remote sensing and urban analysis. In: J.P. Donnay, M.J. Barnsley and P.A. Longley, eds. *Remote sensing and urban analysis*. London and New York: Taylor and Francis, 3-18.

Fan, X.; Rhody, H. & Saber, E. (2005). Automatic registration of multi-sensor airborne imagery. In: Proceedings of the 34th Applied Imagery and Pattern Recognition Workshop, Washington DC, 80-86.

Heiden, U.; Segl, K.; Roessner, S. & Kaufmann, H. (2007). Determination of robust spectral features for identification of urban surface materials in hyperspectral remote sensing data. *Remote Sensing of Environment*, 111, 537-552.

Henry, J.A.; Dicks, S.E.; Wetterqvist, O.F. & Roguski, S.J. (1989). Comparison of satellite, ground-based, and modeling techniques for analyzing the urban heat island. *Photogrammetric Engineering and Remote Sensing*, 55, 69-76.

Herold, M.; Goldstein, N.C. & Clarke, K.C. (2003). The spatiotemporal form of urban growth: measurement, analysis and modeling. *Remote Sensing of Environment*, 86, 286-302.

Herold, M.; Couclelis, H. & Clarke, K.C. (2005). The role of spatial metrics in the analysis and modeling of land use change. *Computers, Environment and Urban Systems*, 29(4), 369-399.

Hinz, S. & Baumgartner, A. (2003). Automatic extraction of urban road networks from multi-view aerial imagery. *ISPRS Journal of Photogrammetry and Remote Sensing*, 58(1-2), 83-98.

Jarvis, R.A. (1973). On the identification of the convex hull of a finite set of points in the plane. *Information Processing Letters*, 2, 18-21.

Jarvis, R.A. (1977). Computing the shape hull of points in the plane. In: Proceedings of the IEEE Computer Society Conference Pattern Recognition and Image Processing, 231-241.

Jensen, J.R. & Cowen, D.C. (1999). Remote sensing of urban/suburban infrastructure and socio-economic attributes. *Photogrammetric Engineering and Remote Sensing*, 65(5), 611-622.

Jensen, J.R. (2004). Introductory digital image processing, 3rd ed. Upper Saddle River, NJ: Prentice Hall.

Jepson, W.H.; Liggett, R.S. & Friedman, S. (2001). An integrated environment for urban simulation. In: R.K. Brail and R.E. Klosterman, eds. Planning support systems: integrating geographic information systems, models, and visualization tools. Redlands, CA: ESRI, 387-404.

Juan, G.; Martinez, M. & Velasco, R. (2007). Hyperspectral remote sensing application for semi-urban areas monitoring. *Urban Remote Sensing Joint Event*, 11(13), 1-5.

Kidder, S.Q. & Wu, H-T. (1987). A multispectral study of the St. Louis area under snow-covered conditions using NOAA-7 AVHRR data. *Remote Sensing of Environment*, 22, 159-172.

Kolbe, T.H.; Gerhard, G. & Plümer, L. (2005). CityGML—Interoperable access to 3D city models. In: International Symposium on Geoinformation for Disaster Management GI4DM 2005, Delft, Netherlands, Lecture Notes in Computer Science, March 2005.

Kolbe, T. & Bacharach, S. (2006). CityGML: An open standard for 3D city models. *Directions Magazine ESRI*, http://directionmag.com/articles/123103

Lee, H.Y.; Park, W.; Lee, H.-K. & Kim, T.-G. (2000). Towards knowledge-based extraction of roads from 1m resolution satellite images. In: Proceedings of the IEEE Southwest Symposium on Image Analysis and Interpretation, Austin, TX, 171-176.

Li, R. & Zhou, G. (1999). Experimental study on ground point determination from high-resolution airborne and satellite imagery. In: Proceedings of the ASPRS Annual Conference, Portland, ME, 88-97.

Li, Y. (2008). Automated georeferencing. Thesis (PhD), University of Texas at Dallas.

Lindeberg, T. (2004). Feature detection with automatic scale selection. *International Journal of Computer Vision*, 30, 79-116.

Masaharu, H. & Ohtsubo, K. (2002). A filtering method of airborne laser scanner data for complex terrain. *The International Archives of Photogrammetry, Remote Sensing, and Spatial Information Sciences*, 15(3B), 165-169.

Moigne, J.L.; Campbel, W.J. & Cromp, R.F. (2002). An automated parallel image registration technique based on the correlation of wavelet features. *IEEE Transactions on Geoscience and Remote Sensing*, 40, 1849-1864.

Nichol, J.E. (1994). A GIS-based approach to microclimate monitoring in Singapore's high-rise housing estates. *Photogrammetric Engineering and Remote Sensing*, 60, 1225-1232.

Nichol, J.E. (1996). High-resolution surface temperature patterns related to urban morphology in a tropical city: a satellite-based study. *Journal of Applied Meteorology*, 35, 135-146.

Pauca, V.P.; Piper, J. & Plemmons, R.J. (2006). Nonnegative matrix factorization for spectral data analysis. *Linear Algebra and Applications*, 416(1), 29-47.

Pudil, P.; Novovicova, J. & Kittler, J. (1994). Floating search methods in feature selection. *Pattern Recognition Letters*, 15, 1119-1125.

Richards, J.A. & Jia, X. (1999). Remote sensing digital image analysis: an introduction. New York: Springer-Verlag.

Ridd, M.K. (1995). Exploring V-I-S model for urban ecosystem analysis through remote sensing. *International Journal of Remote Sensing*, 16, 993-1000.

Robila, S.A. & Maciak, L.G. (2006). Considerations on parallelizing nonnegative matrix factorization for hyperspectral data unmixing. *IEEE Geoscience and Remote Sensing Letters*, 6(1), 57-61.

Roessner, S.; Segl, K.; Heiden, U.; Munier, K. & Kaufmann, H. (1998). Application of hyperspectral DAIS data for differentiation of urban surface in the city of Dresden, Germany. In: Proceedings 1st EARSeL Workshop on Imaging Spectroscopy, Zurich, 463-472.

Roessner, S.; Segl, K.; Heiden, U. & Kaufmann, H. (2001). Automated differentiation of urban surfaces based on airborne hyperspectral imagery. *IEEE Transactions on Geoscience and Remote Sensing*, 39(7), 1525-1532.

Roth, M.; Oke, T.R. & Emery, W.J. (1989). Satellite-derived urban heat islands from three coastal cities and the utilization of such data in urban climatology. *International Journal of Remote Sensing*, 10, 1699-1720.

Rottensteiner, F.; Trinder, J.; Clode, S. & Kubic, K. (2003). Building detection using LIDAR data and multispectral images. In: Proceedings of DICTA, Sydney, Australia, 673-682.

Sacks, J.; Welch, W.J.; Mitchell, T.J. & Wynn, H.P. (1989). Design and analysis of computer experiments. *Statistical Science*, 4(4), 409-435.

Shan, J. & Sampath, A. (2005). Urban DEM generation from raw LIDAR data: a labeling algorithm and its performance. *Photogrammetric Engineering and Remote Sensing*, 71(2), 217-226.

Song, Y.; Wang, H.; Hamilton, A. & Arayici, Y. (2009). Producing 3D applications for urban planning by integrating 3D scanned building data with geo-spatial data. Protocol. Research Institute for the Built and Human Environment (BuHu), University of Salford, UK.

Tao, V. (2001). Database-guided automatic inspection of vertically structured transportation objects from mobile mapping image sequences. In: *ISPRS Press*, 1401-1409.

UserGuide eCognition (2003). Website: www.definiens\_imaging.com.

Velaga, N.R.; Quddus, M.A. & Bristow, A.L. (2009). Developing an enhanced weight-based topological map-matching algorithm for intelligent transport systems. *Transportation Research Part C: Emerging Technologies*, 17, 672-683.

Villa, A.; Chanussot, J.; Benediktsson, J.A. & Jutten, C. (2011). Spectral unmixing for the classification of hyperspectral images at a finer spatial resolution. *IEEE Selected Topics in Signal Processing*, 5(3), 521-533.

Viola, P. & Wells, W.M. (1997). Alignment by maximization of mutual information. *International Journal of Computer Vision*, 24, 137-154.

Vukovich, F.M. (1983). An analysis of the ground temperature and reflectivity pattern about St. Louis, Missouri, using HCMM satellite data. *Journal of Climate and Applied Meteorology*, 22, 560-571.

Wang, Y. (2008). A further discussion of 3D building reconstruction and roof reconstruction based on airborne LiDAR data by VEPS' partner, the Department of Remote Sensing and Land Information Systems, Freiburg.

Wang, J. & Shan, J. (2009). Segmentation of LiDAR point clouds for building extraction. In: ASPRS 2009 Annual Conference, Baltimore, MD.

Whitney, A.W. (1971). A direct method of nonparametric measurement selection. *IEEE Trans. Computers*, 20(9), 1100-1103.

Wu, J. & Chung, A. (2004). Multimodal brain image registration based on wavelet transform using SAD and MI. In: Proceedings of the 2nd International Workshop on Medical Imaging and Augmented Reality, Beijing, China.

Xu, R. & Chen, Y. (2007). Wavelet-based multiresolution medical image registration strategy combining mutual information with spatial information. *International Journal of Innovative Computing, Information and Control*, 3, 285-296.

Yang, F. & Jiang, T. (2003). Pixon-based image segmentation with Markov random fields. *IEEE Transactions on Image Processing*, 12, 1552-1559.

Young, S.J.; Johnson, R.B. & Hackwell, J.A. (2002). An in-scene method for atmospheric compensation of thermal hyperspectral data. *Journal of Geophysical Research*, 107, 20-28.

Zavorin, I. & Le Moigne, J. (2005). Use of multiresolution wavelet feature pyramids for automatic registration of multisensor imagery. *IEEE Transactions on Image Processing*, 14, 770-782.

Zhang, K.; Chen, S.; Whitman, D.; Shyu, M.; Yan, J. & Zhang, C. (2003). A progressive morphological filter for removing non-ground measurements from airborne LIDAR data. *IEEE Transactions on Geoscience and Remote Sensing*, 41(4), 872-882.

Zhou, G. (2004). Urban 3D GIS from LiDAR and digital aerial images. *Computers and Geosciences*, 30, 345-353.

Zitova, B. & Flusser, J. (2003). Image registration methods: a survey. *Image and Vision Computing*, 21, 977-1000.

## **Statistical Properties of Surface Slopes via Remote Sensing**

Josué Álvarez-Borrego1 and Beatriz Martín-Atienza2

*1CICESE, División de Física Aplicada, Departamento de Óptica 2Facultad de Ciencias Marinas, UABC México*

## **1. Introduction**

The complexity of wave motion, which in deep waters can damage marine platforms and vessels and in shallow waters can afflict human settlements and recreational areas, has given rise to a long history of laboratory and field studies, whose conclusions are used to design methodologies and lay the groundwork for understanding wave motion behavior.

In remote sensing, both radar images and the optical processing of aerial photographs have been used. The interest in wave data is manifold; one element is the inherent interest in the directional spectra of waves and how they influence the marine environment and the coastline. These wave data can be readily and accurately collected from aerial photographs of the wave sun-glint patterns, which show reflections of the Sun and sky light from the water and thus offer high-contrast wave images.

In a series of articles, Cox and Munk (1954a, 1954b, 1955) studied the distribution of intensity or glitter pattern in aerial photographs of the sea. One of their conclusions was that for constant and moderate wind speed, the probability density function of the slopes is approximately Gaussian. This could be taken as an indication that in certain circumstances, the ocean surface could be modeled as a Gaussian random process. Similar observations by Longuet-Higgins et al. (1963) (cited by Longuet-Higgins (1962)) with a floating buoy, which filters out the high-frequency components, come considerably closer to the Gaussian distribution.

Other authors (Stilwell, 1969; Stilwell & Pilon, 1974) have studied the same problem considering a sea surface illuminated by a continuous sky light with no azimuthal variations in sky radiance. Different models of sky light have been used emphasizing the existence of a nonlinear relationship between the slope spectrum and the corresponding wave image spectrum (Peppers & Ostrem, 1978; Chapman & Irani, 1981).

Simulated sea surfaces have been analyzed by optical systems to understand the optical technique and to obtain the best qualitative information about the spectrum (Álvarez-Borrego, 1987; Álvarez-Borrego & Machado, 1985).


Fuks and Charnotskii (2006) derived the joint probability density function of surface height and partial second derivatives for an ensemble of specular points at a random rough Gaussian isotropic surface at normal incidence. However, in a real physical situation, consideration of Gaussian statistics can be a very good approximation.

Cox and Munk (1956) observed that the center of the glitter pattern images had shifted downwind from the grid center. This shift can be associated with an up/downwind asymmetry of the wave profile (Munk, 2009). Surfaces of small positive slope are more probable than those of negative slope; large positive slopes are less probable than larger negative slopes, thus permitting the restraint of a zero mean slope (Bréon & Henrist, 2006).

According to Longuet-Higgins (1963), the sea surface slopes have a Gaussian probability density function to a first approximation. In the next approximation the skewness is taken into account, while the kurtosis and all higher cumulants are zero. In the approximation after that, the kurtosis is also taken into account.
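These successive approximations correspond to a Gram-Charlier expansion of the slope density. As a sketch (the coefficients λ3 and λ4 for the skewness and kurtosis are our notation, not necessarily the chapter's):

$$
p(s) \approx \frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-s^2/2\sigma^2}\left[1 + \frac{\lambda\_3}{6}H\_3\!\left(\frac{s}{\sigma}\right) + \frac{\lambda\_4}{24}H\_4\!\left(\frac{s}{\sigma}\right)\right],
$$

where H3(u) = u^3 - 3u and H4(u) = u^4 - 6u^2 + 3 are Hermite polynomials. Setting λ3 = λ4 = 0 recovers the Gaussian first approximation; keeping only the λ3 term gives the skewness-only approximation.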

Walter Munk (2009) writes that the skewness appears to be correlated with a rather sudden onset of breaking for winds above 4 m s-1, and he does not think that the skewness comes from parasitic capillaries. Chapron et al. (2002) suggest that the actual wave forms under near-breaking conditions, along with the varying population and length scales of these breaking events, should also contribute to the skewness.

In this chapter we consider two different cases to analyze the statistical properties of surface slopes via remote sensing: in the first we assume the fluctuations of the surface slopes to be statistically Gaussian, and in the second we assume them to be statistically non-Gaussian. We also assume that the surfaces are illuminated by a source, the Sun, of fixed angular extent β, and imaged through a lens that subtends a very small solid angle. With these considerations, we calculate their images as they would be formed by a signal-clipping detector. To do this, we define a "glitter function", which operates on the slope of the surfaces. In the first case we consider two situations: the detector line-of-sight angle, θ_d, constant for each point on the surface, and θ_d variable for each point on the surface. In the second case, with non-Gaussian statistics, we consider only θ_d variable for each point on the surface, because we consider this case more realistic.

## **2. Geometry of the model (Gaussian case considering a constant detector angle)**

The physical situation is shown in figure 1. The surface is illuminated by a uniform incoherent source S of limited angular extent β, with wavelength λ. Its image is formed in D by an aberration-free optical system. The incidence angle, θ_s, is defined as the angle between the incidence direction and the normal to the mean surface. Then, in figure 1, θ_s represents the mean angle subtended by the source S, and θ_d represents the mean angle subtended by the optical system of the detector with the normal to the mean surface.

The apparent diameter of the source is β and that of the detector is β_d. Light from the source is reflected on the surface just once and, depending on the slope, the reflected light will or will not be part of the image. In broad terms, the image consists of bright and dark regions that we call a glitter pattern.

Fig. 1. The detector is located in the zenith of each reflection point in the profile.

Here α represents the angle between the x axis and the surface, so θ_s − α is the angle between the normal to the surface and the source S. Since the specularly reflected ray makes the same angle with the normal on its other side, the specular angle θ_r satisfies

$$
\theta_r = \theta_s - 2\alpha \; . \tag{1}
$$

Because the source has a finite size, there are several incidence directions which are specularly reflected to the camera. These directions, $\theta_{os}$ (where $\beta$ is the angular dimension of the Sun), are determined by the condition

$$
\theta\_s - \frac{\beta}{2} \le \theta\_{\rm os} \le \theta\_s + \frac{\beta}{2},
\tag{2}
$$

In other words, the source is angularly described by the function $\sigma\left(\theta_{os}\right)$, which can be written as

$$\sigma\left(\theta\_{\rm os}\right) = \text{rect}\left[\frac{\theta\_{\rm os} - \theta\_{\rm s}}{\beta}\right],\tag{3}$$

where rect(.) represents the rectangle function (Gaskill, 1978).

So, the projection of this source on the detector, after reflection, is given by

$$
\theta_s - \frac{\beta}{2} - 2\alpha \le \theta \le \theta_s + \frac{\beta}{2} - 2\alpha \,, \tag{4}
$$

$$\sigma\_R\left(\theta\right) = \text{rect}\left(\frac{\theta - \theta\_r}{\beta}\right),\tag{5}$$

where equation (1) is taken into account.

Statistical Properties of Surface Slopes via Remote Sensing 55


On the other hand, the detection system pupil can be represented by the function

$$P(\theta) = \text{rect}\left(\frac{\theta - \theta\_d}{\delta d}\right). \tag{6}$$

The light intensity $I$ arriving at the detection plane D depends on the overlap between the functions $\sigma_R$ and $P$, and can be approximated by

$$
I = \int_{-\pi/2}^{\pi/2} \sigma_R(\theta)\, P(\theta)\, d\theta \,. \tag{7}
$$

In practical situations $\delta d$ is much smaller than $\beta$, so that we can approximate $P(\theta) \approx \delta\left(\theta - \theta_d\right)$, where $\delta$ is the Dirac delta; in this way

$$
\begin{aligned} I & \approx \sigma_R\left(\theta_d\right) \\ & \approx \operatorname{rect}\left(\frac{\theta_d - \theta_r}{\beta}\right). \end{aligned} \tag{8}
$$

The reflected light will arrive at the detector D when

$$
\theta\_r - \frac{\beta}{2} \le \theta\_d \le \theta\_r + \frac{\beta}{2},
\tag{9}
$$

and because $\theta_r = \theta_s - 2\alpha$, we have

$$
\frac{\theta\_s - \theta\_d}{2} - \frac{\beta}{4} \le \alpha \le \frac{\theta\_s - \theta\_d}{2} + \frac{\beta}{4}.\tag{10}
$$

Defining $\Pi = \tan\alpha$ and $\Pi_o = \tan\left[\left(\theta_s - \theta_d\right)/2\right]$, and using the relationship $\tan\left(\alpha \pm \beta/4\right) \approx \tan\alpha \pm \left(1 + \tan^2\alpha\right)\beta/4$, valid for small $\beta/4$, we obtain the following condition for the slopes

$$
\Pi_o - \left(1 + \Pi_o^2\right)\frac{\beta}{4} \le \Pi \le \Pi_o + \left(1 + \Pi_o^2\right)\frac{\beta}{4}. \tag{11}
$$

We then find the "glitter function", given by

$$B(\Pi) = \operatorname{rect}\left[\frac{\Pi - \Pi_o}{\left(1 + \Pi_o^2\right)\frac{\beta}{2}}\right]. \tag{12}$$

This expression (eq. 12) tells us that the geometry of the problem selects a surface slope region and encodes it as bright points in the image (the glitter pattern).
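The selection mechanism of equation (12) is easy to simulate. The following sketch (ours, not from the chapter; the slope variance and angles are assumed illustrative values) draws Gaussian random slopes and applies the glitter function to them, producing a binary glitter pattern whose bright fraction estimates the mean image of section 2.1.

```python
import numpy as np

def glitter_function(slope, theta_s, theta_d, beta):
    """Eq. (12): 1 where the slope falls inside the specular band, 0 elsewhere."""
    pi_o = np.tan((theta_s - theta_d) / 2.0)      # center of the specular band
    half_width = (1.0 + pi_o**2) * beta / 4.0     # half-width of the rect
    return (np.abs(slope - pi_o) <= half_width).astype(float)

rng = np.random.default_rng(0)
sigma_slope = np.sqrt(0.03)                       # assumed slope std dev (sigma_Pi)
slopes = rng.normal(0.0, sigma_slope, size=100_000)  # Gaussian surface slopes
theta_s = np.deg2rad(20.0)                        # assumed source incidence angle
beta = np.deg2rad(0.68)                           # angular diameter of the Sun

image = glitter_function(slopes, theta_s, 0.0, beta)
print(f"fraction of bright points (mean image) = {image.mean():.4f}")
```

Only a small fraction of slopes falls inside the band, which is why glitter images are mostly dark with sparse bright points.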


#### **2.1 Relationship among the variances of the intensities in the image, surface slopes and surface heights**

The mean of the image, $\mu_I$, may be written as (Papoulis, 1981)

$$\mu_I = \left\langle I(\mathbf{x})\right\rangle = \int_{-\infty}^{+\infty} B(\Pi)\, p(\Pi)\, d\Pi, \tag{13}$$

where $B$ is defined by equation (12) and $p$ is the one-dimensional probability density function, for which a Gaussian function is considered as a first approximation. Substituting into equation (13) the expressions for $B$ and $p$, we have

$$\mu_I = \left\langle I(\mathbf{x})\right\rangle = \frac{1}{\sigma_{\Pi}\left(2\pi\right)^{1/2}} \int_{-\infty}^{+\infty} \operatorname{rect}\left[\frac{\Pi - \Pi_o}{\left(1 + \Pi_o^2\right)\frac{\beta}{2}}\right] \exp\left(-\frac{\Pi^2}{2\sigma_{\Pi}^2}\right) d\Pi. \tag{14}$$

Defining $a = \Pi_o - \left(1 + \Pi_o^2\right)\beta/4$ and $b = \Pi_o + \left(1 + \Pi_o^2\right)\beta/4$, we can write

$$\mu_I = \left\langle I(\mathbf{x})\right\rangle = \frac{1}{2}\left[\operatorname{erf}\left(\frac{b}{\sqrt{2}\sigma_{\Pi}}\right) - \operatorname{erf}\left(\frac{a}{\sqrt{2}\sigma_{\Pi}}\right)\right]. \tag{15}$$

The variance of the intensities in the image, $\sigma_I^2$, is defined by (Papoulis, 1981)

$$
\sigma_I^2 = \left\langle I^2\left(\mathbf{x}\right)\right\rangle - \left\langle I\left(\mathbf{x}\right)\right\rangle^2 = \int_{-\infty}^{+\infty}\left[B\left(\Pi\right) - \mu_I\right]^2 p(\Pi)\, d\Pi. \tag{16}
$$

But $B^2(\Pi) = B(\Pi)$, since the glitter function takes only the values 0 and 1, so $\left\langle I^2(\mathbf{x})\right\rangle = \left\langle I(\mathbf{x})\right\rangle$; therefore

$$
\sigma\_I^2 = \left< I(\mathbf{x}) \right> - \left< I(\mathbf{x}) \right>^2 = \mu\_I (1 - \mu\_I), \tag{17}
$$

and substituting the expression for $\mu_I$, equation (15), into equation (17), we have

$$
\sigma_I^2 = \frac{1}{2}\left[\operatorname{erf}\left(\frac{b}{\sqrt{2}\sigma_{\Pi}}\right) - \operatorname{erf}\left(\frac{a}{\sqrt{2}\sigma_{\Pi}}\right)\right] - \left(\frac{1}{2}\left[\operatorname{erf}\left(\frac{b}{\sqrt{2}\sigma_{\Pi}}\right) - \operatorname{erf}\left(\frac{a}{\sqrt{2}\sigma_{\Pi}}\right)\right]\right)^2, \tag{18}
$$

which is the required relation between the variance of the intensities in the image, $\sigma_I^2$, and the variance of the surface slopes, $\sigma_\Pi^2$.
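Equation (18) can be evaluated directly with the error function. A minimal sketch (ours; it assumes $\theta_d = 0$ and $\beta = 0.68^o$, the geometry used in figure 2):

```python
from math import erf, tan, radians, sqrt

def image_variance(sigma_slope2, theta_s_deg, theta_d_deg=0.0, beta_deg=0.68):
    """Eq. (18): variance of the image intensities from the slope variance."""
    pi_o = tan(radians(theta_s_deg - theta_d_deg) / 2.0)   # Pi_o = tan[(theta_s - theta_d)/2]
    beta = radians(beta_deg)
    a = pi_o - (1.0 + pi_o**2) * beta / 4.0                # band limits of the glitter rect
    b = pi_o + (1.0 + pi_o**2) * beta / 4.0
    s = sqrt(sigma_slope2)
    mu_I = 0.5 * (erf(b / (sqrt(2) * s)) - erf(a / (sqrt(2) * s)))  # eq. (15)
    return mu_I * (1.0 - mu_I)                                      # eq. (17)

for theta_s in (10, 30, 50):
    print(theta_s, image_variance(0.03, theta_s))
```

For $\sigma_\Pi^2 = 0.03$ the printed values should reproduce those listed in Table 1 below.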

The relation (18) is shown in figure 2 for some typical cases, using the geometry described above, with $\theta_d = 0^o$ and $\beta = 0.68^o$. On the horizontal axis we have the variance of the surface slopes, $\sigma_\Pi^2$, and on the vertical axis the variance of the intensities of the image, $\sigma_I^2$. In the figure we can observe the dependence of this relationship on the angular position of the source, $\theta_s$. In figure 2 we can also observe that for small incidence angles (0-10 degrees) and small values of the variance of the surface slopes, it is possible to obtain larger values of the variance of the intensities in the image. From equation (18), we can see that this behavior is


independent of any surface height power spectrum that we are analyzing, because this relation depends on the probability density function of the surface slopes and the geometry of the experiment only.

Fig. 2. Relationship between the variance of the surface slopes and the variance of the intensities in the image.

In certain cases (figure 2), if we have data corresponding to a single $\theta_s$ value, it is not possible to obtain the variance of the surface slopes, $\sigma_\Pi^2$, because for one value of $\sigma_I^2$ we will have two possible values of $\sigma_\Pi^2$. To solve this problem, it is necessary to analyze images which correspond to two or more incidence angles and to select a slope variance value which is consistent with all these data.
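This multi-angle selection can be sketched numerically. In the illustration below (ours; the "measurements" are synthetic, generated from equation (18) itself with $\theta_d = 0$), a grid of candidate slope variances is scanned and the one consistent with both incidence angles is kept.

```python
import numpy as np
from math import erf, tan, radians, sqrt

def predicted_sigma_I2(sigma_pi2, theta_s_deg, beta_deg=0.68):
    """Forward model, eq. (18), with theta_d = 0."""
    pi_o = tan(radians(theta_s_deg) / 2.0)
    beta = radians(beta_deg)
    a = pi_o - (1 + pi_o**2) * beta / 4
    b = pi_o + (1 + pi_o**2) * beta / 4
    s = sqrt(sigma_pi2)
    mu = 0.5 * (erf(b / (sqrt(2) * s)) - erf(a / (sqrt(2) * s)))
    return mu * (1 - mu)

angles = (20.0, 40.0)                       # two assumed incidence angles
true_s2 = 0.03                              # slope variance used to fake the data
measured = [predicted_sigma_I2(true_s2, th) for th in angles]

grid = np.linspace(0.001, 0.2, 2000)        # candidate slope variances
cost = [sum((predicted_sigma_I2(s2, th) - m) ** 2 for th, m in zip(angles, measured))
        for s2 in grid]
best = grid[int(np.argmin(cost))]
print(best)
```

A single angle would leave two candidate variances with nearly equal cost; the second angle breaks the tie.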

The relationship between the surface slopes and the surface heights can be derived from (Papoulis, 1981)

$$\mathcal{C}\_{\Pi} \left( \tau \right) = -\frac{d^2 \mathcal{C}\_{\zeta} \left( \tau \right)}{d \tau^2},\tag{19}$$

if we know the correlation function of the surface heights (this will be shown in the next section of this chapter). Here, $C_\zeta$ is the correlation function of the surface heights and $C_\Pi$ is the correlation function of the surface slopes.
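Equation (19) can be checked numerically by differentiating an assumed height-correlation model twice. The sketch below (ours) uses a hypothetical Gaussian correlation $C_\zeta(\tau) = \sigma_\zeta^2 \exp\left(-\tau^2/L^2\right)$, for which the slope variance is analytically $C_\Pi(0) = 2\sigma_\zeta^2/L^2$.

```python
import numpy as np

# Hypothetical Gaussian height-correlation model (assumed values)
sig2_height, L = 1.0e-4, 2.0            # height variance (m^2) and correlation length (m)
tau = np.linspace(-10.0, 10.0, 4001)    # lag axis (m); index 2000 is tau = 0
C_zeta = sig2_height * np.exp(-(tau / L) ** 2)

# Eq. (19): slope correlation is minus the second derivative of the height correlation
dtau = tau[1] - tau[0]
C_pi = -np.gradient(np.gradient(C_zeta, dtau), dtau)

# The slope variance is C_Pi at tau = 0; analytically 2 * sig2 / L**2 for this model
sigma_pi2 = C_pi[len(tau) // 2]
print(sigma_pi2, 2 * sig2_height / L**2)
```

The finite-difference value agrees with the analytic one, which is the consistency check one would also apply to measured correlation data before inverting it.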

#### **2.2 Relationship between the correlation function of the intensities in the image and of the surface heights**

Our analysis involves three random processes: the surface profile, $\zeta(x)$, its surface slopes, $\Pi(x)$, and the image, $I(x)$. Each process has a correlation function, and it was shown (Álvarez-Borrego, 1993) that these three functions hold a relationship.


The relationship between the correlation function of the surface heights, $C_\zeta$, and that of the surface slopes, $C_\Pi$, is given by equation (19), and the relationship between $C_\Pi$ and the correlation function of the intensities in the image, $C_I$, is given by (Álvarez-Borrego, 1993)

$$\sigma_I^2 C_I(\tau) = \int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} \frac{B(\Pi_1)B(\Pi_2)}{2\pi\sigma_{\Pi}^2\left[1 - C_{\Pi}^2(\tau)\right]^{1/2}} \exp\left[-\frac{\Pi_1^2 + \Pi_2^2 - 2C_{\Pi}(\tau)\Pi_1\Pi_2}{2\sigma_{\Pi}^2\left[1 - C_{\Pi}^2(\tau)\right]}\right] d\Pi_1\, d\Pi_2. \tag{20}$$

In order to achieve the inverse process, using equation (19) and equation (20), these two equations must meet certain conditions. For example, it is required that there exists a one to one correspondence among the quantities involved.

Using equation (19), the processed data can be numerically integrated twice, such that we obtain information on the correlation function of the surface heights, $C_\zeta$, from the correlation function of the surface slopes, $C_\Pi$. Equation (20) is a more complicated expression, and we cannot obtain a fully analytical result from it. The first integral can be solved analytically, and for the second it is possible to obtain the solution by numerical integration. Solving the first integral analytically, equation (20) can be written as

$$\sigma_I^2 C_I(\tau) = \frac{\sqrt{2}}{4\sigma_{\Pi}\sqrt{\pi}} \int_a^b \exp\left[-\frac{\Pi_2^2}{2\sigma_{\Pi}^2}\right]\left\{\operatorname{erf}\left(\frac{b - C_{\Pi}(\tau)\Pi_2}{\sqrt{2\sigma_{\Pi}^2\left[1 - C_{\Pi}^2(\tau)\right]}}\right) - \operatorname{erf}\left(\frac{a - C_{\Pi}(\tau)\Pi_2}{\sqrt{2\sigma_{\Pi}^2\left[1 - C_{\Pi}^2(\tau)\right]}}\right)\right\} d\Pi_2, \tag{21}$$

where $a = \Pi_o - \left(1 + \Pi_o^2\right)\beta/4$ and $b = \Pi_o + \left(1 + \Pi_o^2\right)\beta/4$.
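Equation (21) lends itself to simple one-dimensional numerical integration. The sketch below (ours; $\theta_d = 0$, $\beta = 0.68^o$ and $\sigma_\Pi^2 = 0.03$ are assumed, as in figure 3) evaluates $\sigma_I^2 C_I(\tau)$ for a given slope-correlation value; for $C_\Pi \to 0$ it tends to $\mu_I^2$, and near $C_\Pi \to 1$ it approaches the non-centered second moment, as one would expect.

```python
import numpy as np
from math import erf, sqrt, pi, tan, radians

def sigmaI2_CI(c_pi, sigma_pi2, theta_s_deg, beta_deg=0.68, n=2001):
    """Numerical evaluation of eq. (21): sigma_I^2 * C_I(tau) as a function of
    the slope-correlation value c_pi = C_Pi(tau); theta_d = 0 is assumed."""
    pi_o = tan(radians(theta_s_deg) / 2.0)
    beta = radians(beta_deg)
    a = pi_o - (1.0 + pi_o**2) * beta / 4.0
    b = pi_o + (1.0 + pi_o**2) * beta / 4.0
    w = sqrt(2.0 * sigma_pi2 * (1.0 - c_pi**2))   # width of the conditional Gaussian
    x = np.linspace(a, b, n)                      # B(Pi_2) confines the integral to [a, b]
    f = np.exp(-x**2 / (2.0 * sigma_pi2)) * np.array(
        [erf((b - c_pi * t) / w) - erf((a - c_pi * t) / w) for t in x])
    integral = float(np.sum((f[1:] + f[:-1]) * np.diff(x)) / 2.0)  # trapezoid rule
    return sqrt(2.0) / (4.0 * sqrt(sigma_pi2) * sqrt(pi)) * integral

print(sigmaI2_CI(0.0, 0.03, 10.0), sigmaI2_CI(0.9, 0.03, 10.0))
```

The value grows monotonically with $C_\Pi(\tau)$ for this geometry, which is what allows the inverse mapping from image correlations to slope correlations.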

So, a relationship between the values of the correlation function of the intensities in the image, $C_I$, and the values of the correlation function of the surface slopes, $C_\Pi$, can be obtained (Figure 3). In this case, for small angles we find higher values of the correlation function of the intensities in the image. In all the cases, the angular position of the camera or detector, $\theta_d$, is zero and $\sigma_\Pi^2 = 0.03$. The correlation functions of figure 3 are normalized.

Also, from equation (19), it is possible to obtain the correlation function of the surface heights, $C_\zeta$, from $C_\Pi$, and the required inverse process to determine the correlation function of the surface heights is completed.

A theoretical variance $\sigma_I^2$ can be calculated from equation (21). We list in Table 1 the values of the image variance used to normalize the correlations in figure 3 for different values of $\theta_s$.


| $\theta_s$ (degrees) | $\sigma_\Pi^2$ | $\sigma_I^2$ |
|---|---|---|
| 10 | 0.03 | 0.0119734700 |
| 20 | 0.03 | 0.0083223130 |
| 30 | 0.03 | 0.0044081650 |
| 40 | 0.03 | 0.0016988780 |
| 50 | 0.03 | 0.0004438386 |

Table 1. Values of the image variance used to normalize the correlations in figure 3 for different values of $\theta_s$.



Fig. 3. Relationship between the correlation function of the surface slopes and the correlation function of the intensities in the image.

### **3. Geometry of the model (Gaussian case considering a variable detector angle)**

A more realistic physical situation is shown in figure 4. The surface, $\zeta(x)$, is illuminated by a uniform incoherent source S of limited angular extent, with wavelength $\lambda$. Its image is formed in D by an aberration-free optical system. The incidence angle $\theta_s$ is defined as the angle between the incidence angle direction and the normal to the mean surface and represents the mean angle subtended by the source S. $\left(\theta_d\right)_i$ corresponds to the angle subtended by the optical system of the detector with the normal to point $i$ of the surface, i.e.

$$\left(\theta_d\right)_i = \tan^{-1}\left(\frac{i\,\Delta x}{H}\right), \tag{22}$$

where H is the height of the detector and $\Delta x$ is the interval between surface points. We can see that in this more realistic physical situation the angle $\left(\theta_d\right)_i$ changes with respect to each point in the surface. It is worth noticing that a variable $\theta_d$ does not restrict the sensor field of view.

$\alpha_i$ is the angle subtended between the normal to the mean surface and the normal to the slope at each point $i$ of the surface:

$$\alpha\_i = \frac{\theta\_s + \left(\theta\_d\right)\_i}{2} = \frac{\theta\_s}{2} + \frac{1}{2}\tan^{-1}\left(\frac{i\Delta x}{H}\right). \tag{23}$$

The apparent diameter of the source is $\beta$. Light from the source is reflected on the surface just one time and, depending on the slope, the reflected light will or will not be part of the image. Thus, the image consists of bright and dark regions that we call a glitter pattern.
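Equations (22) and (23) can be tabulated per surface point. A minimal sketch, assuming a hypothetical detector height H and sample spacing $\Delta x$:

```python
import numpy as np

# Sketch of eqs. (22)-(23): per-point detector angle and specular half-angle
H, dx = 100.0, 0.5                      # assumed detector height (m) and point spacing (m)
i = np.arange(-200, 201)                # surface sample indices around the nadir point
theta_d_i = np.arctan(i * dx / H)       # eq. (22): detector angle at each point
theta_s = np.deg2rad(20.0)              # assumed source incidence angle
alpha_i = 0.5 * (theta_s + theta_d_i)   # eq. (23): specular half-angle at each point

print(np.rad2deg(theta_d_i).min(), np.rad2deg(theta_d_i).max())
```

At the nadir point ($i = 0$) the detector angle vanishes and $\alpha_i$ reduces to $\theta_s/2$, the constant-angle case of section 2.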

Fig. 4. Geometry of the real physical situation. Counterclockwise angles are considered as positive and clockwise angles as negative.

The glitter function can be expressed as (Álvarez-Borrego & Martín-Atienza, 2010)

$$B\left(\Pi_i\right) = \operatorname{rect}\left[\frac{\Pi_i - \Pi_{oi}}{\left(1 + \Pi_{oi}^2\right)\frac{\beta}{2}}\right], \tag{24}$$

58 Remote Sensing – Advanced Techniques and Platforms

where


$$
\Pi_{oi} - \left(1 + \Pi_{oi}^2\right)\frac{\beta}{4} \le \Pi_i \le \Pi_{oi} + \left(1 + \Pi_{oi}^2\right)\frac{\beta}{4}, \tag{25}
$$

$$\Pi_i = \tan\left(\alpha_i\right), \tag{26}$$

$$\Pi\_{oi} = \tan\left[\frac{\theta\_s + \left(\theta\_d\right)\_i}{2}\right].\tag{27}$$

The interval characterized by equation (25) defines a specular band where certain slopes generate bright spots in the image. This band now has a nonlinear slope due to the variation of $\left(\theta_d\right)_i$ with respect to each point $i$ of the surface (Figure 5). Combining equations (25)-(27), the slope interval where a bright spot is received by the detector is

$$\frac{\theta\_s}{2} + \frac{1}{2} \tan^{-1} \left( \frac{i \Delta \mathbf{x}}{H} \right) - \frac{\beta}{4} \le \alpha\_i \le \frac{\theta\_s}{2} + \frac{1}{2} \tan^{-1} \left( \frac{i \Delta \mathbf{x}}{H} \right) + \frac{\beta}{4}. \tag{28}$$

#### **3.1 Relationships among the variances of the intensities in the image and surface slopes**

The mean of the image $\mu_I$ may be written as (Álvarez-Borrego & Martín-Atienza, 2010)



Fig. 5. All the random processes involved in our analysis. The specular band corresponds to bright regions in the image.

$$\mu_I = \left\langle I(\mathbf{x})\right\rangle = \frac{1}{N}\sum_{i=1}^{N}\int_{-\infty}^{+\infty} B\left(\Pi_i\right) p\left(\Pi_i\right) d\Pi_i, \tag{29}$$

where $B\left(\Pi_i\right)$ is the glitter function defined by equation (24) and $p\left(\Pi_i\right)$ is the probability density function, for which a Gaussian function in one dimension is considered. Substituting into equation (29) the expressions for $B\left(\Pi_i\right)$ and $p\left(\Pi_i\right)$, we have

$$\mu_I = \left\langle I(\mathbf{x})\right\rangle = \frac{1}{N}\sum_{i=1}^{N} \frac{1}{\sigma_{\Pi}\sqrt{2\pi}} \int_{-\infty}^{+\infty} \operatorname{rect}\left[\frac{\Pi_i - \Pi_{oi}}{\left(1 + \Pi_{oi}^2\right)\frac{\beta}{2}}\right] \exp\left(-\frac{\Pi_i^2}{2\sigma_{\Pi}^2}\right) d\Pi_i. \tag{30}$$

The detector angle $\theta_d$ is a function of the position $x$; thus, the specular angle is a function of the distance $x$ from the nadir point of the detector ($n = 0$) to the point $n = i$ (equation 22).

Defining $a_i = \Pi_{oi} - \left(1 + \Pi_{oi}^2\right)\beta/4$ and $b_i = \Pi_{oi} + \left(1 + \Pi_{oi}^2\right)\beta/4$, we can write

$$\mu_I = \left\langle I(\mathbf{x})\right\rangle = \frac{1}{N}\sum_{i=1}^{N}\frac{1}{2}\left[\operatorname{erf}\left(\frac{b_i}{\sqrt{2}\sigma_{\Pi}}\right) - \operatorname{erf}\left(\frac{a_i}{\sqrt{2}\sigma_{\Pi}}\right)\right]. \tag{31}$$

The variance of the intensities in the image $\sigma_I^2$ is defined by (Álvarez-Borrego & Martín-Atienza, 2010)


$$
\sigma_I^2 = \left\langle I^2\left(\mathbf{x}\right)\right\rangle - \left\langle I\left(\mathbf{x}\right)\right\rangle^2 = \frac{1}{N}\sum_{i=1}^{N}\int_{-\infty}^{+\infty}\left[B\left(\Pi_i\right) - \mu_I\right]^2 p\left(\Pi_i\right) d\Pi_i. \tag{32}
$$

However, $B^2\left(\Pi_i\right) = B\left(\Pi_i\right)$, so $\left\langle I^2(\mathbf{x})\right\rangle = \left\langle I(\mathbf{x})\right\rangle$; therefore

$$
\sigma\_I^2 = \left< I(\mathbf{x}) \right> - \left< I(\mathbf{x}) \right>^2 = \mu\_I \left( 1 - \mu\_I \right). \tag{33}
$$

Substituting equation (31) into equation (33), we have

$$
\sigma_I^2 = \frac{1}{N}\sum_{i=1}^{N}\frac{1}{2}\left[\operatorname{erf}\left(\frac{b_i}{\sqrt{2}\sigma_{\Pi}}\right) - \operatorname{erf}\left(\frac{a_i}{\sqrt{2}\sigma_{\Pi}}\right)\right] - \frac{1}{4N^2}\left\{\sum_{i=1}^{N}\left[\operatorname{erf}\left(\frac{b_i}{\sqrt{2}\sigma_{\Pi}}\right) - \operatorname{erf}\left(\frac{a_i}{\sqrt{2}\sigma_{\Pi}}\right)\right]\right\}^2, \tag{34}
$$

which is the required relationship between the variance of the intensities in the image $\sigma_I^2$ and the variance of the surface slopes $\sigma_\Pi^2$.
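Equations (31)-(34) can be evaluated by summing over the surface points. The sketch below (ours; N, $\Delta x$ and H are assumed values) computes $\sigma_I^2$ for a variable detector angle; for very large H it should reduce to the constant-angle result of equation (18).

```python
import numpy as np
from math import erf, sqrt, tan, atan, radians

def image_variance_variable(sigma_pi2, theta_s_deg, H, dx=1.0, N=1000, beta_deg=0.68):
    """Eqs. (31)-(34): image-intensity variance with a variable detector angle.
    Assumes N surface points at spacing dx imaged from height H (hypothetical values)."""
    beta = radians(beta_deg)
    s = sqrt(sigma_pi2)
    mu_terms = []
    for i in range(-N // 2, N // 2):
        theta_d_i = atan(i * dx / H)                           # eq. (22)
        pi_oi = tan((radians(theta_s_deg) + theta_d_i) / 2.0)  # eq. (27)
        a_i = pi_oi - (1.0 + pi_oi**2) * beta / 4.0
        b_i = pi_oi + (1.0 + pi_oi**2) * beta / 4.0
        mu_terms.append(0.5 * (erf(b_i / (sqrt(2) * s)) - erf(a_i / (sqrt(2) * s))))
    mu_I = float(np.mean(mu_terms))   # eq. (31)
    return mu_I * (1.0 - mu_I)        # eq. (33)

print(image_variance_variable(0.03, 10.0, H=1000.0))
```

Sweeping H in this function reproduces the qualitative behavior of figure 6: the ordering of the curves for different $\theta_s$ changes as the camera height grows.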

The relationship between the variance of the surface slopes and the variance of the intensities of the image for different $\theta_s$ angles ($10^o$-$50^o$) is shown in figure 6 (equation 34). The detector is located as shown in figure 4 and the angle subtended by the source is $\beta = 0.68^o$. When the camera detector is at H = 100 m, the behavior of the curves looks similar to the curves shown in Álvarez-Borrego & Martín-Atienza, 2010 (figure 6a). In this case, we can also observe that, for large incidence angles ($40^o$-$50^o$) and small values of the variance of the surface slopes, it is possible to obtain larger values of the variance of the intensities in the image.

If we analyze figure 6j, we can observe that $\sigma_I^2$ increases for lower $\theta_s$ values ($10^o$-$20^o$). These results match the results presented by Álvarez-Borrego in 1993. Figure 6j was made considering H = 1000 m. The reason for this match is that the condition proposed by Álvarez-Borrego in 1993 considers a constant $\theta_d$ value (see figure 2). This condition is similar to having the sensor camera at a very high H value, where the surface slope values are considered almost constant.

Figure 6 shows how these relationships ($\sigma\_I^2$ versus $\sigma\_{\Pi}^2$) change as H becomes larger. Dark lines show the limiting extremes for $\theta\_s$ of 10° and 50°. It can be seen that when H increases to 200 m the 50° line starts to decay and to cross the others. As H goes up, the lines with larger $\theta\_s$ go down until the order of the curves changes. The explanation for this is very simple: if the camera stays at H = 100 m, it will receive more light reflection at large $\theta\_s$ because of the reflection geometry. When H increases, the camera will receive less light reflection from large incidence angles but more light reflection from small incidence angles. Therefore, when the camera is at a larger height, it will receive more reflection from small incidence angles than from large incidence angles. Thus we can say that the results presented by Álvarez-Borrego in 1993, Cureton *et al.*, 2007 and Álvarez-Borrego & Martín-Atienza in 2010 are correct for the Gaussian case.

In certain cases, if we have data corresponding to one $\theta\_s$ value, it is not possible to obtain a single value for the variance of the surface slopes $\sigma\_{\Pi}^2$. To solve this problem, it is necessary to analyze images corresponding to two or more incidence angles and to select a slope variance value which is consistent with all these data (Álvarez-Borrego, 1995).
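The multi-angle selection described above can be sketched as a simple grid search over candidate slope variances. The forward model below is a smooth stand-in, not equation (34), and all numbers are hypothetical:

```python
import numpy as np

def forward_model(theta_s_deg, slope_var):
    # Hypothetical stand-in for the sigma_I^2(sigma_Pi^2, theta_s) curves.
    return slope_var * (1.0 + np.cos(np.radians(theta_s_deg)))

angles = np.array([10.0, 30.0, 50.0])        # two or more incidence angles
measured = forward_model(angles, 0.03)       # pretend observed image variances
candidates = np.linspace(0.001, 0.1, 1000)   # trial slope variances

# Pick the slope variance most consistent with all angles at once.
misfit = [np.sum((forward_model(angles, s) - measured) ** 2) for s in candidates]
best = candidates[int(np.argmin(misfit))]
print(best)  # close to the true value 0.03
```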

Statistical Properties of Surface Slopes via Remote Sensing 63


Fig. 6. Relationship between the variance of the surface slopes and the variance of the intensities of the image for different H values.

From equation (34), we can see that this relation depends on the probability density function of the surface slopes and the geometry of the experiment only.

#### **3.2 Relationship between the correlation functions of the intensities in the image and of the surface slope**

The relationship between the correlation function of the surface slopes $C\_{\Pi}(\tau)$ and the correlation function of the intensities in the image $C\_I(\tau)$ is given by

$$\sigma\_I^2 \mathbf{C}\_I(\tau) = \frac{1}{N} \sum\_{i=1}^{N} \int\_{-\infty}^{+\infty}\int\_{-\infty}^{+\infty} B\left(\Pi\_{1i}\right) B\left(\Pi\_{2i}\right) p\left(\Pi\_{1i}, \Pi\_{2i}\right) d\Pi\_{1i}\,d\Pi\_{2i}, \tag{35}$$

where $p\left(\Pi\_{1i}, \Pi\_{2i}\right)$ is defined by


$$p\left(\Pi\_{1i}, \Pi\_{2i}\right) = \frac{1}{2\pi\sigma\_{\Pi}^{2}\left[1 - \mathbf{C}\_{\Pi}^{2}\left(\tau\right)\right]^{1/2}} \exp\left[-\frac{\Pi\_{1i}^{2} - 2\mathbf{C}\_{\Pi}\left(\tau\right)\Pi\_{1i}\Pi\_{2i} + \Pi\_{2i}^{2}}{2\sigma\_{\Pi}^{2}\left(1 - \mathbf{C}\_{\Pi}^{2}\left(\tau\right)\right)}\right].\tag{36}$$

Although it is possible to obtain an analytical relationship for the first integral, the second integral must be evaluated numerically. Thus, equation (35) can be written as

$$\sigma\_I^2 \mathbf{C}\_I(\tau) = \frac{1}{N} \sum\_{i=1}^{N} \int\_{a\_i}^{b\_i} \frac{\sqrt{2}}{4\sigma\_{\Pi}\sqrt{\pi}} \exp\left[-\frac{\Pi\_2^2}{2\sigma\_{\Pi}^2}\right] \left[\text{erf}\left(\frac{b\_i - \mathbf{C}\_{\Pi}(\tau)\Pi\_2}{\sqrt{2\sigma\_{\Pi}^2\left[1 - \mathbf{C}\_{\Pi}^2(\tau)\right]}}\right) - \text{erf}\left(\frac{a\_i - \mathbf{C}\_{\Pi}(\tau)\Pi\_2}{\sqrt{2\sigma\_{\Pi}^2\left[1 - \mathbf{C}\_{\Pi}^2(\tau)\right]}}\right)\right] d\Pi\_2, \tag{37}$$

where $a\_i = \Pi\_{oi} - \left(1 + \Pi\_{oi}^2\right)\beta/4$ and $b\_i = \Pi\_{oi} + \left(1 + \Pi\_{oi}^2\right)\beta/4$.
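Equation (37) can be evaluated by simple quadrature. The sketch below assumes the remaining $\Pi\_2$ integral runs over $[a\_i, b\_i]$ and uses hypothetical limits; for zero slope correlation it reduces to the average of the squared per-point interval probabilities, which gives a convenient check:

```python
import math
import numpy as np

def _trap(y, x):
    # simple trapezoidal rule (avoids depending on np.trapz/np.trapezoid)
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)

def sigmaI2_CI(c_pi, ab_pairs, sigma_pi=math.sqrt(0.03)):
    # Equation (37): sigma_I^2 * C_I(tau) for a given slope correlation c_pi.
    # The (a_i, b_i) limits are hypothetical illustration values.
    s = math.sqrt(2.0 * sigma_pi ** 2 * (1.0 - c_pi ** 2))
    total = 0.0
    for a, b in ab_pairs:
        p2 = np.linspace(a, b, 4001)
        gauss = (math.sqrt(2.0) / (4.0 * sigma_pi * math.sqrt(math.pi))) \
            * np.exp(-p2 ** 2 / (2.0 * sigma_pi ** 2))
        bracket = np.array([math.erf((b - c_pi * x) / s)
                            - math.erf((a - c_pi * x) / s) for x in p2])
        total += _trap(gauss * bracket, p2)
    return total / len(ab_pairs)

pairs = [(-0.02, 0.03), (-0.01, 0.02)]
print(sigmaI2_CI(0.5, pairs))
```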

In order to avoid computer memory problems, the 16384-point data profile was divided into a number of consecutive intervals. The value of $\theta\_d$ varies from point to point in the profile. For each interval and for each $\theta\_s$ value, the relationship between the correlation functions $C\_I$ and $C\_{\Pi}$ was calculated. Then, the several computed relationships for each $\theta\_s$ value were averaged.
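The interval-averaging procedure can be sketched as follows; the profile here is a synthetic AR(1) sequence standing in for real sun-glint data, and the interval count is arbitrary:

```python
import numpy as np

# Split a long profile into consecutive intervals, estimate the normalized
# correlation function in each interval, then average the estimates.
rng = np.random.default_rng(1)
profile = np.zeros(16384)
for n in range(1, profile.size):           # synthetic AR(1) stand-in profile
    profile[n] = 0.9 * profile[n - 1] + rng.normal()

n_intervals, max_lag = 16, 32
chunks = profile.reshape(n_intervals, -1)  # consecutive 1024-point intervals

def correlation(x, max_lag):
    x = x - x.mean()
    c = np.array([np.dot(x[:x.size - k], x[k:]) / x.size for k in range(max_lag)])
    return c / c[0]                        # normalized correlation function

avg_corr = np.mean([correlation(c, max_lag) for c in chunks], axis=0)
print(avg_corr[:3])
```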

In this case we used a value of $\sigma\_{\Pi}^2 = 0.03$. The correlation function of the intensities in the image is not normalized. Similar to the behavior of the variances, the curves evolve in a similar way when H increases. A theoretical variance $\sigma\_I^2$ can be calculated from equation (37). Table 2 lists the values of the image variance used to normalize the correlations in figure 7 for different values of $\theta\_s$ and H (100, 500, 1000 and 5000 m).

#### **4. Geometry of the model (Non-Gaussian case considering a variable detector angle)**

The model, considering $\theta\_d$ as a variable, is shown in figure 4. We think this is a more realistic situation.

#### **4.1 Relationships among the variances of the intensities in the image and surface slopes considering a non-Gaussian probability density function**

The mean of the image intensity $\mu\_I$ may be written as (Álvarez-Borrego & Martín-Atienza, 2010):


Table 2. Values of the image variance used to normalize the correlations in figure 7 for different values of $\theta\_s$ and H.

$$\mu\_{\rm I} = \left\langle I(x) \right\rangle = \frac{1}{N} \sum\_{i=1}^{N} \int\_{-\infty}^{+\infty} B(\Pi\_i) p\left(\Pi\_i\right) d\Pi\_i \tag{38}$$

where $B(\Pi\_i)$ is the glitter function defined by equation (24) and $p(\Pi\_i)$ is the probability density function; a non-Gaussian function is considered in one dimension (Cureton, 2010):

$$p\left(\Pi\_{i}\right) = \frac{1}{\sigma\_{\Pi}\sqrt{2\pi}}\exp\left(-\frac{\Pi\_{i}^{2}}{2\sigma\_{\Pi}^{2}}\right)\left[1 + \frac{1}{6}\lambda\_{\Pi}^{(3)}\left\{\left(\frac{\Pi\_{i}}{\sigma\_{\Pi}}\right)^{3} - 3\left(\frac{\Pi\_{i}}{\sigma\_{\Pi}}\right)\right\} + \frac{1}{24}\lambda\_{\Pi}^{(4)}\left\{\left(\frac{\Pi\_{i}}{\sigma\_{\Pi}}\right)^{4} - 6\left(\frac{\Pi\_{i}}{\sigma\_{\Pi}}\right)^{2} + 3\right\}\right],\tag{39}$$

where $\lambda\_{\Pi}^{(3)}$ is the skewness, $\lambda\_{\Pi}^{(4)}$ is the kurtosis and $\sigma\_{\Pi}$ is the standard deviation of the surface slopes.
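Because the skewness and kurtosis corrections in equation (39) are Hermite-polynomial terms, they integrate to zero against the Gaussian factor and the density still integrates to one. A quick numerical check (with illustrative $\lambda$ values, not the Cox & Munk figures):

```python
import numpy as np

def p_slope(x, sigma, lam3, lam4):
    # Equation (39): Gram-Charlier-type slope density.
    t = x / sigma
    gauss = np.exp(-t ** 2 / 2.0) / (sigma * np.sqrt(2.0 * np.pi))
    corr = (1.0 + (lam3 / 6.0) * (t ** 3 - 3.0 * t)
            + (lam4 / 24.0) * (t ** 4 - 6.0 * t ** 2 + 3.0))
    return gauss * corr

sigma = np.sqrt(0.03)
x = np.linspace(-10.0 * sigma, 10.0 * sigma, 20001)
y = p_slope(x, sigma, 0.1, 0.2)            # illustrative lambda values
total = float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)  # trapezoid rule
print(total)  # ≈ 1
```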

Substituting the expressions for $B(\Pi\_i)$ and $p(\Pi\_i)$ into equation (38), we have

| H (m) | $\theta\_s$ (°) | $\sigma\_{\Pi}^2$ | $\sigma\_I^2$ |
|------|------|------|------|
| 100 | 10 | 0.03 | 0.00003160564 |
| 100 | 20 | 0.03 | 0.00005271762 |
| 100 | 30 | 0.03 | 0.00014855790 |
| 100 | 40 | 0.03 | 0.00058990210 |
| 100 | 50 | 0.03 | 0.00195377600 |
| 500 | 10 | 0.03 | 0.00015853820 |
| 500 | 20 | 0.03 | 0.00023902050 |
| 500 | 30 | 0.03 | 0.00043911520 |
| 500 | 40 | 0.03 | 0.00107317300 |
| 500 | 50 | 0.03 | 0.00269619900 |
| 1000 | 10 | 0.03 | 0.00031712280 |
| 1000 | 20 | 0.03 | 0.00047002010 |
| 1000 | 30 | 0.03 | 0.00078709770 |
| 1000 | 40 | 0.03 | 0.00161060600 |
| 1000 | 50 | 0.03 | 0.00344703200 |
| 5000 | 10 | 0.03 | 0.00158160000 |
| 5000 | 20 | 0.03 | 0.00228022000 |
| 5000 | 30 | 0.03 | 0.00332568200 |
| 5000 | 40 | 0.03 | 0.00498063700 |
| 5000 | 50 | 0.03 | 0.00723998800 |


Fig. 7. Relationship between the correlation function of the surface slopes and the correlation function of the intensities in the image.

$$\mu\_I = \frac{1}{N} \sum\_{i=1}^{N} \frac{1}{\sigma\_{\Pi}\sqrt{2\pi}} \int\_{-\infty}^{+\infty} \text{rect}\left[\frac{\Pi\_i - \Pi\_{oi}}{\left(1 + \Pi\_{oi}^2\right)\frac{\beta}{2}}\right] \exp\left(-\frac{\Pi\_i^2}{2\sigma\_{\Pi}^2}\right)\left[1 + \frac{1}{6}\lambda\_{\Pi}^{(3)}\left\{\left(\frac{\Pi\_i}{\sigma\_{\Pi}}\right)^3 - 3\left(\frac{\Pi\_i}{\sigma\_{\Pi}}\right)\right\} + \frac{1}{24}\lambda\_{\Pi}^{(4)}\left\{\left(\frac{\Pi\_i}{\sigma\_{\Pi}}\right)^4 - 6\left(\frac{\Pi\_i}{\sigma\_{\Pi}}\right)^2 + 3\right\}\right] d\Pi\_i. \tag{40}$$

The detector angle $\theta\_d$ is a function of the position x; thus, the specular angle is a function of the distance x from the nadir point of the detector, n = 0, to the point n = i (see equation (22)).

Writing again $a\_i = \Pi\_{oi} - \left(1 + \Pi\_{oi}^2\right)\beta/4$ and $b\_i = \Pi\_{oi} + \left(1 + \Pi\_{oi}^2\right)\beta/4$, we can write

$$\mu\_I = \frac{1}{N}\sum\_{i=1}^{N}\left\{\begin{array}{l} \left[\text{erf}\left(\frac{b\_i}{\sqrt{2}\sigma\_{\Pi}}\right) - \text{erf}\left(\frac{a\_i}{\sqrt{2}\sigma\_{\Pi}}\right)\right]\left[\frac{1}{2} + \frac{1}{8}\lambda\_{\Pi}^{(4)}\left(1 - 3\sigma\_{\Pi}^2\right)\right] \\[2mm] {}+ \exp\left(-\frac{a\_i^2}{2\sigma\_{\Pi}^2}\right)\left[\frac{\lambda\_{\Pi}^{(3)}}{6\sqrt{2\pi}\sigma\_{\Pi}^2}\left(a\_i^2 - \sigma\_{\Pi}^2\right) + \frac{\lambda\_{\Pi}^{(4)}a\_i}{24\sqrt{2\pi}\sigma\_{\Pi}^3}\left(a\_i^2 - 3\sigma\_{\Pi}^2\right)\right] \\[2mm] {}+ \exp\left(-\frac{b\_i^2}{2\sigma\_{\Pi}^2}\right)\left[\frac{\lambda\_{\Pi}^{(3)}}{6\sqrt{2\pi}\sigma\_{\Pi}^2}\left(\sigma\_{\Pi}^2 - b\_i^2\right) + \frac{\lambda\_{\Pi}^{(4)}b\_i}{24\sqrt{2\pi}\sigma\_{\Pi}^3}\left(3\sigma\_{\Pi}^2 - b\_i^2\right)\right] \end{array}\right\} \tag{41}$$
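Equation (41) can be coded directly. The sketch below mirrors the formula as printed; the $(a\_i, b\_i)$ limits and the $\lambda$ values are hypothetical, and with $\lambda\_{\Pi}^{(3)} = \lambda\_{\Pi}^{(4)} = 0$ it reduces to the plain Gaussian mean $\frac{1}{2N}\sum\_i\left[\text{erf}\left(b\_i/\sqrt{2}\sigma\_{\Pi}\right) - \text{erf}\left(a\_i/\sqrt{2}\sigma\_{\Pi}\right)\right]$:

```python
import math

def mu_I(ab_pairs, sigma, lam3, lam4):
    # Closed form of equation (41) as printed in the chapter.
    s2pi = math.sqrt(2.0 * math.pi)
    total = 0.0
    for a, b in ab_pairs:
        d = math.erf(b / (math.sqrt(2.0) * sigma)) \
            - math.erf(a / (math.sqrt(2.0) * sigma))
        term = d * (0.5 + lam4 * (1.0 - 3.0 * sigma ** 2) / 8.0)
        term += math.exp(-a ** 2 / (2.0 * sigma ** 2)) * (
            lam3 * (a ** 2 - sigma ** 2) / (6.0 * s2pi * sigma ** 2)
            + lam4 * a * (a ** 2 - 3.0 * sigma ** 2) / (24.0 * s2pi * sigma ** 3))
        term += math.exp(-b ** 2 / (2.0 * sigma ** 2)) * (
            lam3 * (sigma ** 2 - b ** 2) / (6.0 * s2pi * sigma ** 2)
            + lam4 * b * (3.0 * sigma ** 2 - b ** 2) / (24.0 * s2pi * sigma ** 3))
        total += term
    return total / len(ab_pairs)

pairs = [(-0.02, 0.03), (-0.01, 0.02)]  # hypothetical limits
print(mu_I(pairs, math.sqrt(0.03), 0.1, 0.2))
```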



The variance of the intensities in the image $\sigma\_I^2$ is defined by equation (33). Substituting equation (41) into equation (33) we have

$$\sigma\_I^2 = \frac{1}{N}\sum\_{i=1}^{N}\Gamma\_i - \left(\frac{1}{N}\sum\_{i=1}^{N}\Gamma\_i\right)^2, \tag{42}$$

where $\Gamma\_i$ denotes the term in braces in equation (41),

which is the required relationship between the variance of the intensities in the image $\sigma\_I^2$ and the variance of the surface slopes $\sigma\_{\Pi}^2$ when a non-Gaussian probability density function is considered.

The relationship between the variance of the surface slopes and the variance of the intensities of the image for different $\theta\_s$ angles (10°–50°) is shown in figures 8 and 9 (equation 42). Figures 8 and 9 show this relationship considering the skewness, and the skewness and kurtosis, in the non-Gaussian probability density function, respectively. We can see that the behavior of the curves looks very similar to the Gaussian case (figure 6). The values for skewness and kurtosis were taken from a table presented by Plant (2003), from data given by Cox & Munk (1956), for a wind speed of 13.3 m/s with the wind sensor at 12.5 m above the sea surface level.

The curves including the skewness, and the skewness and kurtosis, are slightly higher for small values of $\sigma\_{\Pi}^2$ compared with the Gaussian case (figure 6), except when $\theta\_s$ is below 40°, where the Gaussian and non-Gaussian cases (considering skewness only) are inverted for small surface slope variances; these results show that $\sigma\_I^2$ increases for higher $\theta\_s$ values (figures 8a and 9a). Cox & Munk (1956) reported $\sigma\_{\Pi}^2$ values of 0.04 and 0.05 as maximum values of the surface slopes in the wind direction, and values of 0.03 in the crosswind direction, for wind speeds greater than 10 m/s. Thus, we think that in the range of $\sigma\_{\Pi}^2$ from 0 to 0.05 the curves are clearly separated from one another (figures 8a and 9a). If we analyze figures 8j and 9j we can observe that $\sigma\_I^2$ increases for lower $\theta\_s$ values (10°–20°).

Figures 8 and 9 show how these relationships ($\sigma\_I^2$ versus $\sigma\_{\Pi}^2$) change as H becomes larger, when the skewness, and the skewness and kurtosis, are considered. These curves behave the same as in the Gaussian case, and the explanation for this inversion is the same as given before.


Fig. 8. Relationship between the variance of the surface slopes and the variance of the intensities of the image, for different H values, considering a non-Gaussian probability density function where only the skewness has been taken into account.

Regarding the non-Gaussian case, we can conclude that the main difference from the Gaussian case is the lower values of the variance of the intensities of the image for small values of surface slope variance when $\theta\_s$ is in the 40°–50° range with H=100 m. In addition, when



Fig. 9. Relationship between the variance of the surface slopes and the variance of the intensities of the image, for different H values, considering a non-Gaussian probability density function where the skewness and kurtosis have been taken into account.

H=1000 m this condition is inverted: we can find smaller values of the variance of the intensities of the image for small values of surface slope variance when $\theta\_s$ is in the 10°–20° range. For the other angles, in both cases, it is not possible to see significant differences between 10°–30° when H=100 m and 30°–50° when H=1000 m.


#### **4.2 Relationship between the correlation functions of the intensities in the image and of the surface slope considering a non-Gaussian probability density function**

As mentioned before, our analysis involves three random processes: the surface profile, its surface slopes $\Pi(x)$ and the image $I(x)$. Each process has a correlation function, and it was shown in (Álvarez-Borrego, 1993) that these three functions are related.

The relationship between the correlation function of the surface slopes $C\_{\Pi}(\tau)$ and the correlation function of the intensities in the image $C\_I(\tau)$ is given by

$$
\sigma\_I^2 \mathbf{C}\_I(\tau) = \frac{1}{N} \sum\_{i=1}^{N} \int\_{-\infty}^{+\infty}\int\_{-\infty}^{+\infty} B\left(\Pi\_{1i}\right) B\left(\Pi\_{2i}\right) p\left(\Pi\_{1i}, \Pi\_{2i}\right) d\Pi\_{1i}\,d\Pi\_{2i}, \tag{43}
$$

where $p\left(\Pi\_{1i}, \Pi\_{2i}\right)$ is defined by (Cureton, 2010)

$$p\left(\Pi\_{1i},\Pi\_{2i}\right) = \frac{1}{2\pi\sigma\_{\Pi}^2\left[1-\mathbf{C}\_{\Pi}^2(\tau)\right]^{1/2}}\exp\left[-\frac{\Pi\_{1i}^2 - 2\mathbf{C}\_{\Pi}(\tau)\Pi\_{1i}\Pi\_{2i} + \Pi\_{2i}^2}{2\sigma\_{\Pi}^2\left(1-\mathbf{C}\_{\Pi}^2(\tau)\right)}\right] \times \left\{1 + \frac{1}{6}\left[\begin{array}{l}\lambda\_{\Pi}^{(30)}\left[\left(\frac{\Pi\_{1i}}{\sigma\_{\Pi}}\right)^3 - 3\sigma\_{\Pi}^2\left(\frac{\Pi\_{1i}}{\sigma\_{\Pi}}\right)\right] \\[2mm] {}+ 3\lambda\_{\Pi}^{(21)}\left[\left(\frac{\Pi\_{1i}}{\sigma\_{\Pi}}\right)^2\left(\frac{\Pi\_{2i}}{\sigma\_{\Pi}}\right) - \sigma\_{\Pi}^2\left(\frac{\Pi\_{2i}}{\sigma\_{\Pi}}\right) + 2\sigma\_{\Pi}^2\mathbf{C}\_{\Pi}(\tau)\left(\frac{\Pi\_{1i}}{\sigma\_{\Pi}}\right)\right] \\[2mm] {}+ 3\lambda\_{\Pi}^{(12)}\left[\left(\frac{\Pi\_{1i}}{\sigma\_{\Pi}}\right)\left(\frac{\Pi\_{2i}}{\sigma\_{\Pi}}\right)^2 - \sigma\_{\Pi}^2\left(\frac{\Pi\_{1i}}{\sigma\_{\Pi}}\right) + 2\sigma\_{\Pi}^2\mathbf{C}\_{\Pi}(\tau)\left(\frac{\Pi\_{2i}}{\sigma\_{\Pi}}\right)\right] \\[2mm] {}+ \lambda\_{\Pi}^{(03)}\left[\left(\frac{\Pi\_{2i}}{\sigma\_{\Pi}}\right)^3 - 3\sigma\_{\Pi}^2\left(\frac{\Pi\_{2i}}{\sigma\_{\Pi}}\right)\right]\end{array}\right]\right\}, \tag{44}$$

where $\lambda\_{\Pi}^{(03)}$ and $\lambda\_{\Pi}^{(30)}$ are the skewness terms, and $\lambda\_{\Pi}^{(12)}$ and $\lambda\_{\Pi}^{(21)}$ relate the cross moments of $\Pi\_{1i}$ and $\Pi\_{2i}$.

Although it is possible to obtain an analytical relationship for the first integral, the second integral must be evaluated numerically. Thus, equation (43) can be written as

$$\sigma\_I^2 \mathbf{C}\_I(\tau) = \frac{1}{N} \sum\_{i=1}^{N} \int\_{a\_i}^{b\_i} \exp\left(-\frac{\Pi\_{2i}^2}{2\sigma\_{\Pi}^2}\right)\left[\begin{array}{l}\exp\left[-\left(u b\_i + v\Pi\_{2i}\right)^2\right]\left(A\_1\Pi\_{2i}^2 + B\_1 b\_i\Pi\_{2i} + C\_1\right) \\[2mm] {}+ \exp\left[-\left(u a\_i + v\Pi\_{2i}\right)^2\right]\left(A\_2\Pi\_{2i}^2 + B\_2 a\_i\Pi\_{2i} + C\_2\right) \\[2mm] {}+ \left[\text{erf}\left(u b\_i + v\Pi\_{2i}\right) - \text{erf}\left(u a\_i + v\Pi\_{2i}\right)\right]\left(A\_3\Pi\_{2i}^3 + B\_3\Pi\_{2i} + C\_3\right)\end{array}\right] d\Pi\_{2i}, \tag{45}$$

where

$$u = \frac{1}{\sqrt{2\sigma\_{\Pi}^2\left[1 - \mathbf{C}\_{\Pi}^2(\tau)\right]}},$$



$$v = -\,\mathbf{C}\_{\Pi}(\tau)\,u,$$

and the coefficients $A\_1$, $B\_1$, $C\_1$, $A\_2$, $B\_2$, $C\_2$, $A\_3$ and $B\_3$ are combinations of $\mathbf{C}\_{\Pi}(\tau)$, $\sigma\_{\Pi}$, the skewness terms $\lambda\_{\Pi}^{(30)}$ and $\lambda\_{\Pi}^{(03)}$, the cross moments $\lambda\_{\Pi}^{(21)}$ and $\lambda\_{\Pi}^{(12)}$, and the limits $a\_i$ and $b\_i$, with

$$\mathcal{C}\_3 = \frac{\sqrt{2}}{4\sqrt{\pi}\sigma\_{\text{II}}}.$$

Figure 10 graphically shows the relationship between the normalized correlation function of the surface slopes $C\_{\Pi n}$ and the normalized correlation function of the intensities of the image $C\_{In}$. In this case $\sigma\_{\Pi}^2 = 0.03$ was used. When H increases, the curves evolve in a similar way to the variance curves.

When H=100 m (figure 10a), the curves for $\theta\_s$ of 10°–20° have an "unusual" behavior for low surface slope variances when compared with the Gaussian case. This is because the inversion of the curves starts at lower values of H. In order to avoid computer memory problems, the 16384-point data profile was divided into a number of consecutive intervals. The value of $\theta\_d$ varies from point to point in the profile. For each interval and for each $\theta\_s$ value, the relationship between the correlation functions $C\_I$ and $C\_{\Pi}$ was calculated. Then, all the computed relationships for each $\theta\_s$ value were averaged.

A theoretical variance $\sigma\_I^2$ can be calculated from equation (45). Table 3 lists the values of the image variance used to normalize the correlations in figure 10 for different values of $\theta\_s$ and H (100, 500, 1000 and 5000 m).


Fig. 10. Relationship between the correlation function of the surface slopes and the correlation function of the intensities in the image. The curves correspond to different values of *s* .


## **6. Acknowledgments**

This work was partially supported by CONACyT with grant No. 102007 and SEP-PROMET/103.5/10/5021 (UABC-PTC-225).

## **7. References**

Álvarez-Borrego, J. (1987). Optical analysis of two simulated images of the sea surface. *Proceedings SPIE International Society of the Optical Engineering,* Vol.804, pp. 192-200, ISSN 0277-786X

Álvarez-Borrego, J. (1993). Wave height spectrum from sun glint patterns: an inverse problem. *Journal of Geophysical Research,* Vol.98, No.C6, pp. 10245-10258, ISSN 0148-0227

Álvarez-Borrego, J. (1995). Some statistical properties of surface heights via remote sensing. *Journal of Modern Optics,* Vol.42, No.2, pp. 279-288, ISSN 0950-0340

Álvarez-Borrego, J. & Machado, M. A. (1985). Optical analysis of a simulated image of the sea surface. *Applied Optics,* Vol.24, No.7, pp. 1064-1072, ISSN 1559-128X

Álvarez-Borrego, J. & Martín-Atienza, B. (2010). An improved model to obtain some statistical properties of surface slopes via remote sensing using variable reflection angle. *IEEE Transactions on Geoscience and Remote Sensing,* Vol.48, No.10, pp. 3647-3651, ISSN 0196-2892

Bréon, F. M. & Henrist, N. (2006). Spaceborne observations of ocean glint reflectance and modeling of wave slope distributions. *Journal of Geophysical Research,* Vol.111, C06005, ISSN 0148-0227

Chapman, R. D. & Irani, G. B. (1981). Errors in estimating slope spectra from wave images. *Applied Optics,* Vol.20, No.20, pp. 3645-3652, ISSN 1559-128X

Chapron, B.; Vandemark, D. & Elfouhaily, T. (2002). On the skewness of the sea slope probability distribution. *Gas Transfer at Water Surfaces,* Vol.127, pp. 59-63, ISSN 0875909868

Cox, C. & Munk, W. (1954a). Statistics of the sea surface derived from sun glitter. *Journal of Marine Research,* Vol.13, No.2, pp. 198-227, ISSN 0022-2402

Cox, C. & Munk, W. (1954b). Measurements of the roughness of the sea surface from photographs of the Sun's glitter. *Journal of the Optical Society of America,* Vol.24, No.11, pp. 838-850, ISSN 1084-7529

Cox, C. & Munk, W. (1955). Some problems in optical oceanography. *Journal of Marine Research,* Vol.14, pp. 63-78, ISSN 0022-2402

Cox, C. & Munk, W. (1956). Slopes of the sea surface deduced from photographs of sun glitter. *Bulletin of the Scripps Institution of Oceanography,* Vol.6, No.9, pp. 401-488

Cureton, G. P. (2010). *Retrieval of nonlinear spectral information from ocean sunglint*. PhD thesis, Curtin University of Technology, Australia, March

Cureton, G. P.; Anderson, S. J.; Lynch, M. J. & McGann, B. T. (2007). Retrieval of wind wave elevation spectra from sunglint data. *IEEE Transactions on Geoscience and Remote Sensing,* Vol.45, No.9, pp. 2829-2836, ISSN 0196-2892

Fuks, I. M. & Charnotskii, M. I. (2006). Statistics of specular points at a randomly rough surface. *Journal of the Optical Society of America, Optical Image Science,* Vol.23, No.1, pp. 73-80, ISSN 1084-7529


Table 3. Values of the image variance in order to normalize the correlations in figure 10 for different values for *<sup>s</sup>* and H.

## **5. Conclusions**

We derive the variance of the surface heights from the variance of the intensities in the image via remote sensing considering a glitter function given by equation (12) when the geometry consider a detector angle of 0*<sup>o</sup> <sup>d</sup>* , and considering a glitter function given by the equation (24) considering a geometrically improved model with variable detector line of sight angle, given by figure 4. In this last case, we consider Gaussian statistics and non-Gaussian statistics. We derive the variance of the surface slopes from the variance of the intensities of remote sensed images for different H values. In addition, we discussed the determination of the correlation function of the surface slopes from the correlation function of the image intensities considering Gaussian and non-Gaussian statistics.

Analyzing the variances curves for Gaussian and non-Gaussian case it is possible to see the behavior of the curves for different incident angles when H increases. This behavior agrees with the results presented by Álvarez-Borrego (1993) and Geoff Cureton *et al*. 2007, and Álvarez-Borrego and Martin-Atienza (2010) for the Gaussian case.

These new results solve the inverse problem when it is necessary to analyze the statistical of a real sea surface via remote sensing using the image of the glitter pattern of the marine surface.

## **6. Acknowledgments**

This work was partially supported by CONACyT with grant No. 102007 and SEP-PROMET/103.5/10/5021 (UABC-PTC-225).

## **7. References**

72 Remote Sensing – Advanced Techniques and Platforms

100 10 0.03 0.003126364 100 20 0.03 0.004354971 100 30 0.03 0.006071378 100 40 0.03 0.008187813 100 50 0.03 0.009875824 500 10 0.03 0.012038690 500 20 0.03 0.011886750 500 30 0.03 0.009668245 500 40 0.03 0.006645083 500 50 0.03 0.003959459 1000 10 0.03 0.012945720 1000 20 0.03 0.010339930 1000 30 0.03 0.006902623 1000 40 0.03 0.004036960 1000 50 0.03 0.002067475 5000 10 0.03 0.011358240 5000 20 0.03 0.007713670 5000 30 0.03 0.004572885 5000 40 0.03 0.002406005 5000 50 0.03 0.001022463 Table 3. Values of the image variance in order to normalize the correlations in figure 10 for

We derive the variance of the surface heights from the variance of the intensities in the image via remote sensing considering a glitter function given by equation (12) when the

the equation (24) considering a geometrically improved model with variable detector line of sight angle, given by figure 4. In this last case, we consider Gaussian statistics and non-Gaussian statistics. We derive the variance of the surface slopes from the variance of the intensities of remote sensed images for different H values. In addition, we discussed the determination of the correlation function of the surface slopes from the correlation function

Analyzing the variances curves for Gaussian and non-Gaussian case it is possible to see the behavior of the curves for different incident angles when H increases. This behavior agrees with the results presented by Álvarez-Borrego (1993) and Geoff Cureton *et al*. 2007, and

These new results solve the inverse problem when it is necessary to analyze the statistical of a real sea surface via remote sensing using the image of the glitter pattern of the marine

*<sup>d</sup>* , and considering a glitter function given by

of the image intensities considering Gaussian and non-Gaussian statistics.

Álvarez-Borrego and Martin-Atienza (2010) for the Gaussian case.

2 

2 *I*

*s*

H

different values for

**5. Conclusions** 

surface.

*<sup>s</sup>* and H.

geometry consider a detector angle of 0*<sup>o</sup>*


**4**

**Classification of Pre-Filtered**

*1National Aerospace University* 

*2University of Rennes 1 3Plymouth Marine Laboratory* 

> *1Ukraine 2France 3UK*

**Multichannel Remote Sensing Images** 

Vladimir Lukin1, Nikolay Ponomarenko1, Dmitriy Fevralev1,

Multichannel remote sensing (RS) has gained popularity and has been successfully applied for solving numerous practical tasks as forestry, agriculture, hydrology, meteorology, ecology, urban area and pollution control, etc. (Chang, 2007). Using the term "multichannel", we mean a wide set of imaging approaches and RS systems (complexes) including multifrequency and dual/multi polarization radar (Oliver & Quegan, 2004), multi- and hyperspectral optical and infrared sensors. While for such radars the number of formed images is a few, the number of channels (components or sub-bands) in images can be tens, hundreds and even more than one thousand for optical/infrared imagers. TerraSAR-X is a good example of modern multichannel radar system; AVIRIS, HYDICE, HYPERION and others can serve as examples of modern hyperspectral imagers, both

An idea behind increasing the number of channels is clear and simple: it is possible to expect that more useful information can be extracted from more data or this information is more reliable and accurate. However, the tendency to increasing the channels' (sub-band) number has also its "black" side. One has to register, to process, to transmit and to store more data. Even visualization of the obtained multichannel images for their displaying at tristimuli monitors becomes problematic (Zhang et al., 2008). Huge size of the obtained data leads to difficulties at any standard stage of multichannel image processing involving calibration, georeferencing, compression if used (Zabala et al., 2006). But, probably, the most essential

a. Noise characteristics in multichannel image components can be considerably different in the sense of noise type (additive, multiplicative, signal-dependent, mixed), statistics (probability density function (PDF), variance), spatial correlation (Kulemin et al., 2004;

airborne and spaceborne (Landgrebe, 2002; Schowengerdt, 2007).

problems arise in image pre-filtering and classification. The complexity of these tasks deals with the following:

Barducci et al., 2005, Uss et al., 2011, Aiazzi et al., 2006);

**1. Introduction** 

Benoit Vozel2, Kacem Chehdi2 and Andriy Kurekin3


## **Classification of Pre-Filtered Multichannel Remote Sensing Images**

Vladimir Lukin1, Nikolay Ponomarenko1, Dmitriy Fevralev1, Benoit Vozel2, Kacem Chehdi2 and Andriy Kurekin3 *1National Aerospace University* 

*2University of Rennes 1 3Plymouth Marine Laboratory 1Ukraine 2France 3UK* 

## **1. Introduction**

74 Remote Sensing – Advanced Techniques and Platforms

Gaskill, J. D. (1978). *Linear systems, Fourier transform, and optics.* John Wiley & Sons. ISBN 0-

Longuet-Higgins, M. S. (1962). The statistical geometry of random surfaces. *Proceedings Symposium Applied Mathematics 1960 13th Hydrodynamic Instability,* pp. 105-143 Longuet-Higgins, M. S.; Cartwright, D. E. & Smith, N. D. (1963). Observations of the

Munk, W. (2009). An inconvenient sea truth: spread, steepness, and skewness of surface slopes. *Annual Review of Marine Sciences,* Vol.1, pp. 377-415, ISSN 1941-1405 Papoulis, A. (1981). *Probability, Random Variables, and Stochastic Processes,* chapter 9, McGraw-

Peppers, N. & Ostrem, J. S. (1978). Determination of wave slopes from photographs of the

Plant, W. J. (2003). A new interpretation of sea-surface slope probability density functions. *Journal of Geophysical Research,* Vol.108, No.C9, 3295, ISSN 0148-0227 Stilwell, D. Jr. (1969). Directional energy spectra of the sea from photographs. *Journal of* 

Stilwell, D. Jr. & Pilon, R. O. (1974). Directional spectra of surface waves from photographs. *Journal of Geophysical Research,* Vol.79, No.9, pp.1277-1284, ISSN 0148-0227

*Wave Spectra,* Prentice-Hall, Englewood Cliffs, N. J. (Ed.), 111-136

*Geophysical Research,* Vol.74, No.8, pp. 1974-1986, ISSN 0148-0227

directional spectrum of sea waves using the motions of a floating buoy, In: *Ocean* 

ocean surface: A new approach. *Applied Optics,* Vol.17, No.21, pp. 3450-3458, ISSN

471-29288-5, New York, USA

1559-128X

Hill, ISBN 0-07-119981-0, New York, USA

Multichannel remote sensing (RS) has gained popularity and has been successfully applied for solving numerous practical tasks in forestry, agriculture, hydrology, meteorology, ecology, urban area and pollution control, etc. (Chang, 2007). By the term "multichannel" we mean a wide set of imaging approaches and RS systems (complexes), including multifrequency and dual/multi-polarization radar (Oliver & Quegan, 2004) and multi- and hyperspectral optical and infrared sensors. While for such radars the number of formed images is only a few, the number of channels (components or sub-bands) can be tens, hundreds and even more than one thousand for optical/infrared imagers. TerraSAR-X is a good example of a modern multichannel radar system; AVIRIS, HYDICE, HYPERION and others can serve as examples of modern hyperspectral imagers, both airborne and spaceborne (Landgrebe, 2002; Schowengerdt, 2007).

The idea behind increasing the number of channels is clear and simple: it is possible to expect that more useful information can be extracted from more data, or that this information is more reliable and accurate. However, the tendency toward increasing the number of channels (sub-bands) also has its "dark" side. One has to register, process, transmit and store more data. Even visualization of the obtained multichannel images on tristimulus monitors becomes problematic (Zhang et al., 2008). The huge size of the obtained data leads to difficulties at every standard stage of multichannel image processing, including calibration, georeferencing and compression if used (Zabala et al., 2006). But probably the most essential problems arise in image pre-filtering and classification.

The complexity of these tasks stems from the following:

a. Noise characteristics in multichannel image components can be considerably different in the sense of noise type (additive, multiplicative, signal-dependent, mixed), statistics (probability density function (PDF), variance) and spatial correlation (Kulemin et al., 2004; Barducci et al., 2005; Uss et al., 2011; Aiazzi et al., 2006);

b. These characteristics can be a priori unknown or only partly known; the signal-to-noise ratio can vary considerably from one component image to another (Kerekes & Baum, 2003) and even from one data cube of multichannel data to another obtained for different imaging missions;

c. Although there are numerous books and papers devoted to image filter design and performance analysis (Plataniotis & Venetsanopoulos, 2000; Elad, 2010), they mainly deal with grayscale and color image processing; there are certain similarities between multichannel image filtering and color image denoising, but the former case is considerably more complicated;

d. Recently, several papers describing possible approaches to multichannel image filtering have appeared (De Backer et al., 2008; Amato et al., 2009; Benedetto et al., 2010; Renard et al., 2006; Chen & Qian, 2011; Demir et al., 2011; Pizurica & Philips, 2006; Renard et al., 2008); a positive feature of some of these papers is that they study the efficiency of denoising together with classification accuracy; this seems to be a correct approach, since classification (in a wide sense) is the final goal of multichannel RS data exploitation and filtering is only a pre-requisite for better classification; there are two main drawbacks of these papers: noise is either simulated, with additive white Gaussian noise (AWGN) usually considered as a model, or the aforementioned peculiarities of noise in real-life images are not taken into account;

e. Though the efficiency of filtering and classification are to be studied together, there is no well-established correlation between the quantitative criteria commonly used in filtering (and lossy compression), such as mean square error (MSE), peak signal-to-noise ratio (PSNR) and some others, and criteria of classification accuracy, such as probability of correct classification (PCC), misclassification matrix, anomaly detection probability and others (Christophe et al., 2005);

f. One problem in studying classification accuracy is the availability of numerous classifiers currently applied to multichannel images, such as neural network (NN) ones (Plaza et al., 2008), Support Vector Machines (SVM) and their modifications (Demir et al., 2011), different statistical and clustering tools (Jeon & Landgrebe, 1999), the Spectral Angle Mapper (SAM) (Renard et al., 2008), etc.;

g. It is quite difficult to establish which classifier is the best with application to multichannel RS data, because classifier performance depends upon many factors, such as the methodology of learning, parameters (e.g., the number of layers and neurons in them for NN), the number of classes and features' separability, etc.; it seems that many researchers simply exploit one or two classifiers that are either available as ready computer tools or for which the users have certain experience;

h. Dimensionality reduction, especially for hyperspectral data, is often used to simplify classification, to accelerate learning, to avoid dealing with spectral bands for which signal-to-noise ratios (SNRs) are quite low (Chen & Qian, 2011) due to atmospheric effects, and to exploit only data from those sub-bands that are the most informative for solving a given particular task (Popov et al., 2011); however, it is not clear how to perform dimensionality reduction in an optimal manner and how filtering influences dimensionality reduction;

i. Test multichannel images for which it could be possible to analyze the efficiency of filtering and the accuracy of classification are absent; because of this, people either artificially add noise of quite high level to real-life data (that seem practically noise-free) or characterize the efficiency of denoising by the "final result", i.e. by the increase of the PCC (Chen & Qian, 2011).
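Since points (e) and (i) contrast filtering criteria (MSE, PSNR) with the classification criterion PCC, a minimal sketch of the three metrics may help fix notation; the helper names are ours, not the chapter's:

```python
import numpy as np

def mse(ref, test):
    """Mean square error between a reference and a distorted image."""
    return float(np.mean((np.asarray(ref, float) - np.asarray(test, float)) ** 2))

def psnr(ref, test, dynamic_range=255.0):
    """Peak signal-to-noise ratio in dB for a given dynamic range."""
    return 10.0 * np.log10(dynamic_range ** 2 / mse(ref, test))

def pcc(true_labels, predicted_labels):
    """Probability of correct classification: fraction of pixels
    assigned to the correct class."""
    t, p = np.asarray(true_labels), np.asarray(predicted_labels)
    return float(np.mean(t == p))
```

Note that nothing ties a given PSNR gain to a given PCC gain, which is exactly the gap point (e) describes.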

It follows from the aforesaid that it is impossible to take into account all the factors mentioned above. Thus, it seems reasonable to concentrate on several particular aspects. Therefore, within this Chapter we first analyze the information component and noise characteristics of multichannel data. In our opinion, this is needed for a better understanding of the peculiarities of the requirements to filtering and of what approaches to denoising can be applied. All these questions are thoroughly discussed in Section 2, taking into account recent advances in the theory and practice of image filtering. Besides, we briefly consider some aspects of classifier training in Section 3. Section 4 deals with the analysis of classification results for three-channel data created on the basis of Landsat images with artificially added noise. Throughout the Chapter, we present examples from real-life RS images of different origin to provide generality of analysis and conclusions.

One can expect that more efficient filtering leads to better classification. This expectation is, in general, correct. However, considering image filtering, one should always keep in mind that along with noise removal (a positive effect) any filter produces distortions and artefacts (negative effects) that influence RS data classification as well. Because of this, filtering, to be worth applying, has to provide more positive effects than negative ones from the viewpoint of solving the final task, RS data classification in the considered case.

## **2. Approaches to multichannel image filtering**

#### **2.1 Information content and noise characteristics**

Speaking very simply, the benefits of multichannel remote sensing compared to the single-channel mode are due to the following reasons. First, the availability of multichannel (especially hyperspectral) data allows solving many particular tasks: while one subset of sub-band data is "optimal" for one particular task, another subset is "optimal" for solving another task. Thus, multichannel remote sensing is multi-purpose, allowing different users to be satisfied with data collected at one time for a given territory. Second, useful information is often extracted by exploiting a certain similarity of information content in component images and the practical independence of noise in these components. Thus, the effective SNR increases due to forming and processing more sub-band images.
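The second point, the growth of the effective SNR when many sub-bands with similar content but independent noise are combined, can be checked numerically: averaging N such bands reduces the noise standard deviation by roughly sqrt(N). A synthetic, purely illustrative sketch:

```python
import numpy as np

rng = np.random.default_rng(1)

# one "information" signal shared by all bands
signal = 10.0 * np.sin(np.linspace(0.0, 4.0 * np.pi, 1000))
n_bands, sigma = 16, 2.0

# identical information content per band, independent noise realizations
bands = [signal + rng.normal(0.0, sigma, signal.size) for _ in range(n_bands)]
averaged = np.mean(bands, axis=0)

noise_single = np.std(bands[0] - signal)   # about sigma
noise_avg = np.std(averaged - signal)      # about sigma / sqrt(n_bands)
```

With 16 bands the residual noise standard deviation drops to roughly a quarter of the single-band value, i.e. the sqrt(N) gain that motivates joint processing of sub-bands.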

Really, the correlation of information content in multichannel RS data is usually high. Let us give one example. Consider hyperspectral data provided by the AVIRIS airborne system (available at http://aviris.jpl.nasa.gov/aviris) that can be represented as $I(i,j,\lambda)$, where $i = 1,\dots,I_{im}$ and $j = 1,\dots,J_{im}$ denote the image coordinates and $\lambda$ is the wavelength; $\lambda_k,\ k = 1,\dots,K$ defines the wavelength of the $k$-th sub-band (the total number of sub-bands for AVIRIS images is 224). Let us analyze cross-correlation factors determined for neighbouring $k$-th and $k{+}1$-th sub-band images as

$$R^{k,k+1} = \left(\sum_{i=1}^{I_{im}} \sum_{j=1}^{J_{im}} (I(i,j,\lambda_k) - I_{mean}(\lambda_k))(I(i,j,\lambda_{k+1}) - I_{mean}(\lambda_{k+1}))\right) / \left(I_{im} J_{im} \sigma_k \sigma_{k+1}\right) \tag{1}$$

where

$$I_{mean}(\lambda_k) = \sum_{i=1}^{I_{im}} \sum_{j=1}^{J_{im}} I(i,j,\lambda_k) / (I_{im} J_{im})$$


$$
\sigma_k^2 = \sum_{i=1}^{I_{im}} \sum_{j=1}^{J_{im}} \left( I(i,j,\lambda_k) - I_{mean}(\lambda_k) \right)^2 \left/ \left( I_{im} J_{im} - 1 \right) \right.
$$
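Equation (1), together with the definitions of $I_{mean}(\lambda_k)$ and $\sigma_k$, translates directly into code; a minimal NumPy sketch (the function and variable names are ours, not the chapter's):

```python
import numpy as np

def band_cross_correlation(cube):
    """Cross-correlation factors R^{k,k+1} between neighbouring
    sub-band images of a hyperspectral cube of shape (I_im, J_im, K),
    following Eq. (1) with the sigma_k definition above."""
    I_im, J_im, K = cube.shape
    R = np.empty(K - 1)
    for k in range(K - 1):
        da = cube[:, :, k].astype(float) - cube[:, :, k].mean()
        db = cube[:, :, k + 1].astype(float) - cube[:, :, k + 1].mean()
        # sample standard deviations with the (I_im*J_im - 1) divisor
        sigma_a = np.sqrt((da ** 2).sum() / (I_im * J_im - 1))
        sigma_b = np.sqrt((db ** 2).sum() / (I_im * J_im - 1))
        R[k] = (da * db).sum() / (I_im * J_im * sigma_a * sigma_b)
    return R
```

Two identical neighbouring bands give a value near unity, while an uncorrelated (e.g., pure-noise) band gives a value near zero, which is exactly the behaviour discussed for Fig. 1 below.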

The obtained plot for AVIRIS data is presented in Fig. 1. It is seen that for most neighbouring sub-bands the values of $R^{k,k+1}$ are close to unity, confirming the high correlation (very similar content) of these images. There are such $k$ for which $R^{k,k+1}$ differs considerably from unity. In particular, this happens for several first sub-bands, several last sub-bands, and sub-bands with $k$ about 110 and 160. The main reason for this is the presence of noise.

Fig. 1. The plot $R^{k,k+1}$, $k = 1,\dots,223$, for the AVIRIS image Moffett Field

To prove this, let us present data from the papers (Ponomarenko et al., 2006) and (Lukin et al., 2010b). Based on blind estimates $\hat{\sigma}_{ad}^2(k)$ of the additive noise variance in sub-band images, robust modified estimates $PSNR_{mod}$ have been obtained for all channels (modifications have been introduced for reducing the influence of hot pixel values):

$$PSNR_{mod}(k) = 10\log_{10}\left(\left(I_{99\%}(k) - I_{1\%}(k)\right)^2 / \hat{\sigma}_{ad}^2(k)\right) \tag{2}$$

where $I_{q\%}(k)$ denotes the $q$-th percentile of the image values in the $k$-th sub-band image.
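Equation (2) can be sketched as follows; this is an illustration in NumPy where `sigma_ad` stands for the blind estimate $\hat{\sigma}_{ad}(k)$, which the chapter obtains by a separate blind-estimation procedure not shown here:

```python
import numpy as np

def psnr_mod(band, sigma_ad):
    """Robust modified PSNR of Eq. (2): the dynamic range is taken
    between the 1st and 99th percentiles, so hot pixels and outliers
    do not inflate it."""
    d_rob = np.percentile(band, 99) - np.percentile(band, 1)
    return 10.0 * np.log10(d_rob ** 2 / sigma_ad ** 2)
```

Using percentiles rather than the raw minimum and maximum is what makes the estimate robust: a handful of hot pixels changes $I_{max}(k) - I_{min}(k)$ drastically but barely moves $I_{99\%}(k) - I_{1\%}(k)$.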

The plot is presented in Fig. 2. Comparing the plots in Figures 1 and 2, it can be concluded that rather small $R^{k,k+1}$ are observed for those subintervals of $k$ for which $PSNR_{mod}(k)$ is also quite small. Thus, there is a strict relation between these parameters.

There is also a relation between $PSNR_{mod}(k)$ and the SNR for sub-band images analyzed in Ref. (Curran & Dungan, 1989). In this sense, one important peculiarity of multichannel (especially hyperspectral) data is to be stressed. The dynamic range of the data in sub-band images, characterized by $I_{max}(k) - I_{min}(k)$ (the maximal and minimal values for a given $k$-th sub-band), varies a lot. Note that, to avoid problems with hot pixels and outliers in the data, it is also possible to characterize the dynamic range by $I_{99\%}(k) - I_{1\%}(k)$, as exploited in (2).

The plot of $D_{rob}(k) = I_{99\%}(k) - I_{1\%}(k)$ is presented in Fig. 3. It follows from its analysis that the general tendency is a decrease of $D_{rob}(k)$ as $k$ (and wavelength) increases, with sharp jumps down for sub-bands where atmospheric absorption and other physical effects take place. Though both $PSNR_{mod}(k)$ and SNR can characterize the noise influence (intensity) in images, we prefer to analyze $PSNR_{mod}(k)$ and $PSNR$ below as parameters more commonly used in the practice of filter efficiency analysis. Strictly speaking, $PSNR_{mod}(k)$ differs from the traditional PSNR, but for images without outliers this difference is not large and the tendencies observed for $PSNR$ hold for $PSNR_{mod}(k)$ as well.

Fig. 2. $PSNR_{mod}(k)$ for the same image as in Fig. 1

Fig. 3. Robustly estimated dynamic range $D_{rob}(k)$

Noise characteristics in multichannel image channels can be rather different as well. The situation when the noise type itself differs between channels happens very seldom (it is possible if, e.g., optical and synthetic aperture radar (SAR) data are fused (Gungor & Shan, 2006), where an additive noise model is typical for optical data and multiplicative noise is natural for radar data). The case in which the same type of noise is present in all component images is met much more often. However, the noise type can be non-trivial and the noise characteristics (e.g., variance) can change within rather wide limits. Let us give one example. The estimated standard deviation (STD) of additive noise for all sub-band images of the considered AVIRIS data is presented in Fig. 4. As is seen, the estimates vary a lot. Even though these are estimates of limited accuracy, the observed variations clearly demonstrate that the noise statistics are not constant.
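To make the distinction between additive and signal-dependent components concrete, here is a toy mixed noise model (not the actual AVIRIS noise model; all parameter values are arbitrary and chosen only for demonstration): an additive Gaussian term of fixed variance plus a zero-mean term whose variance grows with the local intensity.

```python
import numpy as np

def add_mixed_noise(img, sigma_add=2.0, k_sd=0.05, rng=None):
    """Corrupt an image with additive Gaussian noise (variance
    sigma_add**2) plus zero-mean signal-dependent noise whose
    variance equals k_sd * img, so brighter pixels are noisier."""
    if rng is None:
        rng = np.random.default_rng()
    noise_add = rng.normal(0.0, sigma_add, img.shape)
    noise_sd = rng.normal(0.0, 1.0, img.shape) * np.sqrt(k_sd * img)
    return img + noise_add + noise_sd
```

For a flat region of intensity $I$ the total noise variance is $\sigma_{add}^2 + k_{sd} I$, which is why a single "noise STD" per band, as in Fig. 4, is only a partial description once the signal-dependent component dominates.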

A more thorough analysis (Uss et al., 2011) shows that noise is not purely additive but signal dependent even for data provided by such old hyperspectral sensors as AVIRIS. Sufficient variations of signal dependent noise parameters from one band to another are observed. Recent studies (Barducci et al., 2005, Alparone et al., 2006) demonstrate a clear tendency for signal-dependent noise component to become prevailing (over additive one) for new generation hyperspectral sensors. This means that special attention should be paid to this tendency in filter design and efficiency analysis with application to multichannel data denoising and classification. Although the methods of multichannel image denoising designed on basis of the additive noise model with identical variance in all component

Classification of Pre-Filtered Multichannel Remote Sensing Images 81


Fig. 4. Estimated STD of noise vs. sub-band index k for components of the same AVIRIS image

images can provide a certain degree of noise removal, they are surely not optimal for the considered task.

Consider one more example. Figure 5 presents the two components (HH and VV polarizations) of a 512x512 pixel fragment of a dual-polarization SAR image of Indonesia formed by the TerraSAR-X spaceborne system (http://www.infoterra.de/tsx/freedata/start.php). Amplitude images were formed from the complex-valued data offered at this site. As can be seen, the HH and VV images are similar to each other, although both are corrupted by fully developed speckle, and there are some differences in backscattering intensity for specific small-sized objects on the water surface (dark pixels in the left part of the images). The value of the cross-correlation factor (1) equals 0.63, i.e. it is quite small. Both images have been denoised separately by the DCT-based filter adapted to the multiplicative nature of the noise (with the same characteristics for both images) and to the spatial correlation of the speckle (Ponomarenko et al., 2008a). The filtered images are presented in Fig. 6, where it is seen that speckle has been effectively suppressed. Filtering has considerably increased the inter-channel correlation: it equals 0.85 for the denoised images. This indirectly confirms that low values of the inter-channel correlation factor in original RS data can be due to noise.
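The inter-channel correlation values quoted above (0.63 before and 0.85 after denoising) are normalized correlation coefficients between two co-registered channel images. As a minimal sketch (the exact expression is formula (1), defined earlier in the chapter; this is the standard normalized form, not necessarily the authors' exact implementation):

```python
# Sketch of an inter-channel cross-correlation factor between two
# co-registered channel images, given as flat lists of intensities.
from math import sqrt

def cross_correlation(img_a, img_b):
    """Normalized cross-correlation coefficient of two equal-size images."""
    n = len(img_a)
    mean_a = sum(img_a) / n
    mean_b = sum(img_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(img_a, img_b))
    var_a = sum((a - mean_a) ** 2 for a in img_a)
    var_b = sum((b - mean_b) ** 2 for b in img_b)
    return cov / sqrt(var_a * var_b)
```

Independent noise realizations in the two channels pull this coefficient down, which is why effective denoising of each channel raises it.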

Fig. 5. The 512x512 pixel fragments of the SAR image of Indonesia for two polarizations (HH and VV)


Fig. 6. The SAR images after denoising

The given example for dual-polarization SAR data is also typical in the sense that the noise in component images can be non-additive (speckle is purely multiplicative) and non-Gaussian (for the considered single-look amplitude SAR images it has a Rayleigh distribution). For the presented example of HH and VV polarization images, the statistical and spatial correlation characteristics of the speckle are practically identical in both component images, but this is not always the case for multichannel radar images.

The presented results clearly demonstrate that noise in multichannel RS images can be signal-dependent, i.e. its variance (and sometimes even its PDF) depends upon the information signal (image). Noise statistics can also vary from one sub-band image to another. These peculiarities have to be taken into account in multichannel image simulation, filter and classifier design, and performance analysis.
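For simulation purposes, signal-dependent noise of the kind described above can be sketched as follows. The specific mixed model (constant additive variance plus a component proportional to the local signal) and all parameter values are our illustrative assumptions, not the chapter's calibrated model:

```python
# Illustrative simulation of mixed signal-dependent noise: at each pixel
# the noise variance is sigma_add^2 + k_sd * I(i, j)  (assumed model).
import random

def add_mixed_noise(image, sigma_add=2.0, k_sd=0.1, seed=0):
    """image: flat list of pixel intensities; returns a noisy copy."""
    rng = random.Random(seed)
    noisy = []
    for value in image:
        sigma = (sigma_add ** 2 + k_sd * value) ** 0.5
        noisy.append(value + rng.gauss(0.0, sigma))
    return noisy
```

With `k_sd = 0` this degenerates to pure AWGN; varying `sigma_add` and `k_sd` per sub-band mimics noise statistics that change from one channel to another.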

## **2.2 Component-wise and vector filtering**

If one deals with 3D data such as multichannel RS images, the idea immediately arises that filtering can be carried out either component-wise or in a vector (3D) manner. This was understood more than 20 years ago, when researchers and engineers ran into the necessity of processing colour RGB images (Astola et al., 1990). Whilst for colour images there are really only these two ways, for multichannel images there is also a compromise variant: processing not the entire 3D volume of data but certain groups (sets) of channels (sub-bands) (Uss et al., 2011). As an analogue of this situation, we can refer to the filtering of video, where a set of subsequent frames can be used for denoising (Dabov et al., 2007). There is also the possibility of applying denoising only to some, but not all, component images. In this sense, it is worth mentioning the paper (Philips et al., 2009). It is demonstrated there that pre-filtering of some sub-band images can make them useful for improving hyperspectral data classification carried out using reduced sets of the most informative channels. However, the proposed solution (applying the median filter component-wise with scanning windows of different sizes) is, in our opinion, not the best choice.


Thus, there are quite a few options, and each has its own advantages and drawbacks. Keeping in mind the peculiarities of images and noise discussed above, let us start with the simplest case: component-wise filtering. It is clear that more efficient filtering leads, in general, to better classification (although strict relationships between the conventional quantitative criteria characterizing filtering efficiency and classifier performance have not been established yet). Therefore, let us revisit recent achievements and advances in the theory and practice of grayscale image filtering and analyze to what degree they can be useful for hyperspectral image denoising.

Recall that the case of additive white Gaussian noise (AWGN) in images has been studied most often. Recently, the theoretical limits of denoising efficiency in terms of output mean square error (MSE) within the non-local filtering approach have been obtained (Chatterjee & Milanfar, 2010). The authors presented results for a wide variety of test images and noise variance values. Moreover, they provided software that allows calculating the potential (minimal reachable) output MSE for a given noise-free grayscale image and a given standard deviation of AWGN. Later, in the paper (Chatterjee & Milanfar, 2011), it was shown how the potential output MSE can be accurately predicted for a noisy image at hand.

This allows drawing the following important conclusions. First, the potential reduction of output MSE compared to the AWGN variance in the original image depends upon image complexity and noise intensity. The reduction is large if an image is quite simple and the noise variance is large, i.e. if the input SNR (and PSNR) of the image to be filtered is low. For textural images and high input SNR, the potential output MSE can be only 1.2...1.5 times smaller than the AWGN variance (see also data in the papers (Lukin et al., 2011; Ponomarenko et al., 2011; Fevralev et al., 2011)). This means that filtering becomes practically inefficient, in the sense that the positive effect of noise removal is almost "compensated" by the negative effect of the distortions that any denoising method introduces to a smaller or larger degree. Applied to hyperspectral data filtering, this leads to the aforementioned idea that not all component images have to be filtered. The preliminary conclusion, then, is that sub-band images with rather high SNR should be kept untouched, whilst the others can be denoised. A question then arises: what can be the (automatic) rules for deciding which sub-band images to denoise and which to leave unfiltered? Unfortunately, such rules and automatic procedures have not been proposed and tested yet. As preliminary considerations, we can state only that if the input PSNR is larger than 35 dB, it is hard to provide a PSNR improvement due to filtering of more than 2...3 dB. Moreover, for input PSNR > 35 dB, AWGN in original images is almost invisible (it can be observed only in homogeneous image regions with rather small mean intensity). Because of this, denoised and original component images might seem almost identical (Fevralev et al., 2011). The question then arises whether it is worth denoising such component images with rather large input PSNR at all, in the sense of the positive impact of filtering on classification accuracy. We will return to this question in Section 4.
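The preliminary selection idea above can be sketched as a simple decision rule. The rule itself is our illustration (no such automatic procedure has been proposed and tested yet, as stated above); the 35 dB threshold and 8-bit peak value are assumptions:

```python
# Hypothetical sub-band selection rule: denoise only those sub-bands
# whose input PSNR (from an estimated noise variance) is below 35 dB.
from math import log10

def input_psnr(noise_variance, peak=255.0):
    """PSNR (dB) of a noisy image, given its (estimated) noise variance."""
    return 10.0 * log10(peak ** 2 / noise_variance)

def subbands_to_filter(noise_variances, threshold_db=35.0):
    """Indices of sub-bands whose input PSNR falls below the threshold."""
    return [k for k, var in enumerate(noise_variances)
            if input_psnr(var) < threshold_db]
```

For example, a sub-band with noise variance 30 has an input PSNR of about 33.4 dB and would be filtered, whereas one with variance 10 (about 38.1 dB) would be left untouched.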

The second important conclusion that follows from the analysis in (Chatterjee & Milanfar, 2010) is that the best performance in grayscale image filtering is currently provided by methods belonging to the non-local denoising group (Elad, 2010; Foi et al., 2007; Kervrann & Boulanger, 2008). The best orthogonal-transform-based methods are comparable to non-local ones in efficiency, especially if the processed images are not too simple (Lukin et al., 2011a). Let us see how efficient these methods can be when applied to component-wise processing of multichannel RS data.


Although noise is mostly signal-dependent in component images of hyperspectral data, there are certain sub-bands where the dynamic range is quite small and the additive noise component is dominant or comparable to the signal-dependent one (Uss et al., 2011; Lukin et al., 2011b). One such image (sub-band 221 of the AVIRIS data set Cuprite) is presented in Fig. 7,a. Noise is clearly seen in this image, and the estimated variance of the additive noise component is about 30. The output image of the BM3D filter (Foi et al., 2007), which is currently the best among non-local denoisers, is given in Fig. 7,b. Noise is suppressed and all details and edges are preserved well.

Fig. 7. Original sub-band 221 of the AVIRIS image Cuprite (a) and the output of the BM3D filter (b)

However, applying non-local filters becomes problematic if the noise does not fit the (dominant) AWGN model considered above. There are several problems and few known ways out. The first problem is that non-local denoising methods are mostly designed for the removal of AWGN. Recall that these methods are based on searching for similar patches in a given image. The search becomes much more complicated if the noise is not additive and, especially, if it is spatially correlated. One way out is to apply a properly selected homomorphic variance-stabilizing transform to convert signal-dependent noise into purely additive noise and then to use non-local filtering (Mäkitalo et al., 2010). This is possible for certain types of signal-dependent noise (Deledalle et al., 2011; see also www.cs.tut.fi/~foi/optvst). Thus, the considered processing procedure becomes applicable under the conditions that the noise in an image is of known type, its characteristics are known or properly (accurately) pre-estimated, and the corresponding pair of homomorphic transforms exists. Examples of signal-dependent noise types for which such transforms exist are pure multiplicative noise (the direct transform is of logarithmic type), Poisson noise (Anscombe transform), mixed Poisson and pure additive noise (generalized Anscombe transform), and others.
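Two of the transforms named above can be sketched compactly. This is an illustrative sketch of the textbook forms, not the authors' implementation:

```python
# Variance-stabilizing transforms named in the text: the Anscombe
# transform for Poisson noise and the logarithmic transform for pure
# multiplicative noise.
from math import sqrt, log

def anscombe(x):
    """Anscombe transform: maps Poisson-distributed data to data with
    approximately unit-variance additive noise (good for means above ~4)."""
    return 2.0 * sqrt(x + 3.0 / 8.0)

def log_transform(x):
    """Logarithmic transform: pure multiplicative noise I = s * n
    becomes additive, since log(I) = log(s) + log(n)."""
    return log(x)
```

After denoising in the transformed domain, the corresponding inverse transform (with a bias correction in the Anscombe case) returns the data to the original intensity scale.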


Let us demonstrate the applicability of the three-stage filtering procedure (direct homomorphic transform – non-local denoising – inverse homomorphic transform) for noise removal in SAR images corrupted by pure multiplicative noise (speckle). The output of this procedure applied to the single-look SAR image in Fig. 5 (HH) is presented in Fig. 8,a. Details and edges are preserved well and speckle is sufficiently suppressed.

Fig. 8. The HH SAR image after denoising by the three-stage procedure (a) and vector DCT-based filtering (b)

The second problem is that the search for similar patches becomes problematic under spatially correlated noise: the similarity of patches can be due to similarity of noise realizations rather than similarity of information content. Then the noise reduction ability of non-local denoising methods decreases and artefacts can appear. The problem of searching for similar blocks (8x8 pixel patches) has been considered in (Ponomarenko et al., 2010), but the proposed method has been applied to blind estimation of the noise spatial spectrum in the DCT domain, not to image filtering within the non-local framework. The obtained estimates of the DCT spatial spectrum have then been used to improve the performance of the DCT-based filter (Ponomarenko et al., 2008). Note that adaptation to the spatial spectrum of the noise in image filtering leads to a substantial improvement of output image quality according to both conventional criteria and visual quality metrics (Lukin et al., 2008).

Finally, the third problem concerns accurate estimation of the statistical characteristics of signal-dependent noise (Zabrodina et al., 2011). Even assuming a proper variance-stabilizing transform exists, e.g., the generalized Anscombe transform (Murtagh et al., 1995) for mixed Poisson-like and additive noise, the parameters of the transform have to be adjusted to the mixed noise statistics. If the statistical characteristics of the mixed noise are estimated inaccurately, variance stabilization is imperfect, and this leads to a reduction of filtering efficiency. Note that blind estimation of mixed noise parameters cannot nowadays provide sufficiently accurate estimates for all images and all possible sets of mixed noise parameters (Zabrodina et al., 2011). Besides, non-local filtering methods are usually not fast enough, since the search for similar patches requires intensive computations.


As an alternative to three-stage procedures that employ non-local filtering, it is possible to advise using locally adaptive DCT-based filtering (Ponomarenko et al., 2011). Under the condition of an a priori known or accurately pre-estimated dependence of the signal-dependent noise variance on the local mean, $\sigma_{sd}^2 = f(\bar{I})$, it is easy to adapt the local thresholds for hard thresholding of DCT coefficients in each *nm*-th block as

$$T(n,m) = \beta \sqrt{f(\hat{\bar{I}}(n,m))}\tag{3}$$

where $\hat{\bar{I}}(n,m)$ is the estimate of the local mean for this block and $\beta$ is the parameter (for hard thresholding, $\beta = 2.6$ is recommended). If the noise is spatially correlated and its normalized spatial spectrum $W_{norm}(k,l)$ is known in advance or accurately pre-estimated, the threshold also becomes frequency-dependent:

$$T(n,m,k,l) = \beta \sqrt{W_{norm}(k,l)\, f(\hat{\bar{I}}(n,m))}\tag{4}$$

where *k* and *l* are frequency indices in DCT domain.
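The adaptive hard thresholding of (3) can be sketched for one block as follows. The DCT computation itself and the dependence `f(.)` are placeholders; keeping the DC coefficient untouched is a standard practice we assume here, not something this chapter specifies:

```python
# Minimal sketch of hard thresholding (3) for one nm-th block:
# zero every DCT coefficient (except DC) whose magnitude is below
# beta * sqrt(f(local_mean)).
def hard_threshold_block(dct_coeffs, local_mean, f, beta=2.6):
    """dct_coeffs: 2D list of the block's DCT coefficients;
    local_mean:  estimated local mean of the block;
    f:           dependence sigma_sd^2 = f(mean) of noise variance on mean."""
    threshold = beta * (f(local_mean) ** 0.5)
    out = [row[:] for row in dct_coeffs]
    for k in range(len(out)):
        for l in range(len(out[k])):
            if (k, l) != (0, 0) and abs(out[k][l]) < threshold:
                out[k][l] = 0.0
    return out
```

The frequency-dependent variant (4) would simply scale the threshold by $\sqrt{W_{norm}(k,l)}$ inside the inner loop.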

One more option is to apply the modified sigma filter (Lukin et al., 2011b), where the neighbourhood for the current *ij*-th pixel is formed as

$$I_{\min}(i,j) = I(i,j) - \alpha_{sig} \sqrt{f(I(i,j))},\quad I_{\max}(i,j) = I(i,j) + \alpha_{sig} \sqrt{f(I(i,j))}\tag{5}$$

where $\alpha_{sig}$ is the parameter, commonly set equal to 2 (Lee, 1980), and averaging is carried out over all image values in the *ij*-th scanning window position that belong to the interval defined by (5). This algorithm is very simple but not as efficient as DCT-based filtering under the same conditions (Tsymbal et al., 2005). Moreover, the sigma filter cannot be adapted to spatially correlated noise in any way.
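One scanning-window step of this modified sigma filter can be sketched as follows; the window contents and the dependence `f(.)` are illustrative placeholders:

```python
# One scanning-window step of the modified sigma filter (5): average
# only those window values that fall inside the interval around the
# central pixel defined by alpha_sig and the noise model f(.).
def sigma_filter_step(window, center, f, alpha_sig=2.0):
    """window: list of pixel values in the scanning window;
    center: value I(i, j) of the current pixel."""
    half_width = alpha_sig * (f(center) ** 0.5)
    lo, hi = center - half_width, center + half_width
    selected = [v for v in window if lo <= v <= hi]
    return sum(selected) / len(selected)
```

An outlier such as a speckle spike far outside the interval is simply excluded from the average, which is what gives the sigma filter its edge-preserving behaviour.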

Finally, if there is no information on $\sigma_{sd}^2 = f(\bar{I})$ and $W_{norm}(k,l)$, it is possible to use an adaptive DCT-based filter version designed for removing non-stationary noise (Lukin et al., 2010a). However, for efficient filtering, it is worth exploiting all the information on noise characteristics that is either available or can be retrieved from a given image.

Let us now consider possible approaches to vector filtering of multichannel RS data. Again, let us start with theory and recent achievements. First of all, it has recently been shown theoretically that the potential output MSE for vector (3D) processing is considerably better (smaller) than for component-wise filtering of color RGB images, by 1.6…2.2 times (Uss et al., 2011b). This is due to exploiting the inherent inter-channel correlation of the signal components. Then, if a larger number of channels are processed together and the inter-channel correlation factor is larger than for RGB color images (where it is about 0.8), one can expect even better efficiency of 3D filtering.

Similar effects, but concerning practical output MSEs, have been demonstrated for the 3D DCT-based filter (Ponomarenko et al., 2008b) and the vector modified sigma filter (Kurekin et al., 1999; Lukin et al., 2006; Zelensky et al., 2002) applied to color and multichannel RS images. It is shown in these papers that vector processing provides a substantial benefit in filtering efficiency (up to 2 dB) for the cases of three-channel image processing with similar noise

Classification of Pre-Filtered Multichannel Remote Sensing Images 87

maximal efficiency of denoising according to a given quantitative criteria depends upon a filtered image, noise intensity (variance for AWGN case), thresholding type, and a metric used. In particular, for hard thresholding which is the most popular and rather efficient,

for all considered images and noise intensities. This means that if

Interestingly, if the visual quality metric PSNR-HVS-M (Ponomarenko et al., 2007) is employed as criterion of filtering efficiency, the corresponding optimal value is

one wishes to provide better visual quality of filtered image, edge/detail/texture

preservation is to be paid main attention (better preservation is provided if

(a) (b)

accuracy but with making the classification task simpler.

Fig. 9. Noise free (a) and noisy (b) test images, additive noise variance is equal to 100

In this Section, we would like to avoid a thorough discussion on possible classification approaches with application to multichannel RS images. An interested reader is addressed to (Berge & Solberg, 2004), (Melgani & Bruzzone, 2004), (Ainsworth et al., 2007), etc. General observations of modern tendencies for hyperspectral images are the following. Although there are quite many different classifiers (see Introduction), neural network, support vector machine and SAM are, probably, the most popular ones. One reason for using NN and SVM classifiers is their ability to better cope with non-gaussianity of features. Dimensionality reduction (there are numerous methods) is usually carried out without loss in classification

Classifier performance depends upon many factors as number of classes, their separability in feature space, classifier type and parameters, a methodology of training used and a training sample size, etc. If training is done in supervised manner (which is more popular for classification application), training data set should contain, at least, hundreds of feature

is usually slightly larger than 2.6 if an image is quite simple, noise is intensive

*opt* ). For complex images and small

is smaller).

*opt* is usually slightly smaller than 2.6.

optimal

0.85 *PSNR HVS M PSNR*

*opt opt*

 

**3. Classifiers and their training** 

and output PSNR or MSE are used as criteria ( *PSNR*

variance of noise (input PSNR>32..34 dB), *PSNR*

intensities in component images. This, in turn, improves classification of multichannel RS data (Lukin et al., 2006, Zelensky et al., 2002).

However, there are specific effects that might happen if 3D filtering is applied without careful taking into account noise characteristics in component images (and the corresponding pre-processing). For the vector sigma filter, the 3D neighborhood can be formed according to (5) for any a priori known dependences *f*(.) that can be individual for each component image. This is one advantage of this filter that, in fact, requires no preprocessing operations as, e.g., homomorphic transformations. Another advantage is that if noise is of different intensity in component images processed together, then the vector sigma filter considerably improves the quality of the component image(s) with the smallest SNR. A drawback is that filtering for other components is not so efficient. The aforementioned property can be useful for hyperspectral data for which it seems possible to enhance component images with low SNR by proper selection of other component images (with high SNR) to be processed jointly (in the vector manner). However, this idea needs solid verification in future.

For the 3D DCT-based filtering, two practical situations have been considered. The first one is AWGN with equal variances in all components (Fevralev et al., 2011). Channel decorrelation and processing in fully overlapping 8x8 blocks are applied. This approach provides a 1…2 dB improvement over component-wise DCT-based processing of color images according to output PSNR and the visual quality metric PSNR-HVS-M (Ponomarenko et al., 2007). The second situation is different types of noise and/or different variances of noise in the component images to be processed together. Then the noise type has to be converted to additive by the corresponding variance-stabilizing transforms, and the images are to be normalized (stretched) to have equal variances. After this, the 3D DCT-based filter is to be applied. Otherwise, e.g., if noise variances are not the same, oversmoothing can be observed for the component images with smaller variance values whilst undersmoothing can take place for the components with larger variances. To illustrate the performance of this method, we have applied it to a dual-polarization SAR image composed of the images presented in Fig. 5. Identical logarithmic transforms have been applied first, separately for each component, to get two images corrupted by pure additive noise with equal variance values. Then, the 3D DCT-based filtering has been used with the frequency-dependent thresholds set as $T(k,l) = \beta\,\sigma_{adc} W_{norm}(k,l)$, where $\sigma_{adc}$ denotes the additive noise standard deviation after the direct homomorphic transform. Finally, identical inverse homomorphic transforms have been performed for each component image. The obtained filtered HH component image is presented in Fig. 8,b. Speckle is suppressed even better than in the image in Fig. 8,a and edge/detail preservation is good as well.
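A minimal single-channel sketch of DCT-based hard-threshold denoising in overlapping 8x8 blocks may clarify the scheme. Function and parameter names are our own illustrative choices; the chapter's 3D variant additionally stacks decorrelated component images into the transform, and its threshold includes the frequency-dependent weight W(k,l), which is omitted here for brevity.

```python
import numpy as np
from scipy.fft import dctn, idctn

def dct_denoise(img, sigma, beta=2.6, bs=8, step=4):
    """Block-DCT hard-threshold denoising (sketch).

    Each overlapping bs x bs block is transformed, coefficients with
    magnitude below beta*sigma are zeroed, and the inverse transforms
    of all blocks covering a pixel are averaged.
    """
    h, w = img.shape
    acc = np.zeros((h, w))
    cnt = np.zeros((h, w))
    for i in range(0, h - bs + 1, step):
        for j in range(0, w - bs + 1, step):
            block = img[i:i + bs, j:j + bs].astype(float)
            coef = dctn(block, norm='ortho')
            keep = np.abs(coef) >= beta * sigma   # hard thresholding
            keep[0, 0] = True                     # always keep the DC term
            rec = idctn(coef * keep, norm='ortho')
            acc[i:i + bs, j:j + bs] += rec
            cnt[i:i + bs, j:j + bs] += 1
    return acc / np.maximum(cnt, 1)
```

For speckled data the call would be wrapped in the homomorphic scheme described above: `np.log` of the image first, `dct_denoise` with the post-transform `sigma`, then `np.exp` of the result.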

Note that vector filtering of multichannel images can be useful not only for more efficient denoising but also for decreasing residual errors of image co-registration (Kurekin et al., 1997). Its application results in fewer misclassifications in the neighborhoods of sharp edges.

As can be seen, the DCT-based filtering methods use the parameter β that, in general, can be varied. Analysis of the influence of this parameter on filtering efficiency for the three-channel Landsat image visualized in RGB in Fig. 9 has been carried out in (Fevralev et al., 2010). A similar analysis, but for standard grayscale images, has been performed in (Ponomarenko et al., 2011). It has been established that the optimal value of β that provides


maximal efficiency of denoising according to a given quantitative criterion depends upon the filtered image, the noise intensity (variance in the AWGN case), the thresholding type, and the metric used. In particular, for hard thresholding, which is the most popular and rather efficient, the optimal value $\beta_{opt}^{PSNR}$ is usually slightly larger than 2.6 if an image is quite simple, noise is intensive, and output PSNR or MSE is used as criterion. For complex images and small variance of noise (input PSNR > 32…34 dB), $\beta_{opt}^{PSNR}$ is usually slightly smaller than 2.6. Interestingly, if the visual quality metric PSNR-HVS-M (Ponomarenko et al., 2007) is employed as the criterion of filtering efficiency, the corresponding optimal value is $\beta_{opt}^{PSNR\text{-}HVS\text{-}M} \approx 0.85\,\beta_{opt}^{PSNR}$ for all considered images and noise intensities. This means that if one wishes to provide better visual quality of the filtered image, main attention is to be paid to edge/detail/texture preservation (better preservation is provided if β is smaller).

Fig. 9. Noise-free (a) and noisy (b) test images; additive noise variance equals 100

## **3. Classifiers and their training**

In this Section, we would like to avoid a thorough discussion of possible classification approaches with application to multichannel RS images. An interested reader is referred to (Berge & Solberg, 2004), (Melgani & Bruzzone, 2004), (Ainsworth et al., 2007), etc. General observations of modern tendencies for hyperspectral images are the following. Although there are quite a few different classifiers (see Introduction), the neural network (NN), the support vector machine (SVM) and SAM are, probably, the most popular ones. One reason for using NN and SVM classifiers is their ability to better cope with non-Gaussianity of features. Dimensionality reduction (there are numerous methods for it) is usually carried out without loss in classification accuracy but makes the classification task simpler.

Classifier performance depends upon many factors such as the number of classes, their separability in feature space, classifier type and parameters, the training methodology used, the training sample size, etc. If training is done in a supervised manner (which is more popular in classification applications), the training data set should contain at least hundreds of feature

Classification of Pre-Filtered Multichannel Remote Sensing Images 89


vectors, and classification is then carried out for the other pixels (in fact, for the voxels or feature vectors obtained for them). Validation is usually performed for thousands of voxels. Pixel-by-pixel classification is usually performed, being quite complex even in this case, although some advanced techniques also exploit texture features (Rellier et al., 2004). There is also an opportunity to post-process preliminary classification data in order to partly remove misclassifications (Yli-Harja & Shmulevich, 1999).

The situation in classification of multichannel radar imagery is different due to the considerably smaller number of channels (Ferro-Famil & Pottier, 2001, Alberga et al., 2008). There is no problem with dimensionality reduction. Instead, the problem is establishing and exploiting sets of the most informative and noise-immune features derived from the obtained images. One reason is that there are many different representations of polarimetric information, in which features may not be independent, being retrieved from the same original data. Another reason is the intensive speckle inherent to radar imagery; SARs able to provide appropriate resolution are mostly used nowadays.

To narrow the area of our study sufficiently, we have restricted ourselves to considering the three-channel Landsat image (Fig. 9a) composed of visible-band images that relate to the central wavelengths 0.66 μm, 0.56 μm, and 0.49 μm, associated with the R, G, and B components of the obtained "color" image, respectively. Only the AWGN case has been analyzed, where noise with predetermined variance was artificially added to each component independently. Radial basis function (RBF) NN and SVM classifiers have been applied. According to the recommendations given above, training has been done for several fragments of each class, shown by the corresponding colors in Fig. 10b. The numbers of training samples were 1617, 1369, 375, 191 and 722 for the classes "Soil", "Grass", "Water", "Urban" (Roads and Buildings), and "Bushes", respectively. Classification has been applied to all image pixels, although validation has been performed only for pixels that belong to the areas marked by five colors in Fig. 10a.

Fig. 10. Ground truth map (a) and fragments used for classifier training (b)


Image classes (color-coded): grass, water, roads and buildings, bushes, soil


Pixel-by-pixel classification has been used without exploiting any textural features, since these features can be influenced by noise and filtering. The training dataset has been formed from noise-free samples of the original test image represented in Fig. 9,a, to alleviate the impairments degrading the training results and to simplify the analysis of image classification accuracy in the presence of noise and the distortions introduced by denoising. Thus, in fact, for every image pixel $q$ the feature vector has been formed as $\mathbf{x}_q = \left(x_q^R, x_q^G, x_q^B\right)$, i.e. composed of the brightness values of the Landsat image components associated with R, G, and B.

Details concerning training of the considered classifiers can be found in (Fevralev et al., 2010). Here we would like to mention only the following. We have used the RBF NN with one hidden layer of nonlinear elements with a Gaussian activation function (Bose & Liang, 1996) and an output layer with linear elements. The number of elements in the output layer equals the number of classes (five), where every element is associated with a particular class of the sensed terrain. The classifier presumes making a hard decision, which is performed by selecting the element of the output layer having the maximum output value. The RBF NN unknown parameters have been obtained by the cascade-correlation algorithm that starts with one hidden unit and iteratively adds new hidden units to reduce (minimize) the total residual error. The error function has exploited weights to provide equal contributions from every image class for different numbers of class learning samples.

The considered SVM classifier employs nonlinear kernel functions in order to transform a feature vector into a new feature vector in a higher-dimensional space where linear classification is performed (Schölkopf et al., 1999). The SVM training has been based on quadratic programming, which guarantees reaching a global minimum of the classifier error function (Cristianini & Shawe-Taylor, 2000). For the considered classification task, we have applied a radial basis kernel function of the same form as the activation function of the RBF NN hidden-layer units. To solve the multi-class problem using the SVM classifier, we have applied the one-against-one classification strategy. It divides the multi-class problem into *S*(*S*-1)/2 separate binary classification tasks for all possible pair combinations of *S* classes. A majority voting rule has then been applied at the final stage to find the resulting class.
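The one-against-one scheme just described can be sketched with scikit-learn, whose `SVC` implements exactly this strategy (S(S-1)/2 pairwise binary SVMs plus voting) for the RBF kernel. The data below are synthetic three-feature "colour" vectors standing in for the Landsat features and five terrain classes; they are not the chapter's actual training set.

```python
import numpy as np
from sklearn.svm import SVC

# Synthetic stand-in: five classes, each a Gaussian cluster around a
# random "colour" (R, G, B intensity) centre in [0, 255].
rng = np.random.default_rng(0)
centres = rng.uniform(0, 255, size=(5, 3))
X = np.vstack([c + rng.normal(0, 10, size=(200, 3)) for c in centres])
y = np.repeat(np.arange(5), 200)

# RBF kernel of the same form as the RBF NN activation; 'ovo' is the
# one-against-one decomposition described in the text.
clf = SVC(kernel='rbf', decision_function_shape='ovo').fit(X, y)
pcc = clf.score(X, y)  # overall probability of correct classification
print(round(pcc, 3))
```

On real data the score would of course be computed on held-out validation pixels, as done in the chapter, not on the training samples.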

The overall probability of correct classification reached for the noise-free image is 0.906 for the RBF NN and 0.915 for the SVM classifier, respectively. The reason for the observed misclassifications is that the considered classes are not fully separable, as we exploited only three simple features (intensities in the channel images). The largest misclassification probabilities have been observed for the class pairs "Soil"/"Urban" and "Soil"/"Bushes". This is not surprising, since these classes are quite heterogeneous and have similar "colors" in the composed three-channel image (see Fig. 9,a).
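The overall and per-class probabilities reported here and in Tables 1 and 2 reduce to simple counts over the validation pixels; a sketch (the function names are ours):

```python
import numpy as np

def overall_pcc(y_true, y_pred):
    """Overall probability of correct classification (Pcc):
    fraction of validation pixels whose predicted class is correct."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    return float(np.mean(y_true == y_pred))

def per_class_pcc(y_true, y_pred, n_classes):
    """Per-class probabilities Pcorr1..PcorrS: for each class, the
    fraction of its true pixels that were classified into it."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    return [float(np.mean(y_pred[y_true == c] == c)) for c in range(n_classes)]
```

Note that Pcc weights classes by their pixel counts, which is why the "homogeneous" classes occupying large areas dominate the overall figures discussed below.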

## **4. Filtering and classification results and examples**

Concerning Landsat data classification, let us start with considering the overall probabilities of correct classification *Pcc*. The obtained results are presented in Table 1 for three values of AWGN variance, namely, 100, 49, and 16 (note that only two values, 100 and 49, have been analyzed in the earlier paper (Fevralev et al., 2010)). The case of noise variance equal to 16 is added to study the situation when input PSNR = 39 dB, i.e. noise intensity is such that noise


is practically not seen in the original image (Fevralev et al., 2011). Alongside hard thresholding (HT), we have analyzed a combined thresholding (CT)

$$D_{ct}(n,m,k,l) = \begin{cases} D(n,m,k,l), & \text{if } \left|D(n,m,k,l)\right| \ge \beta\,\sigma(n,m,k,l) \\ D^{3}(n,m,k,l)\,/\,\beta^{2}\sigma^{2}(n,m,k,l), & \text{otherwise} \end{cases} \tag{6}$$

where $\sigma^2(n,m,k,l) = f\!\left(\bar{I}(n,m)\right) W_{norm}(k,l)$ (with $f(\cdot)$ the dependence of local noise variance on the local mean and $W_{norm}(k,l)$ the normalized frequency weights). Note that for CT $\beta_{opt}^{PSNR} \approx 3.9$, and the aforementioned property $\beta_{opt}^{PSNR\text{-}HVS\text{-}M} \approx 0.85\,\beta_{opt}^{PSNR}$ is also valid.
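Equation (6) is straightforward to implement. A small sketch contrasting CT with plain hard thresholding (names are illustrative; `thr` stands for the product β·σ(n,m,k,l)):

```python
import numpy as np

def combined_threshold(D, thr):
    """Combined thresholding (CT) of DCT coefficients, Eq. (6):
    coefficients at or above the threshold are kept unchanged, while
    smaller ones are shrunk as D^3 / thr^2 instead of being zeroed."""
    D = np.asarray(D, dtype=float)
    return np.where(np.abs(D) >= thr, D, D**3 / thr**2)

def hard_threshold(D, thr):
    """Hard thresholding (HT): sub-threshold coefficients are zeroed."""
    D = np.asarray(D, dtype=float)
    return np.where(np.abs(D) >= thr, D, 0.0)
```

The cubic term preserves the coefficient sign and attenuates small coefficients smoothly, which is why CT tolerates the larger optimal β reported above (≈3.9 vs ≈2.6 for HT).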

As follows from analysis of the data in Table 1, any considered method of pre-filtering noisy images has a positive effect on classification irrespective of the classifier used. As could be expected, the largest positive effect, associated with a considerable increase of *Pcc*, is observed if noise is intensive (see the data for σ² = 100 compared to "Noisy"). If the noise variance is small (σ² = 16), there is still improvement of image quality after filtering. Output PSNR becomes 42.4 dB after component-wise denoising and 43.0 dB after 3D DCT-based filtering. This improvement in terms of PSNR leads to an increase of *Pcc*, although it is not large. The probability of correct classification has increased considerably for classes 1 (Soil), 2 (Grass), and 5 (Bushes).

Note that for the filtered image *Pcc* is practically the same as for classification of noise-free data. This shows that if *PSNR* for the classified image is over 42…43 dB, the (residual) noise practically does not affect classification.
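The PSNR values quoted throughout (input PSNR, output PSNR, the 42…43 dB level above) follow the standard definition for 8-bit data; a one-function sketch:

```python
import numpy as np

def psnr(ref, test, peak=255.0):
    """Peak signal-to-noise ratio in dB for 8-bit imagery:
    PSNR = 10 * log10(peak^2 / MSE)."""
    ref = np.asarray(ref, dtype=float)
    test = np.asarray(test, dtype=float)
    mse = np.mean((ref - test) ** 2)
    return float('inf') if mse == 0 else 10.0 * np.log10(peak**2 / mse)
```

For AWGN with variance σ², the expected input MSE equals σ², so input PSNR is simply 10·log10(255²/σ²).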

Both considered algorithms of thresholding produce approximately the same results for the same noise variance, classifier and component-wise filtering (compare, e.g., the cases

| Image | σ² | β | Pcc for SVM | Pcc for RBF NN |
|---|---|---|---|---|
| Noisy | 16 | - | 0.890 | 0.887 |
| Filtered (component-wise, HT) | 16 | 2.5 | 0.909 | 0.905 |
| Filtered (3D, HT) | 16 | 2.5 | 0.919 | 0.906 |
| Noisy | 49 | - | 0.813 | 0.838 |
| Filtered (component-wise, HT) | 49 | 2.5 | 0.889 | 0.903 |
| Filtered (component-wise, HT) | 49 | 2.1 | 0.880 | 0.898 |
| Filtered (component-wise, CT) | 49 | 3.9 | 0.888 | 0.903 |
| Filtered (component-wise, CT) | 49 | 3.3 | 0.879 | 0.896 |
| Filtered (3D, HT) | 49 | 2.6 | 0.917 | 0.911 |
| Noisy | 100 | - | 0.729 | 0.766 |
| Filtered (component-wise, HT) | 100 | 2.5 | 0.881 | 0.902 |
| Filtered (component-wise, HT) | 100 | 2.1 | 0.867 | 0.892 |
| Filtered (component-wise, CT) | 100 | 3.9 | 0.879 | 0.902 |
| Filtered (component-wise, CT) | 100 | 3.3 | 0.865 | 0.890 |
| Filtered (3D, HT) | 100 | 2.6 | 0.918 | 0.914 |

Table 1. Classification results for original and filtered images


β = 2.5 for HT and β = 3.9 for CT, σ² = 100 and 49). Because of this, we have analyzed only hard thresholding for σ² = 16.

The use of smaller β (2.1 for HT and 3.3 for CT, the values corresponding to $\beta_{opt}^{PSNR\text{-}HVS\text{-}M}$) results in a slight reduction of *Pcc* compared to setting $\beta = \beta_{opt}^{PSNR}$. In our opinion, this can be explained by the better noise suppression provided by the DCT-based filtering with larger β, which is expedient for at least two classes met in the studied Landsat image (namely, the "homogeneous" classes "Water" and "Grass" that occupy about half of the pixels in the validation set, see Fig. 10b). Data analysis also allows concluding that the more efficient filtering provided by 3D processing compared to component-wise processing leads to a substantial increase in *Pcc*, especially for the intensive noise case and the SVM classifier. This shows that if filtering is more efficient in terms of conventional metrics, then, most probably, it is also more expedient in terms of classification. All these conclusions are consistent for both classifiers; although the results are slightly better for the RBF NN if noise is intensive, *Pcc* values are almost the same for non-intensive noise.

We have also analyzed the influence of filtering efficiency on classification accuracy for particular classes. Only hard thresholding has been considered (the results for combined thresholding are given in (Fevralev et al., 2010) and they are quite close to the data for hard thresholding). Three filtering approaches have been used: component-wise denoising with $\beta = 2.1$ ($\beta_{opt}^{PSNR\text{-}HVS\text{-}M}$, denoted as Filtered 2.1), component-wise filtering with $\beta = 2.5$ ($\beta_{opt}^{PSNR}$, denoted as Filtered 2.5), and 3D (vector) processing (Filtered 3D).

For the first class, "Soil", a clear tendency is observed: the more efficient the filtering, the larger the probability of correct classification Pcorr1. The same holds for the "homogeneous" classes "Grass" (see Pcorr2) and "Water" (see Pcorr3); the attained probabilities for these classes are high and approach unity for filtered images. The dependences for the class "Bushes" (see Pcorr5) are similar to those for the class "Soil": Pcorr5 increases if more efficient filtering is applied, but not essentially. Quite many misclassifications remain due to the "heterogeneity" of the classes "Soil" and "Bushes" (see the discussion above).

Finally, specific results are observed for the class "Urban" (see data for Pcorr4). The pixels that belong to this class are not classified well in noisy images, especially by the SVM classifier. Filtering, especially 3D processing that possesses the best edge/detail preservation, slightly improves the values of Pcorr4. There is practically no difference in data for the cases Filtered 2.1 and Filtered 2.5.

Thus, we can conclude that a filter's ability to preserve edges and details is of prime importance for such "heterogeneous" classes. It can also be expected that the use of texture features for such classes can improve the probability of their correct classification. Note that, for the other classes, image pre-filtering also indirectly incorporates spatial information into classification by taking into account neighbouring pixel values at the denoising stage to "correct" a given pixel value.

Let us now present examples of classification. Figs. 11a and 11b illustrate classification results for the noisy images (σ² = 100) for both classifiers. Quite a few pixel-wise misclassifications occur due to the influence of noise, especially for the SVM classifier; even the water surface is classified with errors. In turn, Figs. 11c and 11d present classification results for the three-channel image processed by the 3D DCT-based filter. It is clearly seen that many misclassifications have been corrected and that the objects of particular classes have become more compact. Comparison of the classification results in Figs. 11c and 11d with those in Figs. 11a and 11b clearly demonstrates the expedience of pre-filtering RS images before classification if the noise is intensive.

Fig. 11. Classification maps for the noisy image classified by RBF NN (a) and SVM (b), and for the image pre-processed by the 3D DCT filter classified by RBF NN (c) and SVM (d)

| Image | σ² | Classifier | Pcorr1 | Pcorr2 | Pcorr3 | Pcorr4 | Pcorr5 |
|---|---|---|---|---|---|---|---|
| Noisy | 49 | RBF NN | 0.717 | 0.909 | 0.987 | 0.718 | 0.805 |
| Noisy | 49 | SVM | 0.612 | 0.939 | 0.930 | 0.650 | 0.785 |
| Filtered 2.1 | 49 | RBF NN | 0.814 | 0.991 | 0.987 | 0.715 | 0.830 |
| Filtered 2.1 | 49 | SVM | 0.770 | 0.996 | 0.971 | 0.655 | 0.812 |
| Filtered 2.5 | 49 | RBF NN | 0.827 | 0.994 | 0.987 | 0.714 | 0.833 |
| Filtered 2.5 | 49 | SVM | 0.803 | 0.998 | 0.974 | 0.657 | 0.818 |
| Filtered 3D | 49 | RBF NN | 0.839 | 0.997 | 0.987 | 0.720 | 0.860 |
| Filtered 3D | 49 | SVM | 0.882 | 0.998 | 0.986 | 0.682 | 0.862 |
| Noisy | 100 | RBF NN | 0.649 | 0.790 | 0.984 | 0.718 | 0.776 |
| Noisy | 100 | SVM | 0.530 | 0.826 | 0.834 | 0.634 | 0.745 |
| Filtered 2.1 | 100 | RBF NN | 0.811 | 0.983 | 0.986 | 0.718 | 0.819 |
| Filtered 2.1 | 100 | SVM | 0.728 | 0.994 | 0.966 | 0.653 | 0.797 |
| Filtered 2.5 | 100 | RBF NN | 0.834 | 0.991 | 0.985 | 0.717 | 0.830 |
| Filtered 2.5 | 100 | SVM | 0.776 | 0.998 | 0.969 | 0.658 | 0.805 |
| Filtered 3D | 100 | RBF NN | 0.853 | 0.996 | 0.984 | 0.719 | 0.862 |
| Filtered 3D | 100 | SVM | 0.888 | 0.998 | 0.985 | 0.687 | 0.858 |

Table 2. Classification results for particular classes of original and filtered images

Let us give one more example, for multichannel radar imaging. Fig. 12a shows a three-channel radar image (in monochrome representation) composed of HH Ka-band, VV Ka-band, and HH X-band SLAR images. The result of its component-wise processing by the modified sigma filter is presented in Fig. 12b. Noise is suppressed, but the edges are smeared due to residual errors of image co-registration and the low contrast of the edges.

Considerably better edge/detail preservation is provided by the vector filter (Kurekin et al., 1997), which, in fact, sharpens edges if their misalignment in the component images is detected (see Fig. 12c). Finally, the result of bare soil area detection (detected pixels are shown in white) by a trained RBF NN applied to the filtered data is depicted in Fig. 12d. Since a topology map was available for this region, the probability of correct detection could be calculated; it exceeded 0.93. Classification results obtained from the original co-registered images were considerably less accurate.
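The benefit of joint (vector) processing over component-wise processing can be illustrated with the classic vector median filter (Astola et al., 1990). This is an illustrative stand-in, not the modified vector sigma filter discussed above; the window radius and test patch are assumptions.

```python
import numpy as np

def vector_median_filter(img, radius=1):
    """Replace each pixel vector by the vector in its (2r+1)x(2r+1)
    neighborhood that minimizes the sum of L2 distances to all
    neighbors. The channels are treated jointly, unlike a
    component-wise (per-channel) median filter."""
    h, w, c = img.shape
    out = img.copy()
    for i in range(radius, h - radius):
        for j in range(radius, w - radius):
            win = img[i - radius:i + radius + 1,
                      j - radius:j + radius + 1].reshape(-1, c)
            # pairwise distances between all vectors in the window
            d = np.linalg.norm(win[:, None, :] - win[None, :, :], axis=2).sum(axis=1)
            out[i, j] = win[np.argmin(d)]
    return out

# flat 3-channel patch with one outlier pixel: the outlier is rejected
img = np.ones((5, 5, 3))
img[2, 2] = [9.0, 9.0, 9.0]
print(vector_median_filter(img)[2, 2])
```

Because the output at each site is one of the observed vectors, inter-channel correlation is never broken, which is the key property that component-wise filtering of multichannel radar data lacks.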


Fig. 12. Original three-channel radar image in monochrome representation (a), output of component-wise processing (b), output of vector filtering (c), classification map (d)

## **5. Conclusions**

It has been demonstrated that in most modern applications of multichannel RS, the noise characteristics deviate from the conventional assumption of additive i.i.d. noise. Thus, filtering techniques have to be adapted to more sophisticated real-life models. This especially relates to multichannel radar imaging, for which considerably higher denoising efficiency can be gained by taking into account the spatial correlation of the noise and the substantial correlation of information in the component images. New approaches that exploit the aforementioned properties have been proposed and tested on real-life data. It has also been shown that filtering is expedient for RS images contaminated by considerably less intensive noise than in radar imaging. Even if the noise is practically not noticeable by visual inspection of the original images, its removal by efficient filters can increase data classification accuracy.

## **6. References**

Abramov, S., Zabrodina, V., Lukin, V., Vozel, B., Chehdi, K., & Astola, J. (2011). Methods for Blind Estimation of the Variance of Mixed Noise and Their Performance Analysis, In: *Numerical Analysis – Theory and Applications*, Jan Awrejcewicz (Ed.), InTech, ISBN 978-953-307-389-7, Retrieved from <http://www.intechopen.com/articles/show/title/methods-for-blind-estimation-of-the-variance-of-mixed-noise-and-their-performance-analysis>

Aiazzi, B., Alparone, L., Barducci, A., Baronti, S., Marcoinni, P., Pippi, I., & Selva, M. (2006). Noise modelling and estimation of hyperspectral data from airborne imaging spectrometers. *Annals of Geophysics*, Vol. 49, No. 1, February 2006

Ainsworth, T., Lee, J.-S., & Chang, L.W. (2007). Classification Comparisons between Dual-Pol and Quad-Pol SAR Imagery, *Proceedings of IGARSS*, pp. 164-167

Alberga, V., Satalino, G., & Staykova, D. (2008). Comparison of Polarimetric SAR Observables in Terms of Classification Performance. *International Journal of Remote Sensing*, Vol. 29, Issue 14, (July 2008), pp. 4129-4150

Amato, U., Cavalli, R.M., Palombo, A., Pignatti, S., & Santini, F. (2009). Experimental approach to the selection of the components in the minimum noise fraction, *IEEE Transactions on Geoscience and Remote Sensing*, Vol. 47, No 1, pp. 153-160

Astola, J., Haavisto, P., & Neuvo, Y. (1990). Vector Median Filters, *Proceedings of the IEEE*, Vol. 78, pp. 678-689

Barducci, A., Guzzi, D., Marcoionni, P., & Pippi, I. (2005). CHRIS-Proba performance evaluation: signal-to-noise ratio, instrument efficiency and data quality from acquisitions over San Rossore (Italy) test site, *Proceedings of the 3rd ESA CHRIS/Proba Workshop*, Italy, March 2005

Benedetto, J.J., Czaja, W., Ehler, M., Flake, C., & Hirn, M. (2010). Wavelet packets for multi- and hyperspectral imagery, *Proceedings of SPIE Conference on Wavelet Applications in Industrial Processing XIII*, SPIE Vol. 7535

Berge, A. & Solberg, A. (2004). A Comparison of Methods for Improving Classification of Hyperspectral Data, *Proceedings of IGARSS*, Vol. 2, pp. 945-948

Bose, N.K. & Liang, P. (1996). *Neural network fundamentals with graphs, algorithms and applications*, McGraw Hill

Chatterjee, P. & Milanfar, P. (2010). Is Denoising Dead? *IEEE Transactions on Image Processing*, Vol. 19, No 4, (April 2010), pp. 895-911

Chatterjee, P. & Milanfar, P. (2011). Practical Bounds on Image Denoising: From Estimation to Information. *IEEE Transactions on Image Processing*, Vol. 20, No 5, (2011), pp. 1221-1233

Chein-I Chang (Ed.) (2007). *Hyperspectral Data Exploitation: Theory and Applications*, Wiley-Interscience

Chen, G. & Qian, S. (2011). Denoising of Hyperspectral Imagery Using Principal Component Analysis and Wavelet Shrinkage. *IEEE Transactions on Geoscience and Remote Sensing*, Vol. 49, pp. 973-980

Christophe, E., Leger, D., & Mailhes, C. (2005). Quality criteria benchmark for hyperspectral imagery. *IEEE Transactions on Geoscience and Remote Sensing*, No. 43(9), pp. 2103-2114

Cristianini, N. & Shawe-Taylor, J. (2000). *An Introduction to Support Vector Machines and Other Kernel-based Learning Methods*, Cambridge University Press



Curran, P.J. & Dungan, J.L. (1989). Estimation of signal-to-noise: a new procedure applied to AVIRIS data. *IEEE Transactions on Geoscience and Remote Sensing*, Vol. 27, pp. 620-628

Dabov, K., Foi, A., & Egiazarian, K. (2007). Video Denoising by Sparse 3D-Transform Domain Collaborative Filtering, *Proceedings of EUSIPCO*, 2007

De Backer, S., Pizurica, A., Huysmans, B., Philips, W., & Scheunders, P. (2008). Denoising of multicomponent images using wavelet least squares estimators. *Image and Vision Computing*, Vol. 26, No 7, pp. 1038-1051

Deledalle, C.-A., Tupin, F., & Denis, L. (2011). Patch Similarity under Non Gaussian Noise, *Proceedings of ICIP*, 2011

Demir, B., Erturk, S., & Gullu, K. (2011). Hyperspectral Image Classification Using Denoising of Intrinsic Mode Functions. *IEEE Geoscience and Remote Sensing Letters*, Vol. 8, No 2, pp. 220-224

Elad, M. (2010). *Sparse and Redundant Representations. From Theory to Applications in Signal and Image Processing*, Springer Science+Business Media, LLC

Ferro-Famil, L. & Pottier, E. (2001). Multi-frequency polarimetric SAR data classification, *Annals of Telecommunications*, Vol. 56, No 9-10, pp. 510-522

Fevralev, D., Lukin, V., Ponomarenko, N., Vozel, B., Chehdi, K., Kurekin, A., & Shark, L. (2010). Classification of filtered multichannel images, *Proceedings of SPIE/EUROPTO on Satellite Remote Sensing*, Toulouse, France, September 2010

Fevralev, D., Ponomarenko, N., Lukin, V., Abramov, S., Egiazarian, K., & Astola, J. (2011). Efficiency analysis of color image filtering. *EURASIP Journal on Advances in Signal Processing*, 2011:41

Foi, A., Dabov, K., Katkovnik, V., & Egiazarian, K. (2007). Image denoising by sparse 3-D transform-domain collaborative filtering. *IEEE Transactions on Image Processing*, Vol. 16, No 8, (2007), pp. 2080-2095

Gungor, O. & Shan, J. (2006). An optimal fusion approach for optical and SAR images, *Proceedings of ISPRS Commission VII Mid-term Symposium "Remote Sensing: from Pixels to Processes"*, Netherlands, May 2006, pp. 111-116

Jeon, B. & Landgrebe, D.A. (1999). Partially supervised classification using weighted unsupervised clustering. *IEEE Transactions on Geoscience and Remote Sensing*, Vol. 37, No 2, pp. 1073-1079

Kerekes, J.P. & Baum, J.E. (2003). Hyperspectral Imaging System Modeling. *Lincoln Laboratory Journal*, Vol. 14, No 1, pp. 117-130

Kervrann, C. & Boulanger, J. (2008). Local adaptivity to variable smoothness for exemplar-based image regularization and representation. *International Journal of Computer Vision*, Vol. 79, No 1, (2008), pp. 45-69

Kulemin, G.P., Zelensky, A.A., & Astola, J.T. (2004). Methods and Algorithms for Pre-processing and Classification of Multichannel Radar Remote Sensing Images, *TICSP Series*, No. 28, ISBN 952-15-1293-8, Finland, TTY Monistamo

Kurekin, A.A., Lukin, V.V., Zelensky, A.A., Ponomarenko, N.N., Astola, J.T., & Saarinen, K.P. (1997). Adaptive Nonlinear Vector Filtering of Multichannel Radar Images, *Proceedings of SPIE Conference on Multispectral Imaging for Terrestrial Applications II*, San Diego, CA, USA, SPIE Vol. 3119, pp. 25-36

Kurekin, A.A., Lukin, V.V., Zelensky, A.A., Koivisto, P.T., Astola, J.T., & Saarinen, K.P. (1999). Comparison of component and vector filter performance with application to multichannel and color image processing, *Proceedings of the IEEE-EURASIP Workshop on Nonlinear Signal and Image Processing*, Antalya, Turkey, June 1999, No. 1, pp. 38-42

Landgrebe, D. (2002). Hyperspectral image data analysis as a high dimensional signal problem. *IEEE Signal Processing Magazine*, No. 19, pp. 17-28

Lee, J.-S. (1983). Digital Image Smoothing and the Sigma Filter. *Comp. Vis. Graph. Image Process.*, No. 24, (1983), pp. 255-269

Lukin, V., Tsymbal, O., Vozel, B., & Chehdi, K. (2006). Processing multichannel radar images by modified vector sigma filter for edge detection enhancement, *Proceedings of ICASSP*, Vol. II, pp. 833-836

Lukin, V., Fevralev, D., Ponomarenko, N., Abramov, S., Pogrebnyak, O., Egiazarian, K., & Astola, J. (2010a). Discrete cosine transform-based local adaptive filtering of images corrupted by nonstationary noise. *Electronic Imaging Journal*, Vol. 19(2), No. 1, (April-June 2010)

Lukin, V., Ponomarenko, N., Zriakhov, M., Kaarna, A., & Astola, J. (2010b). An Automatic Approach to Lossy Compression of AVIRIS Hyperspectral Data. *Telecommunications and Radio Engineering*, Vol. 69(6), (2010), pp. 537-563

Lukin, V., Abramov, S., Ponomarenko, N., Egiazarian, K., & Astola, J. (2011a). Image Filtering: Potential Efficiency and Current Problems, *Proceedings of ICASSP*, 2011, pp. 1433-1436

Lukin, V., Abramov, S., Ponomarenko, N., Uss, M., Zriakhov, M., Vozel, B., Chehdi, K., & Astola, J. (2011b). Methods and automatic procedures for processing images based on blind evaluation of noise type and characteristics. *SPIE Journal on Advances in Remote Sensing*, 2011, DOI: 10.1117/1.3539768

Makitalo, M., Foi, A., Fevralev, D., & Lukin, V. (2010). Denoising of single-look SAR images based on variance stabilization and non-local filters, *CD-ROM Proceedings of MMET*, Kiev, Ukraine, September 2010

Melgani, F. & Bruzzone, L. (2004). Classification of Hyperspectral Remote Sensing Images with Support Vector Machines, *IEEE Transactions on Geoscience and Remote Sensing*, Vol. 42, No 8, pp. 1778-1790

Murtagh, F., Starck, J.L., & Bijaoui, A. (1995). Image restoration with noise suppression using a multiresolution support, *Astron. Astrophys. Suppl. Ser.*, Vol. 112, pp. 179-189

Oliver, C. & Quegan, S. (2004). *Understanding Synthetic Aperture Radar Images*, SciTech Publishing

Phillips, R.D., Blinn, C.E., Watson, L.T., & Wynne, R.H. (2009). An Adaptive Noise-Filtering Algorithm for AVIRIS Data With Implications for Classification Accuracy. *IEEE Transactions on Geoscience and Remote Sensing*, Vol. 47, No 9, (2009), pp. 3168-3179

Pizurica, A. & Philips, W. (2006). Estimating the probability of the presence of a signal of interest in multiresolution single- and multiband image denoising. *IEEE Transactions on Image Processing*, Vol. 15, No 3, pp. 654-665

Plataniotis, K.N. & Venetsanopoulos, A.N. (2000). *Color Image Processing and Applications*, Springer-Verlag, NY

Plaza, J., Plaza, A., Perez, R., & Martinez, P. (2008). Parallel classification of hyperspectral images using neural networks, *Comput. Intel. for Remote Sensing*, *Springer SCI*, Vol. 133, pp. 193-216

Ponomarenko, N., Lukin, V., Zriakhov, M., & Kaarna, A. (2006). Preliminary automatic analysis of characteristics of hyperspectral AVIRIS images, *Proceedings of MMET*, Kharkov, Ukraine, pp. 158-160

Ponomarenko, N., Silvestri, F., Egiazarian, K., Carli, M., Astola, J., & Lukin, V. (2007). On between-coefficient contrast masking of DCT basis functions, *CD-ROM Proceedings of the Third International Workshop on Video Processing and Quality Metrics*, USA, 2007

Ponomarenko, N., Lukin, V., Egiazarian, K., & Astola, J. (2008a). Adaptive DCT-based filtering of images corrupted by spatially correlated noise, *Proceedings of SPIE Conference Image Processing: Algorithms and Systems VI*, 2008, Vol. 6812

Ponomarenko, N., Lukin, V., Zelensky, A., Koivisto, P., & Egiazarian, K. (2008b). 3D DCT Based Filtering of Color and Multichannel Images, *Telecommunications and Radio Engineering*, No. 67, (2008), pp. 1369-1392

Ponomarenko, N., Lukin, V., Egiazarian, K., & Astola, J. (2010). A method for blind estimation of spatially correlated noise characteristics, *Proceedings of SPIE Conference Image Processing: Algorithms and Systems VII*, San Jose, USA, 2010, Vol. 7532

Ponomarenko, N., Lukin, V., & Egiazarian, K. (2011). HVS-Metric-Based Performance Analysis of Image Denoising Algorithms, *Proceedings of EUVIP*, Paris, France, 2011

Popov, M.A., Stankevich, S.A., Lischenko, L.P., Lukin, V.V., & Ponomarenko, N.N. (2011). Processing of Hyperspectral Imagery for Contamination Detection in Urban Areas. *NATO Science for Peace and Security Series C: Environmental Security*, pp. 147-156

Rellier, G., Descombes, X., Falzon, F., & Zerubia, J. (2004). Texture Feature Analysis using a Gauss-Markov Model in Hyperspectral Image Classification, *IEEE Transactions on Geoscience and Remote Sensing*, Vol. 42, No 7, pp. 1543-1551

Renard, N., Bourennane, S., & Blanc-Talon, J. (2006). Multiway Filtering Applied on Hyperspectral Images, *Proceedings of ACIVS*, *Springer LNCS*, Vol. 4179, pp. 127-137

Renard, N., Bourennane, S., & Blanc-Talon, J. (2008). Denoising and Dimensionality Reduction Using Multilinear Tools for Hyperspectral Images. *IEEE Geoscience and Remote Sensing Letters*, Vol. 5, No 2, pp. 138-142

Schölkopf, B., Burges, J.C., & Smola, A.J. (1999). *Advances in Kernel Methods: Support Vector Learning*, MIT Press, Cambridge, MA

Schowengerdt, R.A. (2007). *Remote Sensing: Models and Methods for Image Processing*, Academic Press

Tsymbal, O.V., Lukin, V.V., Ponomarenko, N.N., Zelensky, A.A., Egiazarian, K.O., & Astola, J.T. (2005). Three-state Locally Adaptive Texture Preserving Filter for Radar and Optical Image Processing. *EURASIP Journal on Applied Signal Processing*, No. 8, (May 2005), pp. 1185-1204

Uss, M., Vozel, B., Lukin, V., & Chehdi, K. (2011a). Local Signal-Dependent Noise Variance Estimation from Hyperspectral Textural Images. *IEEE Journal of Selected Topics in Signal Processing*, Vol. 5, No. 2, DOI: 10.1109/JSTSP.2010.2104312

Uss, M., Vozel, B., Lukin, V., & Chehdi, K. (2011b). Potential MSE of color image local filtering in component-wise and vector cases, *Proceedings of CADSM*, Ukraine, February 2011, pp. 91-101

Yli-Harja, O. & Shmulevich, I. (1999). Correcting Misclassifications in Hyperspectral Image Data Using a Nonlinear Graph-based Estimation Technique, *International Symposium on Nonlinear Theory and its Applications*, pp. 259-262

Zabala, A., Pons, X., Diaz-Delgado, R., Garcia, F., Auli-Llinas, F., & Serra-Sagrista, J. (2006). Effects of JPEG and JPEG2000 lossy compression on remote sensing image classification for mapping crops and forest areas, *Proceedings of IGARSS*, pp. 790-793

Zelensky, A., Kulemin, G., Kurekin, A., & Lukin, V. (2002). Modified Vector Sigma Filter for the Processing of Multichannel Radar Images and Increasing Reliability of Its Interpretation. *Telecommunication and Radioengineering*, Vol. 58, No. 1-2, pp. 100-113

Zhang, H., Peng, H., Fairchild, M.D., & Montag, E.D. (2008). Hyperspectral Image Visualization based on Human Visual Model, *Proceedings of SPIE Conference on Human Vision and Electronic Imaging XIII*, SPIE Vol. 6806

## **Estimation of the Separable MGMRF Parameters for Thematic Classification**

Rolando D. Navarro, Jr., Joselito C. Magadia and Enrico C. Paringit
*University of the Philippines, Diliman, Quezon City, Philippines*

## **1. Introduction**

Because of its ability to describe interdependence between neighboring sites, the Markov Random Field (MRF) is a very attractive model for characterizing correlated observations (Moura and Balram, 1993), and it has potential applications in areas of remote sensing such as spatio-temporal modeling and machine vision. In this study, we model the image random field conditional on the texture label as a Multivariate Gauss Markov Random Field (MGMRF), whereas the thematic map is modeled as a discrete-label MRF (Li, 1995). The observations in the Gauss Markov Random Field (GMRF) follow a Gaussian distribution.

There are several MGMRF models in which the interaction matrices take some simplified form, including the MGMRF with an isotropic interaction matrix, which we shall refer to here as Hazel's GMRF (Hazel, 2000); the MGMRF with an anisotropic interaction matrix proportional to the identity matrix, which we shall refer to here as Rellier's GMRF (Rellier et al., 2004); and the Gaussian Symmetric Clustering (GSC) (Hazel, 2000).

Building on these developments, the anisotropic GMRF model was generalized and its parameter estimator for an arbitrary neighborhood system was characterized (Navarro et al., 2009). Using our model, the classification performance was analyzed and compared with that of the GMRF models in the literature.

Spectral classes are explored in segmenting image random field models in order to extract spatial, spectral, and temporal information. A special case is addressed in which the observation includes spectral and temporal information, known as a spectro-temporal observation. With respect to the spectral and temporal dimensions, a separability structure is considered based on the Kronecker tensor product of the GMRF model parameters. A separable model contains fewer parameters than its non-separable counterpart, and its spectral and temporal dimensions can be analyzed separately. We analyzed whether the separability of the GMRF parameters would improve the classification of the thematic map.
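The parameter saving from separability can be seen directly: a Kronecker-structured matrix over N_spec spectral and N_temp temporal dimensions is specified by N_spec² + N_temp² entries rather than (N_spec · N_temp)². A short sketch (the dimensions and factor matrices are illustrative):

```python
import numpy as np

n_spec, n_temp = 4, 6                        # illustrative dimensions
sigma_spec = np.eye(n_spec) + 0.1            # spectral factor (SPD)
sigma_temp = np.eye(n_temp) + 0.2            # temporal factor (SPD)

# full covariance under the separability assumption:
# each (i, j) block of sigma equals sigma_spec[i, j] * sigma_temp
sigma = np.kron(sigma_spec, sigma_temp)

full_params = (n_spec * n_temp) ** 2         # non-separable parameter count
separable_params = n_spec**2 + n_temp**2     # separable parameter count
print(sigma.shape, full_params, separable_params)
```

With these toy dimensions the separable structure needs 52 numbers instead of 576, which is why separable models are easier to estimate from limited training data.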

## **2. Image random field modelling and thematic classification**

This section covers the statistical background for characterizing random fields based on the MRF. Then, we present estimation of the thematic map and image random field parameters.

Estimation of the Separable MGMRF Parameters for Thematic Classification 101

**XX L Θ θΣ 1 r**

 

**s sr rs s**

 

*L L L* **s sr s**

*L L*

**Σ r 0**

1 1 , ; exp . <sup>2</sup> <sup>2</sup>

*N N*

**0**

 <sup>1</sup> cov , ; .

2 1 2

The maximum pseudo-likelihood estimation (MPLE) combines sites to form the pseudolikelihood function from the conditional probabilities (Li, 1995). The pseudo-likelihood functions for the thematic map random field and image random field parameters are given

> *PL p L* ; **s s**

**φ L φ** 

 1

invariance property, that is, if **Π**ˆ is the MPLE of the parameter **Π** , then for an arbitrary

invariance property of the MLE (Casella and Berger, 2002) since the form of the pseudolikelihood function is analogous that of the likelihood function, depending on the parameter given the data. Moreover, the MPLE converges to the MLE almost surely as the lattice size

The thematic map can be recovered by the maximum a posteriori probability (MAP) rule. It can be implemented using a numerical optimization technique such as Simulated Annealing (SA) (Jeng & Woods, 1991). Although the global convergence employing SA is guaranteed almost surely, its convergence is very slow (Aarts & Korts, 1987; Winkler, 2006). An

<sup>ˆ</sup> arg max , ; ; . <sup>1</sup> *L p Lm p L m*

*m M* **s ss Y L <sup>Θ</sup> <sup>L</sup> <sup>φ</sup> s s** (10)

 **s s s Θ L YY L Θ** 

**s**

*m m PL p* 

*M*

*<sup>N</sup> p L <sup>L</sup>* **s s s ss s YY L Θ X Σ X Σ**

<sup>1</sup> *E* **X Ls** ;**Θ 0***N* (4)

*<sup>p</sup>* **s sr** *<sup>L</sup>***<sup>s</sup> r 0 XY L Θ Σ <sup>1</sup>** (6)

*T*

, ;

1

*p*

otherwise

1

(5)

(7)

(8)

*m* thematic class. The MPLE possesses an

**Π**. The proof is similar to that of the

(9)

The noise process has the following characterization:

cov , ;

The conditional probability on the other hand is given as

where *m* is the collection of sites with the *th*

approaches infinity (Geman and Greffigne, 1987).

**Π** is the MPLE of the parameter

alternative to this is to use the ICM algorithm (Besag, 1986) given as

**2.4 Maximum pseudo-likelihood estimation** 

as follows:

function

 , <sup>ˆ</sup> 

**2.5 Thematic classification** 

Finally the thematic map classifier is presented based on the Iterated Conditional Modes (ICM) algorithm.

#### **2.1 Markov random fields**

A random field **Z Zs <sup>s</sup>** : where **s** is a site on the lattice with the neighborhood system with parameter **Π** is a MRF if for **s** (Winkler, 2003).

$$\rho\left(\mathbf{Z\_s}\,\middle|\,\mathbf{Z\_{\mathcal{S}\mathcal{S}}}\,\middle|\,\mathbf{H}\right) = \rho\left(\mathbf{Z\_s}\,\middle|\,\mathbf{Z\_{\mathcal{C}\mathcal{S}}}\,\middle|\,\mathbf{H}\right) \tag{1}$$

where : **Ζ Ζ t s <sup>s</sup> <sup>t</sup>** is the random field which consists of observations of the neighbors of **s**. Similarly, **Ζ Ζ <sup>s</sup> <sup>t</sup>** : **t s** is the random field, which consists of observations that exclude **s**.

#### **2.2 Thematic map modeling**

Let *L L* **<sup>s</sup> <sup>s</sup>** be denoted as the thematic map, where *L***<sup>s</sup>** 1, , *<sup>M</sup>* is the labeled thematic class at site **s** and *M* is the number of thematic classes. The thematic map is modeled as a discrete space, discrete domain MRF with parameters , <sup>1</sup> *a b <sup>m</sup> m M* **<sup>φ</sup> <sup>r</sup> <sup>r</sup>** where *am* is the singleton potential coefficient for the *th <sup>m</sup>* thematic class, *b***<sup>r</sup>** are made up by the pairwise potential coefficients, and is region of support (Jeng & Woods, 1991) or the neighborhood set (Kasyap & Chellappa, 1983). Its conditional probability density function (pdf) is given by

$$p\left(L\_{\mathbf{S}}\left|\mathbf{L}\_{\widehat{\mathcal{C}}\mathbf{S}},\boldsymbol{\Phi}\right)=\frac{\exp\left(\sum\_{m=1}^{M}a\_{m}\mathbf{1}\_{\left\{\mathbf{L}\_{\mathbf{S}}=m\right\}}+\sum\_{\mathbf{r}\in\mathcal{N}}b\_{\mathbf{r}}\cdot V\left(L\_{\mathbf{S}},L\_{\mathbf{S}-\mathbf{r}}\right)\right)}{\sum\_{l=1}^{M}\exp\left(a\_{l}+\sum\_{\mathbf{r}\in\mathcal{N}}b\_{\mathbf{r}}\cdot V\left(L\_{\mathbf{S}}=l,L\_{\mathbf{S}-\mathbf{r}}\right)\right)}\tag{2}$$

(Li, 1995), where

$$V\left(\mathbf{x}, \mathcal{Y}\right) = \begin{cases} 1 & \mathbf{x} = \mathcal{Y} \\ -1 & \mathbf{x} \neq \mathcal{Y} \end{cases}$$

#### **2.3 Image random field modeling**

The observation **Ys** given the thematic map **L** is modeled with the conditional distribution ~ , *NLL <sup>N</sup>* **Y L μ Σ s ss** . It is conditionally dependent on *L***<sup>s</sup>** , the thematic class at site **s** , and it is driven by an autoregressive Gaussian colored noise process ~ ,. <sup>1</sup> *N L* **XL 0** *<sup>N</sup> <sup>N</sup>* **s s <sup>Σ</sup>** Two noise processes **Xs** and **Xs r** are statistically independent if the corresponding thematic classes *L***<sup>s</sup>** and *L***s r** are different for all **r** and **s** . This model tends to avoid the blurring effect created between segment boundaries which, in turn, may yield poor classification performance. The resulting equation can be written as follows:

$$\mathbf{X}\_{\mathbf{s}} = \left(\mathbf{Y}\_{\mathbf{s}} - \boldsymbol{\mu}\left(L\_{\mathbf{s}}\right)\right) - \sum\_{\mathbf{r} \in \mathcal{N}} \boldsymbol{\Theta}\_{\mathbf{r}}\left(L\_{\mathbf{s}}\right) \mathbf{1}\_{\{L\_{\mathbf{s}} = L\_{\mathbf{s} - \mathbf{r}}\}}\left(\mathbf{Y}\_{\mathbf{s} - \mathbf{r}} - \boldsymbol{\mu}\left(L\_{\mathbf{s}}\right)\right). \tag{3}$$

The noise process has the following characterization:

100 Remote Sensing – Advanced Techniques and Platforms


$$E\left[\mathbf{X}\_{\sf s} \, \middle| \, \mathbf{L} ; \Theta \right] = \mathbf{0}\_{\sf N \times 1} \tag{4}$$

$$\operatorname{cov}\left(\mathbf{X}_{\mathbf{s}}, \mathbf{X}_{\mathbf{s}-\mathbf{r}} \middle| \mathbf{L}; \Theta\right) = \begin{cases} \boldsymbol{\Sigma}\left(L_{\mathbf{s}}\right) & \mathbf{r} = \mathbf{0}_{p\times 1} \\ -\boldsymbol{\Theta}_{\mathbf{r}}\left(L_{\mathbf{s}}\right)\boldsymbol{\Sigma}\left(L_{\mathbf{s}}\right)\mathbf{1}_{\{L_{\mathbf{s}} = L_{\mathbf{s}-\mathbf{r}}\}} & \mathbf{r} \in \mathcal{N} \\ \mathbf{0}_{N\times N} & \text{otherwise} \end{cases} \tag{5}$$

$$\text{cov}\left(\mathbf{X}\_{\text{s}}, \mathbf{Y}\_{\text{s}-\text{r}} \, \middle| \, \mathbf{L}; \Theta\right) = \Sigma\left(L\_{\text{s}}\right) \cdot \mathbf{1}\_{\left(\mathbf{r} = \mathbf{0}\_{p \times 1}\right)}.\tag{6}$$

The conditional probability, on the other hand, is given as

$$p\left(\mathbf{Y}_{\mathbf{s}} \middle| \mathbf{Y}_{\partial\mathbf{s}}, \mathbf{L}; \Theta\right) = \frac{1}{\left(2\pi\right)^{N/2}\left|\boldsymbol{\Sigma}\left(L_{\mathbf{s}}\right)\right|^{1/2}}\exp\left(-\frac{1}{2}\mathbf{X}_{\mathbf{s}}^{T}\boldsymbol{\Sigma}^{-1}\left(L_{\mathbf{s}}\right)\mathbf{X}_{\mathbf{s}}\right). \tag{7}$$
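To make the roles of Eqs. (3) and (7) concrete, the sketch below evaluates the conditional density of one observation: the residual $\mathbf{X}_\mathbf{s}$ of Eq. (3) is formed first and then plugged into the Gaussian density of Eq. (7). This is our illustration, not the chapter's code; the two-band dimension, the single neighbour, and all numeric values are assumed, and the neighbour is taken to share the class of site $\mathbf{s}$.

```python
import numpy as np

# Hypothetical sketch tying Eqs. (3) and (7) together for a single site
# (the function, the 2-band dimension, and all numbers are illustrative
# assumptions; neighbours are taken to share the class of site s).
def conditional_density(Y_s, Y_neighbours, mu, Theta, Sigma):
    # Eq. (3): residual of the autoregressive colored-noise prediction
    X_s = (Y_s - mu) - sum(Theta[r] @ (Y_neighbours[r] - mu)
                           for r in range(len(Y_neighbours)))
    N = len(Y_s)
    # Eq. (7): Gaussian density of the residual
    quad = X_s @ np.linalg.inv(Sigma) @ X_s
    norm = (2 * np.pi) ** (N / 2) * np.linalg.det(Sigma) ** 0.5
    return np.exp(-0.5 * quad) / norm

mu = np.array([1.0, 2.0])
Sigma = np.array([[1.0, 0.2],
                  [0.2, 0.5]])
Theta = [0.3 * np.eye(2)]              # one neighbour interaction matrix
p = conditional_density(np.array([1.1, 2.2]),
                        [np.array([0.9, 1.8])], mu, Theta, Sigma)
```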

#### **2.4 Maximum pseudo-likelihood estimation**

The maximum pseudo-likelihood estimation (MPLE) combines sites to form the pseudo-likelihood function from the conditional probabilities (Li, 1995). The pseudo-likelihood functions for the thematic map random field and image random field parameters are given as follows:

$$PL\left(\boldsymbol{\varphi}\right) = \prod_{\mathbf{s}\in\mathcal{S}} p\left(L_{\mathbf{s}} \middle| \mathbf{L}_{\partial\mathbf{s}}; \boldsymbol{\varphi}\right) \tag{8}$$

$$PL\left(\Theta \middle| \mathbf{L}\right) = \prod_{m=1}^{M} \prod_{\mathbf{s}\in\mathcal{S}(m)} p\left(\mathbf{Y}_{\mathbf{s}} \middle| \mathbf{Y}_{\partial\mathbf{s}}, \mathbf{L}; \Theta\right) \tag{9}$$

where $\mathcal{S}(m)$ is the collection of sites with the $m$th thematic class. The MPLE possesses an invariance property; that is, if $\hat{\boldsymbol{\Pi}}$ is the MPLE of the parameter $\boldsymbol{\Pi}$, then for an arbitrary function $\tau$, $\tau\left(\hat{\boldsymbol{\Pi}}\right)$ is the MPLE of the parameter $\tau\left(\boldsymbol{\Pi}\right)$. The proof is similar to that of the invariance property of the MLE (Casella and Berger, 2002), since the form of the pseudo-likelihood function is analogous to that of the likelihood function, depending on the parameter given the data. Moreover, the MPLE converges to the MLE almost surely as the lattice size approaches infinity (Geman and Graffigne, 1987).
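As a concrete illustration of Eq. (8), the following sketch evaluates the pseudo-likelihood of a small thematic map under the pairwise potential $V$ of Section 2.2. The four-neighbour system, the singleton coefficients `a[m]`, the single isotropic pairwise coefficient `b`, and all numbers are our simplifying assumptions, not the chapter's.

```python
import numpy as np

# Sketch of Eq. (8) (our simplified illustration, not the chapter's code):
# pseudo-likelihood of a thematic map under the pairwise MRF of Section 2.2,
# assuming a 4-neighbour system, singleton coefficients a[m], and a single
# isotropic pairwise coefficient b, with V(x, y) = 1 if x == y else -1.
def pseudo_likelihood(L, a, b, M):
    H, W = L.shape
    PL = 1.0
    neighbours = ((-1, 0), (1, 0), (0, -1), (0, 1))
    for i in range(H):
        for j in range(W):
            energy = np.array(a, dtype=float).copy()
            for m in range(M):
                for di, dj in neighbours:
                    ni, nj = i + di, j + dj
                    if 0 <= ni < H and 0 <= nj < W:
                        energy[m] += b * (1.0 if L[ni, nj] == m else -1.0)
            probs = np.exp(energy - energy.max())
            probs /= probs.sum()
            PL *= probs[L[i, j]]   # conditional p(L_s | L_neighbours)
    return PL

L = np.array([[0, 0, 1],
              [0, 1, 1]])
pl = pseudo_likelihood(L, a=np.zeros(2), b=0.5, M=2)
```

With `b = 0` the conditionals become uniform over the $M$ classes, so the pseudo-likelihood reduces to $(1/M)^{|\mathcal{S}|}$, a handy sanity check.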

#### **2.5 Thematic classification**

The thematic map can be recovered by the maximum a posteriori probability (MAP) rule. It can be implemented using a numerical optimization technique such as Simulated Annealing (SA) (Jeng & Woods, 1991). Although global convergence under SA is guaranteed almost surely, the convergence is very slow (Aarts & Korst, 1987; Winkler, 2006). An alternative is to use the ICM algorithm (Besag, 1986), given as

$$\hat{L}_{\mathbf{s}} = \arg\max_{1 \le m \le M} p\left(\mathbf{Y} \middle| L_{\mathbf{s}} = m, \mathbf{L}_{\mathcal{S}/\mathbf{s}}; \Theta\right) p\left(L_{\mathbf{s}} = m \middle| \mathbf{L}_{\mathcal{S}/\mathbf{s}}; \boldsymbol{\varphi}\right). \tag{10}$$

Estimation of the Separable MGMRF Parameters for Thematic Classification 103


This is interpreted as the instantaneous freezing of the annealing schedule of the SA. However, since $p\left(\mathbf{Y} \middle| \mathbf{L};\Theta\right)$ is difficult to evaluate, it is alternatively replaced by its pseudo-likelihood (Hazel, 2000), given as

$$p\left(\mathbf{Y} \middle| \mathbf{L};\Theta\right) \approx \prod_{\mathbf{s}\in\mathcal{S}} p\left(\mathbf{Y}_{\mathbf{s}} \middle| \mathbf{Y}_{\partial\mathbf{s}}, \mathbf{L}; \Theta\right). \tag{11}$$

Hence, the classifier is reduced to

$$\hat{L}_{\mathbf{s}} = \arg\max_{1 \le m \le M} \prod_{\mathbf{s}\in\mathcal{S}} p\left(\mathbf{Y}_{\mathbf{s}} \middle| \mathbf{Y}_{\partial\mathbf{s}}, \mathbf{L}; \Theta\right) \cdot p\left(L_{\mathbf{s}} = m \middle| \mathbf{L}_{\mathcal{S}/\mathbf{s}}; \boldsymbol{\varphi}\right). \tag{12}$$

The ICM algorithm, unlike the SA, is only guaranteed to converge to a local maximum. This problem can be alleviated by initializing the thematic map with the Gaussian Spectral Clustering (GSC) model (Hazel, 2000).
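The site-wise update of Eq. (12), together with a GSC-style initialization, can be sketched as follows. This is an illustration rather than the chapter's implementation: the likelihood is a per-pixel scalar Gaussian (the neighbour terms of Eq. (3) are omitted for brevity), and `mu`, `sigma2`, and `b` are assumed illustrative parameters.

```python
import numpy as np

# Minimal ICM sketch (an illustration, not the chapter's implementation):
# each site is repeatedly reassigned to the class maximising a per-pixel
# Gaussian log-likelihood plus the local prior term of Section 2.2. The
# spatial noise-correlation terms of Eq. (3) are omitted for brevity, and
# mu, sigma2, b are assumed illustrative parameters.
def icm(Y, mu, sigma2, b, n_iter=5):
    M = len(mu)
    # GSC-like initialization: nearest class mean per pixel
    L = np.argmin((Y[..., None] - mu) ** 2 / sigma2, axis=-1)
    H, W = L.shape
    for _ in range(n_iter):
        for i in range(H):
            for j in range(W):
                scores = np.empty(M)
                for m in range(M):
                    loglik = (-0.5 * (Y[i, j] - mu[m]) ** 2 / sigma2[m]
                              - 0.5 * np.log(sigma2[m]))
                    prior = 0.0
                    for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                        ni, nj = i + di, j + dj
                        if 0 <= ni < H and 0 <= nj < W:
                            prior += b * (1.0 if L[ni, nj] == m else -1.0)
                    scores[m] = loglik + prior
                L[i, j] = np.argmax(scores)
    return L

# Two noisy regions around the assumed class means 0 and 10
Y = np.array([[0.1, -0.2, 9.8],
              [0.3,  0.2, 10.1]])
labels = icm(Y, mu=np.array([0.0, 10.0]), sigma2=np.array([1.0, 1.0]), b=0.5)
```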

#### **2.6 Numerical implementation**

The MPLE-based estimators are not available in closed form and must be evaluated numerically. The pseudocode for estimating the parameters is presented below.

    Initialize L, φ, and Θ
    Estimate φ
    Estimate Θ:
        Estimate μ_m given θ_r^(m) and Σ_m
        Estimate θ_r^(m) given Σ_m and μ_m
        Estimate Σ_m given μ_m and θ_r^(m)
    Estimate L_s by the ICM algorithm

The image random field parameters are estimated using a method with some resemblance to the Gauss–Seidel iteration method (Kreyszig, 1993). The convergence criterion for estimating these parameters with this iteration method has yet to be established; as a precautionary measure, a single iteration was performed. This method was also applied in estimating the image random field estimators in Rellier's GMRF (Rellier et al., 2004).

### **3. Spectro-temporal MGMRF modelling**

The spectro-temporal observation image random field will be characterized with hybrid separable MGMRF parameters.

#### **3.1 Image random field modeling**

We let $M_1$ be the number of lines, $M_2$ the number of samples, $N_1$ the number of spectral bands, and $N_2$ the number of temporal slots. The image random field is characterized as follows:

*Lattice* $\quad \mathcal{S} = \left\{\mathbf{s} = \left(s_1, s_2\right) : 1 \le s_1 \le M_1,\ 1 \le s_2 \le M_2\right\}$

*Thematic Class* $\quad L_{\mathbf{s}} = L_{\left(s_1,s_2\right)} \in \left\{1, \ldots, M\right\}$. The thematic class $L_{\mathbf{s}}$ at a given site $\mathbf{s}$ is modeled to be fixed over time.


$$\textit{Observation} \qquad \mathbf{Y}_{\mathbf{s}} = \mathbf{Y}_{\left(s_1,s_2\right)} = \begin{pmatrix} W_{\left(s_1,s_2,1,1\right)} & \cdots & W_{\left(s_1,s_2,k,l\right)} & \cdots & W_{\left(s_1,s_2,N_1,N_2\right)} \end{pmatrix}^{T}$$

The observation $\mathbf{Y}_{\mathbf{s}}$ is a multispectral and multi-temporal vector of reflectance at the given spatial location $\left(s_1, s_2\right)$, measured at the $k$th spectral band with wavelength $\lambda_k$, for $1 \le k \le N_1$, and at the $l$th temporal slot with time $T_l \in T$, for $1 \le l \le N_2$. More specifically, the $\left(k + (l-1)N_1\right)$th element of $\mathbf{Y}_{\mathbf{s}}$, denoted $Y_{\mathbf{s},(k,l)}$, is given as $Y_{\mathbf{s},(k,l)} = W_{\left(s_1,s_2,k,l\right)}$.

Let us consider the matrix $\mathbf{Y}_{\mathbf{s}}^{\#}$ defined by rearranging the elements of the spectro-temporal observation $\mathbf{Y}_{\mathbf{s}}$ with the reshape operator, $\mathbf{Y}_{\mathbf{s}}^{\#} = reshape\left(\mathbf{Y}_{\mathbf{s}}, N_1, N_2\right)$. The reshape function, given as $\mathbf{B} = reshape\left(\mathbf{A}, N_1, N_2\right)$, transforms the vector $\mathbf{A} = \left(a_k\right) \in \mathbb{R}^{N_1 N_2}$ into the $N_1 \times N_2$ matrix $\mathbf{B} = \left(b_{ij}\right)$ by the mapping $b_{ij} = a_{i + (j-1)N_1}$ for all $1 \le i \le N_1$ and $1 \le j \le N_2$, i.e.

$$\mathbf{Y}_{\mathbf{s}}^{\#} = \begin{bmatrix} Y_{\mathbf{s},(1,1)} & Y_{\mathbf{s},(1,2)} & \cdots & Y_{\mathbf{s},(1,N_2)} \\ Y_{\mathbf{s},(2,1)} & Y_{\mathbf{s},(2,2)} & \cdots & Y_{\mathbf{s},(2,N_2)} \\ \vdots & \vdots & \ddots & \vdots \\ Y_{\mathbf{s},(N_1,1)} & Y_{\mathbf{s},(N_1,2)} & \cdots & Y_{\mathbf{s},(N_1,N_2)} \end{bmatrix} \tag{13}$$

The matrix $\mathbf{Y}_{\mathbf{s}}^{\#}$ is characterized by allocating the reflectance across the bands for a given time by column, and the reflectance across time for a given band by row.
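For readers implementing the reshape operator, NumPy's column-major reshape (`order='F'`) realizes exactly the mapping $b_{ij} = a_{i+(j-1)N_1}$; the sizes $N_1 = 3$ and $N_2 = 2$ below are arbitrary illustrative choices, not values from the chapter.

```python
import numpy as np

# NumPy realization of the reshape operator of Eq. (13): order='F'
# (column-major) gives exactly the mapping b_ij = a_{i+(j-1)N1}.
# N1 = 3 bands and N2 = 2 temporal slots are arbitrary illustrative sizes.
N1, N2 = 3, 2
Y_s = np.arange(1, N1 * N2 + 1)           # stacked spectro-temporal vector
Y_mat = Y_s.reshape((N1, N2), order='F')  # bands along rows, time along columns
print(Y_mat)   # -> [[1 4]
               #     [2 5]
               #     [3 6]]
```

Each column of `Y_mat` holds all bands at one temporal slot; each row holds one band across time, matching the description of $\mathbf{Y}_{\mathbf{s}}^{\#}$ above.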

#### **3.2 Separable structure of the covariance matrix**

There is a growing interest in modeling the covariance structure with more than one attribute. For example, in spatio-temporal modeling, the covariance structure of "spatial" and "temporal" attributes is jointly considered (Kyriakidis and Journel, 1999; Huizenga et al., 2002). On the other hand, in the area of longitudinal studies, the covariance structure between "factor" and "temporal" attributes is jointly considered (Naik and Rao, 2001). Both studies considered covariance matrices with a separable structure between these attributes.

In the realm of remote sensing, few studies have been conducted on covariance structures involving spectro-temporal attributes. Campbell and Kiiveri demonstrated that canonical variate calculations reduce to simultaneous between-groups and within-groups analyses of a linear combination of the spectral bands over time, and of a linear combination of the temporal slots over the spectral bands (Campbell and Kiiveri, 1988).

In light of the recent literature, we propose GMRF models applied to remote sensing image processing in which the covariance structure of the "spectral" and "temporal" attributes is characterized jointly. The separable covariance structure associated with the matrix Gaussian distribution is considered.

#### **3.2.1 Non-separable covariance structure**

The matrix observation driven by a colored noise, and its vectorized distribution, are assumed to be realizations from the process whose conditional form is given by

$\mathbf{X}_{\mathbf{s}}\,|\,\mathbf{L} \sim N_{N}\left(\mathbf{0}_{N\times 1}, \boldsymbol{\Sigma}\left(L_{\mathbf{s}}\right)\right)$. The covariance matrix $\boldsymbol{\Sigma}\left(L_{\mathbf{s}}\right)$ does not have any special structure, except that it has to be a positive definite symmetric matrix. This covariance matrix structure is referred to as an unpatterned covariance matrix (Dutilleul, 1999). The statistical characterization is similar to the MGMRF discussed in Section 2.3.

#### **3.2.2 Matrix Gaussian distribution**

Let $\mathbf{X}^{\#}$ be a random matrix distributed as $\mathbf{X}^{\#} \sim N_{m,n}\left(\mathbf{M}^{\#}, \boldsymbol{\Xi}^{(1)}, \boldsymbol{\Xi}^{(2)}\right)$, where $\mathbf{M}^{\#} \in \mathbb{R}^{m\times n}$ is the expectation matrix, $\boldsymbol{\Xi}^{(1)} \in \mathbb{R}^{m\times m}$ is the covariance matrix across the rows, and $\boldsymbol{\Xi}^{(2)} \in \mathbb{R}^{n\times n}$ is the covariance matrix across the columns. Hence, the pdf of $\mathbf{X}^{\#}$ is given as

$$p\left(\mathbf{X}^{\#}\right) = \frac{1}{\left(2\pi\right)^{mn/2}\left|\boldsymbol{\Xi}^{(1)}\right|^{n/2}\left|\boldsymbol{\Xi}^{(2)}\right|^{m/2}}\exp\left[-\frac{1}{2}tr\left(\left(\boldsymbol{\Xi}^{(1)}\right)^{-1}\left(\mathbf{X}^{\#} - \mathbf{M}^{\#}\right)\left(\boldsymbol{\Xi}^{(2)}\right)^{-1}\left(\mathbf{X}^{\#} - \mathbf{M}^{\#}\right)^{T}\right)\right] \tag{14}$$

(Arnold, 1981). Also, if we stack the matrix $\mathbf{X}^{\#}$ into the random vector $\mathbf{X} = vec\left(\mathbf{X}^{\#}\right)$, then $\mathbf{X} \sim N_{mn}\left(\mathbf{M}, \boldsymbol{\Xi}\right)$, where $\mathbf{M} = vec\left(\mathbf{M}^{\#}\right)$ is the expectation vector and $\boldsymbol{\Xi} = \boldsymbol{\Xi}^{(2)} \otimes \boldsymbol{\Xi}^{(1)} \in \mathbb{R}^{mn\times mn}$ is the covariance matrix (Arnold, 1981), and its pdf is given as

$$p\left(\mathbf{X}\right) = \frac{1}{\left(2\pi\right)^{mn/2}\left|\boldsymbol{\Xi}\right|^{1/2}} \exp\left[-\frac{1}{2}\left(\mathbf{X} - \mathbf{M}\right)^{T}\boldsymbol{\Xi}^{-1}\left(\mathbf{X} - \mathbf{M}\right)\right].\tag{15}$$

We model the associated noise process $\mathbf{X}_{\mathbf{s}}^{\#}$ as a matrix Gaussian distribution, i.e. $\mathbf{X}_{\mathbf{s}}^{\#}\,|\,\mathbf{L} \sim N_{N_1,N_2}\left(\mathbf{0}_{N_1\times N_2}, \boldsymbol{\Sigma}^{(1)}\left(L_{\mathbf{s}}\right), \boldsymbol{\Sigma}^{(2)}\left(L_{\mathbf{s}}\right)\right)$, where $\boldsymbol{\Sigma}^{(1)}\left(L_{\mathbf{s}}\right) \in \mathbb{R}^{N_1\times N_1}$ is the covariance matrix across the bands and $\boldsymbol{\Sigma}^{(2)}\left(L_{\mathbf{s}}\right) \in \mathbb{R}^{N_2\times N_2}$ is the covariance matrix across time. Stacking the matrix $\mathbf{X}_{\mathbf{s}}^{\#}$ into the random vector $\mathbf{X}_{\mathbf{s}} = vec\left(\mathbf{X}_{\mathbf{s}}^{\#}\right) \in \mathbb{R}^{N_1 N_2}$ corresponds to the vectorized colored noise with conditional distribution $\mathbf{X}_{\mathbf{s}}\,|\,\mathbf{L} \sim N_{N_1 N_2}\left(\mathbf{0}_{N_1 N_2\times 1}, \boldsymbol{\Sigma}^{(2)}\left(L_{\mathbf{s}}\right) \otimes \boldsymbol{\Sigma}^{(1)}\left(L_{\mathbf{s}}\right)\right)$.

#### **3.2.3 Separable covariance structure**

The spectro-temporal, separable covariance matrix model (Lu and Zimmerman, 2005; Fuentes, 2006) has the form

$$\boldsymbol{\Sigma}(m) = \boldsymbol{\Sigma}^{(2)}(m) \otimes \boldsymbol{\Sigma}^{(1)}(m) \tag{16}$$

for $1 \le m \le M$, where $\boldsymbol{\Sigma}^{(1)}(m) = \left(\sigma^{(1)}_{ij}(m)\right) \in \mathbb{R}^{N_1\times N_1}$ is the covariance matrix across bands and $\boldsymbol{\Sigma}^{(2)}(m) = \left(\sigma^{(2)}_{kl}(m)\right) \in \mathbb{R}^{N_2\times N_2}$ is the covariance matrix across time. Now, since

$$\mathbf{X}_{\mathbf{s}}|\mathbf{L} \sim N_{N_1 N_2}\left(\mathbf{0}_{N_1 N_2\times 1}, \boldsymbol{\Sigma}^{(2)}\left(L_{\mathbf{s}}\right) \otimes \boldsymbol{\Sigma}^{(1)}\left(L_{\mathbf{s}}\right)\right) \tag{17}$$

$$\mathbf{Y}_{\mathbf{s}}|\mathbf{L} \sim N_{N_1 N_2}\left(\boldsymbol{\mu}\left(L_{\mathbf{s}}\right), \boldsymbol{\Sigma}^{(2)}\left(L_{\mathbf{s}}\right) \otimes \boldsymbol{\Sigma}^{(1)}\left(L_{\mathbf{s}}\right)\right) \tag{18}$$

then, the covariance is given as (Arnold, 1981):

$$\operatorname{cov}\left(X_{\mathbf{s},(k,l)}, X_{\mathbf{s},(k,l)} \middle| \mathbf{L};\Theta\right) = \sigma^{(1)}_{kk}\left(L_{\mathbf{s}}\right)\sigma^{(2)}_{ll}\left(L_{\mathbf{s}}\right) \tag{19}$$

$$\operatorname{cov}\left(Y_{\mathbf{s},(k,l)}, Y_{\mathbf{s},(k,l)} \middle| \mathbf{L};\Theta\right) = \sigma^{(1)}_{kk}\left(L_{\mathbf{s}}\right)\sigma^{(2)}_{ll}\left(L_{\mathbf{s}}\right). \tag{20}$$

This corresponds to the product of the variance associated with the reflectance at the $k$th spectral band, $\sigma^{(1)}_{kk}\left(L_{\mathbf{s}}\right)$, and the variance associated with the reflectance at the $l$th temporal slot, $\sigma^{(2)}_{ll}\left(L_{\mathbf{s}}\right)$. Likewise, the cross-covariance is given as (Arnold, 1981):

$$\operatorname{cov}\left(X_{\mathbf{s},(i,j)}, X_{\mathbf{s},(k,l)} \middle| \mathbf{L};\Theta\right) = \sigma^{(1)}_{ik}\left(L_{\mathbf{s}}\right)\sigma^{(2)}_{jl}\left(L_{\mathbf{s}}\right) \tag{21}$$

$$\operatorname{cov}\left(Y_{\mathbf{s},(i,j)}, Y_{\mathbf{s},(k,l)} \middle| \mathbf{L};\Theta\right) = \sigma^{(1)}_{ik}\left(L_{\mathbf{s}}\right)\sigma^{(2)}_{jl}\left(L_{\mathbf{s}}\right). \tag{22}$$

This corresponds to the product of the covariance associated with the reflectance at the $i$th and the $k$th spectral bands, $\sigma^{(1)}_{ik}\left(L_{\mathbf{s}}\right)$, and the covariance associated with the reflectance at the $j$th and the $l$th temporal slots, $\sigma^{(2)}_{jl}\left(L_{\mathbf{s}}\right)$.

The number of parameters in the unpatterned covariance matrix is $N_1 N_2\left(N_1 N_2 + 1\right)/2$. On the other hand, the number of parameters for a separable covariance matrix is $N_1\left(N_1 + 1\right)/2 + N_2\left(N_2 + 1\right)/2 - 1$, which is fewer than its non-separable counterpart.

#### **3.2.4 Separable interaction matrix structure**

We can also model the interaction matrix coefficients with a separable structure, for all $\mathbf{r} \in \mathcal{N}$ and $1 \le m \le M$, of the form

$$\boldsymbol{\theta}_{\mathbf{r}}(m) = \boldsymbol{\theta}^{(2)}_{\mathbf{r}}(m) \otimes \boldsymbol{\theta}^{(1)}_{\mathbf{r}}(m) \tag{23}$$

where $\boldsymbol{\theta}^{(1)}_{\mathbf{r}}(m) \in \mathbb{R}^{N_1\times N_1}$ is the interaction matrix across the bands and $\boldsymbol{\theta}^{(2)}_{\mathbf{r}}(m) \in \mathbb{R}^{N_2\times N_2}$ is the interaction matrix across time. In the next section, the interaction matrix coefficient $\boldsymbol{\theta}_{\mathbf{r}}(m)$ can be made separable for $\mathbf{r} \in \mathcal{N}$ and $1 \le m \le M$, provided that $\boldsymbol{\Sigma}(m)$ is separable. Furthermore, if $\boldsymbol{\Sigma}(m)$ is separable, then the following is the resulting statistical characterization of $\mathbf{X}_{\mathbf{s}}$:

$$E\left[\mathbf{X}_{\mathbf{s}} \middle| \mathbf{L};\Theta\right] = \mathbf{0}_{N_1 N_2\times 1} \tag{24}$$

$$\operatorname{cov}\left(\mathbf{X}_{\mathbf{s}}, \mathbf{X}_{\mathbf{s}-\mathbf{r}} \middle| \mathbf{L};\Theta\right) = \begin{cases} \boldsymbol{\Sigma}^{(2)}\left(L_{\mathbf{s}}\right) \otimes \boldsymbol{\Sigma}^{(1)}\left(L_{\mathbf{s}}\right) & \mathbf{r} = \mathbf{0}_{p\times 1} \\ -\left(\boldsymbol{\theta}^{(2)}_{\mathbf{r}}\left(L_{\mathbf{s}}\right)\boldsymbol{\Sigma}^{(2)}\left(L_{\mathbf{s}}\right)\right) \otimes \left(\boldsymbol{\theta}^{(1)}_{\mathbf{r}}\left(L_{\mathbf{s}}\right)\boldsymbol{\Sigma}^{(1)}\left(L_{\mathbf{s}}\right)\right)\mathbf{1}_{\{L_{\mathbf{s}} = L_{\mathbf{s}-\mathbf{r}}\}} & \mathbf{r} \in \mathcal{N} \\ \mathbf{0}_{N_1 N_2\times N_1 N_2} & \text{otherwise} \end{cases} \tag{25}$$
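The separable structure of Eqs. (16)–(22) can be checked numerically with assumed toy covariance factors (all sizes and entries below are ours, not the chapter's): `np.kron(Sigma2, Sigma1)` reproduces the element-wise products $\sigma^{(1)}_{ik}\sigma^{(2)}_{jl}$ under the column-major stacking of Section 3.1, and the parameter counts compare as in Section 3.2.3.

```python
import numpy as np

# Numerical check of the separable covariance of Eq. (16) with assumed toy
# factors (all sizes and entries are illustrative, not from the chapter).
N1, N2 = 3, 2
Sigma1 = np.array([[2.0, 0.5, 0.0],
                   [0.5, 1.0, 0.2],
                   [0.0, 0.2, 1.5]])   # covariance across bands
Sigma2 = np.array([[1.0, 0.3],
                   [0.3, 2.0]])        # covariance across time
Sigma = np.kron(Sigma2, Sigma1)        # Eq. (16): Sigma = Sigma2 (x) Sigma1

# Eqs. (21)-(22): cov(Y_{(i,j)}, Y_{(k,l)}) = sigma1_{ik} * sigma2_{jl},
# with element (i, j) at 0-based vector index i + j*N1 (column-major stacking)
i, j, k, l = 0, 1, 2, 0
assert np.isclose(Sigma[i + j * N1, k + l * N1], Sigma1[i, k] * Sigma2[j, l])

# Parameter counts of Section 3.2.3: unpatterned vs separable
full = N1 * N2 * (N1 * N2 + 1) // 2             # 21 free parameters
sep = N1 * (N1 + 1) // 2 + N2 * (N2 + 1) // 2   # 9 (one redundant to scale)
```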
