**Part 2**

## **Approaches of Earth Observation Monitoring**


### **Vision Goes Symbolic Without Loss of Information Within the Preattentive Vision Phase: The Need to Shift the Learning Paradigm from Machine-Learning (from Examples) to Machine-Teaching (by Rules) at the First Stage of a Two-Stage Hybrid Remote Sensing Image Understanding System, Part I: Introduction**

Andrea Baraldi *Department of Geography, University of Maryland, College Park, Maryland, USA* 

#### **1. Introduction**

One traditional, although visionary, goal of the remote sensing (RS) community is the development of operational satellite-based measurement systems suitable for automating the quantitative analysis of large-scale spaceborne multi-source multi-resolution image databases (Gutman et al., 2004). In past years this goal was almost exclusively dealt with by research programs focused on land cover (LC) and land cover change (LCC) detection at global scale (Gutman et al., 2004) (pp. 451, 452). In recent years the objective of developing operational satellite-based measurement systems has become increasingly urgent due to multiple drivers. While cost-free access to large-scale low spatial resolution (SR) (above 40 m) and medium SR (from 40 to 20 m) spaceborne image databases has become a reality (GEO, 2005; GEO, 2008a; GEO, 2008b; Gutman et al., 2004; Sart et al., 2001; Sjahputera et al., 2008), the demand for high SR (between 20 and 5 m) and very high SR (VHR, below 5 m) commercial satellite imagery has, in parallel, continued to increase in terms of data quantity and quality, which has boosted the rapid growth of the commercial VHR satellite industry (Sjahputera et al., 2008). In this scientific and commercial context an increasing number of on-going international research projects aim at the development of operational services requiring harmonization and interoperability of Earth observation (EO) data and derived information products generated from a variety of spaceborne imaging sensors at all scales: global, regional and local.
Among these on-going programs it is worth mentioning the Global EO System of Systems (GEOSS) conceived by the Group on Earth Observations (GEO) (GEO, 2005; GEO, 2008b), the Global Monitoring for the Environment and Security (GMES), which is an initiative led by the European Union (EU) in partnership with the European Space Agency (ESA) (ESA, 2008; GMES, 2011), the National Aeronautics and Space Administration (NASA) Land Cover and Land Use Change (LCLUC) program (Gutman et al., 2004) (p. 3) and the U.S. Geological Survey (USGS)-NASA Web-Enabled Landsat Data (WELD) project (USGS & NASA, 2011).

Unfortunately, to date, the increasing rate of collection of EO imagery of enhanced spatial, spectral and temporal quality outpaces the automatic or semi-automatic capability of generating information from huge amounts of multi-source multi-resolution RS data sets (Gutman et al., 2004). This may explain why the percentage of data downloaded by stakeholders from the ESA EO image archives is estimated at about 10% or less (D'Elia, 2009).

If productivity in terms of quality, quantity and value of high-level output products generated from input EO imagery is low, this is tantamount to saying that existing scientific and commercial RS image understanding (classification) systems (RS-IUSs), such as (Definiens Imaging GmbH, 2004; Esch et al., 2008; Richter, 2006), score poorly in operational contexts (Tapsall et al., 2010). For example, RS-IUSs capable of proving their competitiveness at local/regional scale, such as the inductive supervised (labeled) data learning Support Vector Machines (SVMs) (Bruzzone & Carlin, 2006; Bruzzone & Persello, 2009), typically lack robustness and scalability for seamless application to LC and LCC problems at national, continental and global scale. As an example of these difficulties the interested reader may refer to (Chengquan Huang et al., 2008), where an SVM training algorithm and model selection strategies are applied to every image of a multi-temporal image mosaic at global scale. If the conjecture that existing RS-IUSs are affected by low productivity holds in general, it applies in particular to two-stage segment-based RS-IUSs, which have recently gained widespread popularity and are currently considered the state of the art in both scientific and commercial RS image mapping applications (Castilla et al., 2009; Mather, 1994). In the literature the conceptual foundation of two-stage segment-based RS-IUSs is well known as geographic (2-D) object-based image analysis (GEOBIA), including a so-called iterative geographic OO image analysis (GEOOIA) approach (Baatz et al., 2008; Hay & Castilla, 2006), also called object-oriented (image) analysis (OOA) (Castilla et al., 2008).

To summarize, in operational contexts (other than toy problems at small spatial scale and coarse semantic granularity) an RS-IUS can be considered a low performer when at least one among several operational quality indicators (QIs) scores low. In (Baraldi et al., 2010a), a set of QIs eligible for use with an operational RS-IUS comprises the following: degree of automation (equivalent to ease of use; it is monotonically decreasing with the number of system-free parameters to be user-defined), classification and spatial accuracies (Baraldi et al., 2005), efficiency (e.g., computational time, memory occupation), robustness to changes in input parameters, robustness to changes in the input data set, scalability, timeliness (defined as the time span between data acquisition and high-level product delivery to the end user; it increases monotonically with manpower and computing time) and economy. In RS common practice, one or many of the aforementioned QIs of existing RS-IUSs tend to score low at local to global scale. This observation appears in line with a well-known opinion by Zamperoni according to which computer vision (CV) remains, to date, far more problematic than might be reasonably expected (Zamperoni, 1996). In addition to CV, other scientific disciplines such as Artificial Intelligence (AI)/Machine Intelligence (MAI) and Cybernetics/Machine Learning (MAL), whose origins date back to the late 1950s, remain unable to deliver operational solutions to their ambitious cognitive objectives (Diamant, 2005; Diamant, 2008; Diamant, 2010a; Diamant, 2010b).¹

To outperform existing scientific and commercial image understanding approaches, a new trend of research and development is found in both CV (Cootes and Taylor, 2004) and RS literature (Mather, 1994; Matsuyama & Shang-Shouq Hwang, 1990; Pekkarinen et al., 2009). This new trend aims at developing novel hybrid models for retrieving sub-symbolic (sensory, non-semantic, objective) continuous variables (e.g., leaf area index, LAI) and symbolic (categorical, semantic, subjective) discrete variables (e.g., land cover types) from optical multi-spectral (MS) imagery. By definition, hybrid models combine both statistical (inductive, bottom-up, fine-to-coarse, driven-without-knowledge, learning-from-examples) and physical (deductive, top-down, coarse-to-fine, prior knowledge-based, learning-by-rules) models to take advantage of the unique features of each and overcome their shortcomings (Matsuyama & Shang-Shouq Hwang, 1990; Shunlin Liang, 2004).

The original contribution of this work is to revise, integrate and enrich previous analyses found in related papers about recent developments in the design and implementation of an operational automatic multi-sensor multi-resolution near real-time two-stage hybrid stratified hierarchical RS-IUS (Baraldi et al., 2006a; Baraldi et al., 2010a; Baraldi et al., 2010b; Baraldi et al., 2010c; Baraldi, 2011a; Baraldi, 2011b). These novel developments encompass the four levels of analysis of an information processing system (Baraldi, 2011a; Marr, 1982), namely: (i) computational theory (system architecture), (ii) knowledge/information representation, (iii) algorithm design and (iv) implementation.

Starting from these recent achievements the present work provides an in-depth analysis of Emanuel Diamant's works including original speculations on the conceptual framework of MAI together with image segmentation and edge detection algorithms provided as proofs of his concepts (Diamant, 2005; Diamant, 2008; Diamant, 2010a; Diamant, 2010b). To overcome the conceptual and algorithmic drawbacks highlighted in Diamant's works, this manuscript proposes revised/new definitions of the following concepts: objective continuous sub-symbolic sensory data, continuous physical information, subjective discrete semi-symbolic data structure, discrete semantic-square (semantic²) information and prior knowledge base. Continuous physical information is defined as a hierarchical description (multi-scale encoding/decoding or intra-scale transcoding) of an objective continuous sensory data set based on a given mathematical vocabulary/language, e.g., a fast Fourier transform (FFT) of a time signal. Discrete semantic² information is naturally (automatically, instantaneously) generated from the simultaneous combination of three components: (I) an objective continuous sensory data set, (II) an external subjective supervisor (observer) and (III) his/her own subjective prior ontology (model of the (3-D) world existing before looking at the objective sensory data at hand) whose hierarchical form is equivalent to that of a story in a natural language, comprising a title, an abstract, sections, paragraphs, sentences and words. In practical contexts these definitions imply the following.

a. It is impossible to *extract* semantic² information from objective continuous sensory data because the latter, *per se*, are provided with no semantics at all.

b. It is possible to *correlate* discrete semantic² information to objective continuous sensory data. Unfortunately, correlation between continuous sensory data and a finite and discrete set of categorical variables, corresponding to independent random variables generating separable data structures (data aggregations, data clusters, data objects), is low in real-world RS image mapping problems at large data scale or fine semantic granularity, other than toy problems at small data scale and coarse semantic granularity. This low correlation effect is due to the combination of two factors.

	- i. According to the *central limit theorem* the distribution of the sample average of *n* independent and identically distributed (iid) random variables (corresponding to, say, categorical variables) approaches the normal distribution, featuring no "distinguishable" data sub-structure, as the sample size *n* increases. In other words, the separability of "distinguishable" data structures in a given measurement space of a given objective sensory data set is monotonically non-increasing (i.e., it decreases or remains equal) with the finite number of discrete semantic concepts (e.g., land cover classes) involved with the cognitive (classification) problem at hand.
	- ii. In a given measurement space, within-class variability (vice versa, inter-class separability) is monotonically non-decreasing (i.e., it increases or remains equal) (vice versa, non-increasing) with the magnitude of the sample set per categorical variable when this variable-specific sample set size is "large" according to large-sample statistics (although large sample is a synonym for 'asymptotic' rather than a reference to an actual sample magnitude, a sample set cardinality of 30 to 50 samples per random variable is typically considered sufficiently large that, according to a special case of the central limit theorem, the distribution of many sample statistics becomes approximately normal). For example, in (Chengquan Huang et al., 2008), where time-consuming SVM training and classification model selection strategies are applied to every image of a world-wide RS image mosaic to separate forest from non-forest pixels, a so-called training data automation (TDA) procedure identifies a forest peak in a one-band first-order statistic (histogram) of a local image window. The size of this local image window must be fine-tuned based on heuristics because the inter-class spectral separability between classes forest and non-forest (vice versa, within-class variability) decreases (vice versa, increases) monotonically with the local window size above a certain (empirical) threshold (minimum window size, below which the collected sample is not statistically significant).

Some practical conclusions of potential interest to the RS, CV, AI and MAL communities stem from these speculations. Firstly, in operational contexts (e.g., RS image classification problems at national, continental and global scale), other than toy problems (e.g., RS image mapping at coarse spatial resolution and local/regional scale), inductive classifiers capable of learning from a finite labeled data set should be considered structurally inadequate to correlate (rather than extract, see this text above) discrete semantic² information with objective sensory data provided, *per se*, with no semantics at all.

Secondly, to increase the operational QIs of existing two-stage hybrid RS-IUSs, any first-stage inductive MAL-from-examples approach should be replaced by a deductive Machine Teaching (MAT)-by-rules sub-system capable of generating a preliminary classification first stage in the Marr sense (Baraldi et al., 2006a; Baraldi et al., 2010a; Baraldi et al., 2010b; Baraldi et al., 2010c; Baraldi, 2011a; Baraldi, 2011b; Marr, 1982). As a proof of this concept the operational automatic prior knowledge-based multi-sensor multi-resolution near real-time Satellite Image Automatic Mapper™ (SIAM™) is selected from existing literature (Baraldi et al., 2006a; Baraldi et al., 2010a; Baraldi et al., 2010b; Baraldi et al., 2010c; Baraldi, 2011a; Baraldi, 2011b).

¹ In Italian, the acronym AI recalls the English expression 'ouch'. The acronym MAI means 'never'. The acronym MAL means 'pain'. The acronym MAT means 'fool'. These choices are arbitrary, but not by chance. The ancient Latins used to say: Nomen est omen... (meaning: 'true to its name').

Fig. 1. The taxonomy of statistical pattern recognition systems proposed in (Baraldi et al., 2006b). Clustering algorithms and classification systems map an unlabeled input data sample into a discrete and finite set of sub-symbolic and symbolic labels, respectively. These discrete output maps are called (sub-symbolic) cluster maps (consisting of, say, cluster 1, cluster 2, etc.) and (symbolic) classification maps (consisting of, say, symbolic labels such as land cover classes broad-leaf forest, needle-leaf forest, etc.), respectively.

Thirdly, in RS-IUSs, MAL-from-data algorithms, either labeled (supervised) or unlabeled (unsupervised), either context-insensitive (e.g., pixel-based) or context-sensitive (e.g., 2-D object-based), should be adapted to work on a driven-by-knowledge stratified (semantic masked/layered) basis and moved to the second stage of a novel two-stage stratified hierarchical hybrid RS-IUS architecture recently proposed in RS literature (Baraldi et al., 2006a; Baraldi et al., 2010a; Baraldi et al., 2010b; Baraldi et al., 2010c; Baraldi, 2011a; Baraldi, 2011b).
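The two-stage stratified idea described above can be sketched in a few lines of Python. This is only an illustrative toy under stated assumptions, not the SIAM™ implementation: the rule thresholds, the band names `red` and `nir`, and the helper names `first_stage_rules`, `two_means` and `two_stage_map` are all hypothetical. The first stage assigns each pixel to a coarse stratum using fixed prior-knowledge rules; the second stage runs a simple inductive (data-driven) clustering separately inside each stratum.

```python
import numpy as np


def first_stage_rules(red, nir):
    """Deductive first stage: fixed (non-adaptive) rules map pixels to strata.

    The thresholds are illustrative assumptions, not published values.
    """
    ndvi = (nir - red) / (nir + red + 1e-9)
    strata = np.full(red.shape, "other", dtype="U10")
    strata[ndvi > 0.4] = "vegetation"
    strata[ndvi < 0.0] = "water"
    return strata


def two_means(values, n_iter=20):
    """Inductive second stage: plain 1-D two-means clustering of one stratum."""
    centers = np.array([values.min(), values.max()], dtype=float)
    labels = np.zeros(values.shape, dtype=int)
    for _ in range(n_iter):
        # Assign every sample to its nearest center, then update the centers.
        labels = np.abs(values[:, None] - centers[None, :]).argmin(axis=1)
        for k in range(2):
            if np.any(labels == k):
                centers[k] = values[labels == k].mean()
    return labels


def two_stage_map(red, nir):
    """Run the data-driven stage separately within each rule-based stratum."""
    strata = first_stage_rules(red, nir)
    result = {}
    for name in np.unique(strata):
        mask = strata == name
        result[str(name)] = (mask, two_means(nir[mask]))
    return result


if __name__ == "__main__":
    red = np.array([0.05, 0.06, 0.04, 0.05, 0.30, 0.28, 0.08, 0.09])
    nir = np.array([0.40, 0.42, 0.60, 0.62, 0.32, 0.30, 0.02, 0.03])
    for stratum, (mask, sub_labels) in two_stage_map(red, nir).items():
        print(stratum, int(mask.sum()), sub_labels.tolist())
```

In this sketch the clustering never mixes pixels from different strata, which is the sense in which the second-stage statistical model works on a driven-by-knowledge, semantically masked basis.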

The rest of this work is organized as follows. For publication reasons it consists of Part I and Part II. In Part I Section 2 related works, concepts and definitions are reviewed to provide this multi-disciplinary study with a significant survey value and make it self-contained. Part I Section 2 includes the following sub-sections: definitions and synonyms involved with inductive and deductive inference mechanisms (see Part I Section 2.1), a critical review of the history of AI/MAI and Cybernetics/MAL including a summary of Diamant's definitions of objective data, physical information, semantic information, knowledge and intelligence (refer to Part I Section 2.2), a definition of the cognitive process of vision (see Part I Section 2.3), a critical analysis of the inherent ill-posedness of inductive data learning algorithms (see Part I Section 2.4), a review of Diamant's image segmentation and contour detection algorithms presented as proofs of his concepts summarized in Part I Section 2.2 (refer to Part I Section 2.5), a discussion of the four levels of understanding of an RS-IUS (see Part I Section 2.6), a presentation (see Part I Section 2.7) of the Quality Assurance Framework for EO (QA4EO) guidelines (GEO/CEOSS, 2008) delivered by the Working Group on Calibration and Validation (WGCV) of the Committee on Earth Observation Satellites (CEOS), the space arm of the Group on Earth Observations (GEO) (GEO, 2005; GEO, 2008b), and a list of operational QIs of an RS-IUS (refer to Part I Section 2.8).

Part II includes a review section (see Part II Section 2) and an original contribution (from Part II Section 3 to Part II Section 7). In Part II Section 2 different families of existing RS-IUSs, namely, multi-agent hybrid RS-IUSs, two-stage segment-based RS-IUSs and two-stage stratified hierarchical hybrid RS-IUSs, are compared at the architectural level of analysis (refer to Part I Section 2.6). Part II Section 3 discusses theoretical inconsistencies and algorithmic drawbacks found in Diamant's works (discussed in Part I Section 2.2 and Part I Section 2.5, respectively). Revised/novel definitions of objective continuous sensory data, continuous physical information, discrete semantic² information and prior knowledge are provided in Part II Section 4. In Part II Section 5 practical consequences of the novel definitions provided in Part II Section 4 are considered for CV, AI and MAL applications. Part II Section 6 presents the operational automatic multi-sensor multi-resolution near real-time SIAM™ as a proof of the original concepts proposed in this work. Conclusions are reported in Part II Section 7.

#### **2. Related works, concepts, definitions and synonyms**

To provide this multi-disciplinary paper with a significant survey value and make it self-contained, a variety of related works, concepts and definitions collected from AI, MAL, CV and RS literature are reviewed in this section.

#### **2.1 Inference mechanisms: Deductive top-down coarse-to-fine physical models and inductive bottom-up fine-to-coarse statistical models**

From classical philosophy through to MAL, it is well known that the general notion of inference (learning) comprises two types of learning mechanisms.



As output, statistical and physical quantitative models of the (3-D) world (e.g., quantitative models of land surfaces observed from space) generate either *continuous sub-symbolic variables* (e.g., LAI) or *discrete symbolic (categorical) variables* (e.g., land cover types).

In addition to the synonyms presented above, the following terms are considered synonyms in the rest of this paper (Matsuyama & Shang-Shouq Hwang, 1990; Shunlin Liang, 2004).


In RS data applications, quantitative models are traditionally sorted into three major categories: *statistical*, *physical* and *hybrid*, whose main advantages and limitations are so well known in existing literature as to be summarized by Shunlin Liang in the following few words (Shunlin Liang, 2004).

a. Statistical models are inductive data learning systems (refer to this text above). Therefore, they are inherently difficult to solve (ill-posed) and their solution requires *a priori* knowledge in addition to data (Cherkassky & Mulier, 2006). Statistical pattern recognition systems are based on *correlation relationships* between objective sensory data (e.g., RS imagery) and either continuous (e.g., LAI) or categorical (e.g., land surface) variables. Statistical models are easy to develop, e.g., a human expert is not required to search for an explicit deterministic function, if any, between, say, a target physical variable (e.g., LAI) and sensory data. However, they are effective for summarizing local data exclusively, i.e., they are usually (always?) site-specific (Shunlin Liang, 2004). For example, in RS common practice no machine capable of learning from either unlabeled or labeled data scores high in operational contexts such as satellite image mapping at national/continental/global scale. As a proof of this concept, in (Chengquan Huang et al., 2008), time-consuming SVM (Bruzzone & Carlin, 2006) training and classification model selection strategies are enforced for every RS image in a world-wide image mosaic. In addition, supervised data learning algorithms, either context-insensitive (e.g., pixel-based) or context-sensitive (e.g., (2-D) object-based (Definiens Imaging GmbH, 2004; Esch et al., 2008)), require the collection of reference training samples which are typically scene-specific, expensive, tedious, difficult or impossible to collect (Gutman et al., 2004). This means that in practical RS data applications where supervised data learning algorithms are employed, the cost, timeliness, quality and availability of adequate reference (training/testing) datasets derived from field sites, existing maps and tabular data have turned out to be the most limiting factors on RS data product generation and validation (Gutman et al., 2004).
Finally, since statistical models are inherently ill-posed, they are difficult to maintain, adapt, modify and scale according to changing input data sets, sensor specifications and/or user requirements. For example, the free parameter selection phase of any image segmentation algorithm tends to be difficult because: (i) it is based on heuristic (empirical) criteria (correlation relationships) and (ii) due to its inherent ill-posedness (artificial insufficiency (Matsuyama & Shang-Shouq Hwang, 1990)), any image segmentation algorithm is site-specific and simultaneously affected by both omission and commission segmentation errors within each image at hand (Burr & Morrone, 1992; Corcoran & Winstanley, 2007; Corcoran et al., 2010; Delves et al., 1992; Hay & Castilla, 2006; Matsuyama & Shang-Shouq Hwang, 1990; Petrou & Sevilla, 2006; Vecera & Farah, 1997).

b. Physical models consist of prior knowledge concerning the physical laws of the (3-D) world, which is available before looking at the objective sensory data at hand. They follow the physical laws of the real (3-D) world to establish *cause-effect relationships*. They have to be learnt by a human expert based on intuition, expertise and evidence from data observation. Thus, unfortunately, it takes a long time for human experts to learn the physical laws of the real (3-D) world and tune physical models (Mather, 1994; Shunlin Liang, 2004). On the other hand, physical models are more intuitive to debug, maintain and modify than statistical models. In other words, if the initial physical model does not perform well, then the system developer knows exactly where to improve it by incorporating the latest knowledge and information. For example, with a non-adaptive decision-tree classifier it is easy to find the node of the decision process in which a misclassification error occurs. In practice, a non-adaptive decision-tree classifier is well-posed (i.e., every data sample is assigned a semantic label according to a specific rule set), but subjective (i.e., different system developers may generate different non-adaptive decision-tree classifiers in the same application domain), as discussed above.

c. Hybrid models combine both statistical and physical models to take advantage of the unique features of each and overcome their shortcomings (refer to the two previous paragraphs) (Matsuyama & Shang-Shouq Hwang, 1990; Shunlin Liang, 2004).
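The two properties attributed above to a non-adaptive decision-tree classifier (every sample receives a label, and the node responsible for a given decision is easy to locate) can be illustrated with a minimal sketch. The input features (`ndvi`, `brightness`), the thresholds and the class names are hypothetical and chosen for illustration only; they are not taken from the cited works.

```python
from typing import NamedTuple


class Decision(NamedTuple):
    label: str  # semantic label assigned to the sample
    node: str   # identifier of the rule (tree node) that fired


# Hypothetical thresholds, for illustration only. In a physical model they
# would be fixed by a human expert from prior knowledge, not learned from data.
NDVI_VEGETATION = 0.4
NDVI_WATER = 0.0
BRIGHT_SURFACE = 0.3


def classify(ndvi: float, brightness: float) -> Decision:
    """Fixed (non-adaptive) decision tree: no parameter is estimated from data."""
    if ndvi >= NDVI_VEGETATION:
        return Decision("vegetation", "N1: ndvi >= 0.4")
    if ndvi < NDVI_WATER:
        return Decision("water", "N2: ndvi < 0.0")
    if brightness >= BRIGHT_SURFACE:
        return Decision("bare soil or built-up", "N3: brightness >= 0.3")
    # Fall-through rule: guarantees that every sample is assigned a label.
    return Decision("unknown dark surface", "N4: fall-through")


if __name__ == "__main__":
    for sample in [(0.6, 0.2), (-0.1, 0.5), (0.1, 0.5), (0.1, 0.1)]:
        print(sample, classify(*sample))
```

Because each returned decision carries the identifier of the rule that produced it, a misclassified sample points directly to the node to be revised. A different developer could reasonably pick other thresholds or rule orderings, which is the subjectivity mentioned above.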

#### **2.2 Brief history of AI/MAI and Cybernetics/MAL**


In every ML textbook and on the World Wide Web it is easy to find historical information on the multiple rises and falls of expectations and achievements in scientific disciplines such as Cybernetics/MAL and AI/MAI, related to the inductive and deductive inference paradigms respectively (refer to Part I Section 2.1).

#### **2.2.1 1940s, 1950s and 1980s: Bottom-up inductive Cybernetics/MAL**

In the 1940s and 1950s, a number of researchers, mostly located at Princeton University and the Ratio Club in England, started exploring the connection between neurology and information theory to develop electronic networks capable of exhibiting rudimentary intelligence conceived as self-organizing network properties. This new scientific discipline, called Cybernetics, investigates the capability of complex distributed processing systems, consisting of multiple processing elements (agents) dynamically interacting in multiple ways based on simple local rules, to display emergent macro behaviors and persistent network structures from an input data flow, i.e., local rules lead to global network properties. For example, data regularities detected by a self-organizing network of processing elements are equivalent to a compression of input information with which the distributed system can provide an abstract representation of the external environment.

The key features of complex network systems adaptive to data are that: (i) to understand how it works, a self-organizing network must be run (learning by doing), which is to say that learning, intended as self-organizing network capability, emerges without anyone needing to define what learning and intelligence are all about, (ii) the global behavior outlasts any of the network processing elements (persistence of the whole over time), (iii) it is the competition among processing elements and their (lateral) connections which leads to the emergence of specialized network (sub-)structures; without competition all processing units would behave alike and no specializations of the units would evolve (Fritzke, 1997; Lawley, 2003; Martinetz & Schulten, 1994).

By the late 1950s, in spite of the low technological development of electronic devices, electronic networks such as W. Grey Walter's turtles and the Johns Hopkins Beast were considered eligible for proving the cybernetic concepts. However, in the 1960s symbolic AI approaches achieved great success at simulating high-level thinking in small demonstration programs. As a consequence, approaches based on cybernetics were abandoned or pushed into the background.

Next, by the 1980s progress in symbolic AI seemed to stall. Many researchers started believing that symbolic systems would never be able to imitate all the processes of human cognition, such as perception, learning and pattern recognition. Again, a number of researchers looked for a "sub-symbolic" distributed approach capable of solving specific AI sub-problems. The basic idea was: "Why trouble oneself trying to grasp the principles of intelligence? Let us give the machine the chance to find (in a bottom-up approach) the best way to mimic intelligence" (Diamant, 2010b). In the middle 1980s interest in "connectionism" in general and so-called artificial neural networks in particular was revived by the works of David Rumelhart and others who focused on Multi-Layer Perceptrons (MLPs) and their Back-Propagation (BP) parameter adaptation algorithm. These and other distributed processing approaches, such as fuzzy learning systems and evolutionary computation, are now studied collectively by the emerging discipline of MAL (also called computational intelligence).

Finally, from the 1990s to date, MAL has achieved its greatest successes due to a combination of factors: the increasing computational power and memory capacity of computers, a greater emphasis on solving specific "tractable" MAL sub-problems and a new commitment by researchers to solid mathematical/statistical methods (Alpaydin, 2010; Bishop, 1995; Cherkassky & Mulier, 2006; Duda et al., 2001; Mitchell, 1997). In practice, once its first idealistic objective failed, MAL has been "broken into pieces, disintegrated and fragmented into many partial tasks and goals" to make its problem domain more "tractable" (Diamant, 2010b).

#### **2.2.2 1956-1974, 1980s to date: Top-down deductive AI/MAI**

Starting from the seminal work of Turing in 1950, the origin of AI dates back to the summer of 1956 when a conference on the campus of Dartmouth College was attended by John McCarthy, Marvin Minsky, Allen Newell and Herbert Simon who became the leaders of AI research for many decades. John McCarthy, who coined the term in 1956, defines AI as "the science and engineering of making intelligent machines" (Diamant, 2010b).

Intelligent agents must be able to set goals and achieve them by making choices that maximize the utility (or "value") of the available choices. To be termed intelligent these agents must be able to make predictions about how their actions will affect the present status of the world. This means they need a way to represent the current status of the world, to make predictions about the world's future status as a consequence of their actions, to have a periodical check to see if the world status matches their predictions and to change their plan as this becomes necessary, thus requiring the agent to reason under uncertainty.
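As a minimal sketch of the utility-maximizing choice described above, the agent's world model can be reduced to a table mapping each action to its possible outcomes, each with a probability and a utility. The action names and numbers below are hypothetical and serve only as an illustration; the periodic check and re-planning steps are not modeled.

```python
from typing import Dict, List, Tuple

# One possible outcome of an action: (probability, utility).
Outcome = Tuple[float, float]


def expected_utility(outcomes: List[Outcome]) -> float:
    """Probability-weighted sum of the utilities of the predicted outcomes."""
    return sum(p * u for p, u in outcomes)


def choose_action(model: Dict[str, List[Outcome]]) -> str:
    """Pick the action whose predicted outcomes have the highest expected utility."""
    return max(model, key=lambda action: expected_utility(model[action]))


# Hypothetical world model: predicted outcomes of each available action.
world_model = {
    "wait": [(1.0, 0.0)],
    "move": [(0.8, 1.0), (0.2, -2.0)],
}

if __name__ == "__main__":
    print(choose_action(world_model))
```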

Back in 1956, excitement and hopes of reaching AI goals in a short time were quite high. Herbert Simon predicted that "machines will be capable, within twenty years, of doing any work a man can do" (Diamant, 2010b). Marvin Minsky agreed, writing that "within a generation ... the problem of creating 'artificial intelligence' will substantially be solved". As reported by Diamant (Diamant, 2010b), Steve Grand said that "Rodney Brooks has a copy of a memo from Marvin Minsky in which he suggested charging an undergraduate for a summer project with the task of solving vision. I don't know where that undergraduate is now, but I guess he hasn't finished yet".

Many of the cognitive problems AI was expected to solve require extensive prior knowledge of the (3-D) world. A representation of "what exists in the (3-D) world" pertaining to the cognitive problem at hand is called a *world model* (Matsuyama & Shang-Shouq Hwang, 1990) or *ontology* (borrowing a word from traditional philosophy). The graphical representation and implementation of an ontology is twofold.

- An *inverted tree* whose leaves are at the bottom level (layer 0), where semantic primitives (hereafter called *semi-concepts*) are found (Diamant, 2005; Diamant, 2010a; Diamant, 2010b; Diamant, 2008).

- A *semantic net* (*concept net*) is defined as a graph, either directed or undirected, either cyclic or acyclic, consisting of nodes linked by edges. Nodes represent concepts, i.e., classes of (3-D) objects in the world (see Part I Section 2.1), while edges represent relations, e.g., PART-OF, A-KIND-OF, spatial relations either topological (e.g., adjacency, inclusion) or non-topological (e.g., distance, angle), temporal transitions between nodes, physical model-based relationships between causes and effects, etc. (Hudelot et al., 2008; Matsuyama & Shang-Shouq Hwang, 1990; Pakzad et al., 1999).
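The semantic net described above can be sketched as a small labeled graph. The class name `SemanticNet`, its methods and the example concepts are assumptions introduced here for illustration; only the relation names PART-OF and A-KIND-OF and the notion of adjacency come from the text.

```python
from collections import defaultdict


class SemanticNet:
    """Concepts are nodes; each edge carries a relation name such as PART-OF."""

    def __init__(self):
        # (source concept, relation) -> set of target concepts
        self._edges = defaultdict(set)

    def add(self, source, relation, target, symmetric=False):
        self._edges[(source, relation)].add(target)
        if symmetric:  # e.g. adjacency holds in both directions
            self._edges[(target, relation)].add(source)

    def targets(self, source, relation):
        """Concepts directly linked to `source` by `relation`."""
        return set(self._edges.get((source, relation), ()))

    def closure(self, source, relation):
        """All concepts reachable by following `relation`; safe on cyclic graphs."""
        seen, stack = set(), [source]
        while stack:
            node = stack.pop()
            for nxt in self._edges.get((node, relation), ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen


# Hypothetical example concepts, for illustration only.
net = SemanticNet()
net.add("roof", "PART-OF", "building")
net.add("building", "A-KIND-OF", "man-made object")
net.add("man-made object", "A-KIND-OF", "land cover object")
net.add("building", "ADJACENT-TO", "road", symmetric=True)

if __name__ == "__main__":
    print(net.closure("building", "A-KIND-OF"))
    print(net.targets("road", "ADJACENT-TO"))
```

Every edge in such a structure has to be entered by hand, one concept and one relation at a time.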

Unfortunately, the number of atomic facts about the world that an average person knows is astronomical. This means that AI projects whose goal is to build a complete knowledge base of commonsense knowledge would require enormous amounts of laborious ontological engineering, where abstract concepts must be built by hand, one at a time. In practice, it takes a long time for human experts to define ontologies, learn the physical laws of the real (3-D) world and tune physical models based on human intuition, domain expertise and evidence from data observation. Within a decade or so it became clear that AI problems were immense, maybe even intractable. In 1974, in response to ongoing criticism and pressure to fund more productive projects, the U.S. and British governments cut off all exploratory research related to AI.

However, in the 1970s, computers with large memories became available. This drove AI researchers to begin building prior knowledge into AI problem-specific "tractable" applications. In the early 1980s this led to the first commercial success of expert systems, a form of AI program that simulated the knowledge base and analytical skills of human experts. By 1985 the market for AI reached over a billion dollars. At the same time, Japan's fifth generation computer project inspired the U.S. and British governments to restore funding for academic research in the AI field. However, beginning with the collapse of the Lisp Machine market in 1987, AI once again fell into disrepute and a second, longer lasting, AI winter began.

Finally, from the 1990s to date, AI achieved its greatest successes, albeit somewhat behind the scenes. This success was due to a combination of factors, which are not surprisingly the same as those working in favor of the recent achievements of MAL (also refer to Part I Section 2.2.1), namely: the increasing computational power and memory capacity of computers, a greater emphasis on solving specific "tractable" AI sub-problems, a new commitment by researchers to solid mathematical/statistical methods and more rigorous scientific standards (Alpaydin, 2010; Bishop, 1995; Cherkassky & Mulier, 2006; Duda et al., 2001; Mitchell, 1997), and the creation of new ties between AI and other fields working on similar problems, such as MAL, knowledge representation (e.g., fuzzy logic) and uncertainty engineering (e.g., sensitivity analysis, error propagation). For example, a major goal of contemporary AI is to have the computer understand enough concepts to be able to learn by reading from sources like the internet, and thus be able to add to its own ontology. This is called Natural Language Processing, which gives machines the ability to read and understand the languages that humans speak.

Among the longest-standing AI questions that have remained unanswered, consider the following.

- Should AI simulate natural intelligence by studying psychology or neurology? Or is human biology as irrelevant to AI research as bird biology is to aeronautical engineering?

- In the attempt to develop hybrid inference systems where both statistical and physical models are combined to overcome their shortcomings (see Section 2.1), how, when and where do continuous sensory objective sub-symbolic data become discrete symbolic subjective information? This is the well-known *information gap* existing between (sub-symbolic, sensory, instantaneous, numerical, quantitative, absolute, non-semantic) sensations and (symbolic, linguistic, qualitative, vague, discrete and semantic, persistent, stable) percepts (refer to Part I Section 2.1), which has been thoroughly investigated in both philosophy and psychophysical studies of perception (Matsuyama & Shang-Shouq Hwang, 1990). In practice, "we are always seeing objects we have never seen before at the sensation level, while we perceive familiar objects everywhere at the perception level" (Matsuyama & Shang-Shouq Hwang, 1990).

#### **2.2.3 Fundamental flaws responsible for AI and MAL derailment: The Diamant perspective**

When did AI and MAL derail from their original and ambitious goals? Diamant's answer is that they did so right at their origin, dating back to the late 1950s (refer to Part I Section 2.2.2 and Part I Section 2.2.1, respectively), due to the following fundamental flaws (Diamant, 2010b).

a. The lack of proper definitions to distinguish between objective data, physical information, semantic information, knowledge and intelligence. These definitions deal with the well-known *information gap* between physical and semantic information thoroughly investigated in both philosophy and psychophysical studies of perception (see Part I Section 2.2.2). In Diamant's words: "In my view, philosophy is not a swearword. Philosophy is a keen attempt to approach the problem from a more general standpoint, to see the problem from a wider perspective, and to yield, in such a way, a better comprehension of the problem's specificity and its interaction with other world realities. Otherwise we are ... prone to dead-ends and local traps" (Diamant, 2010b).

b. Misunderstanding of the very nature of semantic information. Unlike physical information, semantics is not a property of the raw data, but the property of an external observer who observes and scrutinizes the data. Since semantics is assigned to physical data structures by an external observer, it cannot be learned from the sensory data.

The Diamant explanations of these concepts are quoted below (Diamant, 2005; Diamant, 2008; Diamant, 2010a; Diamant, 2010b).

#### **2.2.3.1 Kolmogorov's complexity theory**

Among definitions of "data", "information", and "knowledge", the definition of information is the most controversial. To provide it, Diamant relies on Kolmogorov's complexity theory (actually developed independently by Kolmogorov, Chaitin, and Solomonoff), whose concern is: What is the best way to represent a single data object? What are the laws of minimizing the length of a description of a single data object? Such a short-length compressed description is the information that we are seeking about a particular data object.

Theoretically two extreme cases can be distinguished: (1) the elements of a data set are absolutely random and (2) the elements of a data set form "observable" data structures. In the first case the data set can be represented only by the original sequence of its data elements. In the second case the presence of observable data structures consisting of data elements can be taken into account, which leads to a more compact and concise (compressed) description. In terms of Kolmogorov's theory, this compressed description (encoding) must be a trustworthy (which does not mean lossless) abstract (summary) of the original data set such that: (i) the abstract description length is definitely shorter than the original uncompressed data set description, and (ii) the abstract description is sufficient to reconstruct (reproduce, re-establish, decode) the salient properties or regularities or distinguishable data structures or data objects in the original data set.

Kolmogorov's theory prescribes the way in which a data set description has to be created: Firstly, the most simplified and generalized data structures must be described. (Recall the Occam's Razor principle: Among all hypotheses consistent with the observation, choose the simplest one that is coherent with the data (Mitchell, 1997)). Then, as the level of generalization (vice versa, granularity) is gradually decreased (vice versa, becomes finer), more and more fine-grained data details (structures) can be revealed and described.
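The two extreme cases above (random elements versus observable data structures) can be illustrated with a general-purpose compressor. Kolmogorov complexity itself is not computable, so the sketch below uses the length of a `zlib`-compressed byte string as a rough, computable upper bound on description length. This substitution is an assumption made here for illustration, and `zlib` is lossless, whereas the description discussed in the text need not be.

```python
import random
import zlib


def compression_ratio(data: bytes) -> float:
    """Compressed length divided by original length (lower means more structure)."""
    return len(zlib.compress(data, 9)) / len(data)


# Case (2): a data set made of an obvious repeating structure.
structured = b"ABCD" * 2500

# Case (1): pseudo-random bytes, used as a stand-in for "absolutely random" data.
rng = random.Random(0)
noise = bytes(rng.getrandbits(8) for _ in range(len(structured)))

if __name__ == "__main__":
    print(f"structured: {compression_ratio(structured):.3f}")
    print(f"noise:      {compression_ratio(noise):.3f}")
```

The structured sequence admits a description far shorter than itself, while the pseudo-random sequence is not shortened by the compressor.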

#### **2.2.3.2 Diamant's definitions of objective data, physical information, semantic information, knowledge and intelligence**

Diamant reviews two survey papers (Legg & Hutter, 2007; Zins, 2007), published in the year 2007, where definitions of data, information, knowledge and MAI are collected from existing literature for comparison purposes. In (Zins, 2007), 130 definitions of data, information and knowledge are provided by 45 scholars. In (Legg & Hutter, 2007), more than 70 definitions of MAI are collected. According to Diamant, "what these two collections undoubtedly exhibit... is that definitions offered by the leading scholars in each field have nothing in common among them, and therefore are of little use when it comes to our practical problem-solving" (Diamant, 2010b). As a result, Diamant is forced to search for his own definitions.

Starting from the Kolmogorov complexity theory (see Section 2.2.3.1), Diamant provides the following definitions about data, information and knowledge.

a. "By physical information we mean the description of data structures that are discernable in a data set" (Diamant, 2010b). (Notably,) "successful recovery and description of image structures (e.g., successful image segmentation) does not lead to image understanding. The (data) structures that are observed in an image reflect aggregations of nearby data elements on the basis of similarity among their physical attributes (e.g., color or brightness in visual signals, frequency and intensity in audio signals). These (are called) 'primary (data) structures' or 'physical (data) structures'" (Diamant, 2010a). "Physical information, being a natural property of the data, can be extracted instantly from the data and no special rule is needed for such a task accomplishment" (Diamant, 2010b). (It is) "physical information... the only information present in an image, and therefore the only information that can be extracted from an image" (Diamant, 2008). (In other words,) "defining (primary data structures) is certainly a well-grounded procedure that does not raise any objections, because objective (physical) nature laws underpin such a procedure" (Diamant, 2010a) (refer to point 4. above).

To summarize, according to Diamant, *physical information*, non-semantic *primary data structures* and discernable non-semantic *image segments* are synonyms.

b. "By semantic information we mean the description of the relationships that may exist between the physical (data) structures of a given data set" (Diamant, 2010b). (In other words,) "'primary (data) structures'... undergo a further grouping and aggregation, which leads to formation of 'secondary (data) structures' (consisting of primary data structures) that can be called... 'semantic (data) structures'" (Diamant, 2010a). "Unlike physical information, semantics is not a property of the raw data. Semantics is assigned to physical data structures by an external observer who watches and scrutinizes the data... Semantics is a shared convention, a mutual agreement between the members of a particular group of viewers or users. Its assignment (to the primary data structures) has to be made on the basis of a consensus knowledge that is shared among the group members, and which an artificial semantic-processing system has to possess at its disposal... Therefore semantics cannot be learned straightforwardly from the raw data" (Diamant, 2010b). (In other words,) "the knowledge about the rules that underpin secondary (data) structures formation is a property of human observers and not an inherent property of the data" (Diamant, 2010a). (Since) "semantic information is a convention, an agreement, a property shared between a company of particular observers, it cannot be learned (from physical data) by any means. It can be exchanged, transferred, relocated between the group members, or between humans and intelligent machines (robots) collaborating with them in a working group, but it cannot be learned (from data)" (Diamant, 2010b). (This implies that) "MAL techniques are ... not applicable for the purposes of semantic information extraction (from the raw data set)... (Acquisition) of this knowledge presumes availability of a different and usually overlooked special learning technique, which would be best defined as Machine Teaching (MAT) – a technique that would facilitate externally-prepared-knowledge transfer to the system's disposal" (Diamant, 2010b).

To summarize, according to Diamant, *semantic information* and semantic *secondary data structures*, generated from subjective aggregation (semantic labeling) of non-semantic primary data structures, e.g., *image segments*, by an external observer, are synonyms. In addition, what Diamant calls MAT is known in traditional AI as knowledge engineering, which is the process of codifying human knowledge into an expert system (Laurini and Thompson, 1992).

6. "Both physical and semantic information descriptions are similar in that: (1) they are character strings, (2) they are top-down coarse-to-fine hierarchies, and (3) they are implemented according to a certain vocabulary/language. There is only a small difference – physical information can be described in a variety of languages while semantic information can be represented only in a human natural language... Therefore the most suitable form of semantic information representation should be a narrative, a story, a tale. The usual top-down hierarchical structure of such a story (a narrative, a tale) is well known from other linguistic studies. Moving top-down, a story comprises a story title, abstract, chapter or section partition, paragraph subdivision, separate phrases and sentences which end up with single words (congregations of letters) actually composing a phrase. Further structural descent leads in linguistics to syntaxes. But in our case – the lowest level of a semantic structure is stuffed with physical information which represents the physical structure of a meaningful object designated by the word in a phrase... At the lowest level of a semantic description (hierarchy) a physical information sub-hierarchy is always present" (Diamant, 2010a).

To summarize, according to Diamant, *semantic information* comprises *physical information* at the lowest level of a semantic description (hierarchy), which is equivalent to an inverted tree (see Part I Section 2.2.2).


Together with the aforementioned theoretical considerations, Diamant presents an unlabeled (unsupervised) multi-scale image segmentation algorithm and a single-scale unlabeled (unsupervised) image contour detector as proofs of his concepts (Diamant, 2005). A critical analysis of these theoretical and algorithmic contributions by Diamant can be found in Part II Section 3.

#### **2.3 Vision as an ill-posed image understanding problem**

The main role of a biological or artificial visual system is to backproject the information in the (2-D) image domain to that in the (3-D) scene domain (Matsuyama & Shang-Shouq Hwang, 1990). In greater detail, the goal of a visual system is to provide plausible (multiple) symbolic description(s) of the scene depicted in an image by finding associations between sub-symbolic (non-semantic, sensory, instantaneous, numerical, absolute, quantitative, varying, objective, see Part I Section 2.1) (2-D) image features or sensations and symbolic (semantic, subjective, linguistic, qualitative, vague, abstract, persistent, stable, see Part I Section 2.1) (3-D) objects (concepts or percepts) in the scene (e.g., a building, a road, etc.). Sub-symbolic (2-D) image features are either points or regions or, vice versa, region boundaries, i.e., edges, provided with no semantic meaning. In the literature, (2-D) image regions are also called *segments*, *(2-D) objects*, *patches*, *parcels*, or *blobs* (Carson et al., 1997; Lindeberg, 1993; Yang & Wang, 2007).

There is a well-known *information gap* between symbolic information in the (3-D) scene and sub-symbolic information in the (2-D) image, e.g., due to dimensionality reduction and occlusion phenomena, see Fig. 2 (also refer to Part I Section 2.2.2 and Part I Section 2.2.3). This is called the *intrinsic insufficiency* of image features (Matsuyama & Shang-Shouq Hwang, 1990). This information gap is also related to the inherent ill-posedness of inductive inference (see Part I Section 2.1). It means that the problem of image understanding is inherently ill-posed and, consequently, very difficult to solve (Matsuyama & Shang-Shouq Hwang, 1990; Cherkassky & Mulier, 2006).

Fig. 2. Inherently ill-posed image understanding problem (vision). There is a well-known information gap between physical information and semantic information. This is the same information gap existing between (sub-symbolic, sensory, instantaneous, numerical, quantitative, absolute, non-semantic) sensations and (symbolic, linguistic, qualitative, vague, discrete and semantic, persistent, stable) percepts (concepts) which has been thoroughly investigated in both philosophy and psychophysical studies of perception. In practice, "we are always seeing objects we have never seen before at the sensation level, while we perceive familiar objects everywhere at the perception level" (Matsuyama & Shang-Shouq Hwang, 1990). The original automatic SIAM™ software button (executable), adopted as preliminary classification first stage of a novel two-stage stratified hierarchical hybrid RS-IUS architecture (see Part II, Section 2), generates as output a mutually exclusive and totally exhaustive set of symbolic spectral-based semi-concepts, also called spectral categories or land cover class sets, e.g., 'vegetation' (Baraldi et al., 2006a; Baraldi et al., 2010a; Baraldi et al., 2010b; Baraldi et al., 2010c; Baraldi, 2011a; Baraldi, 2011b). The semantic meaning of a spectral-based semi-concept is: (a) superior to zero, which is the semantic value of traditional sub-symbolic image features, namely, pixels, (2-D) image segments or edges, and (b) equal or inferior to the semantic meaning of target (3-D) land cover classes (e.g., needle-leaf forest), also called concepts or (3-D) object-models in the (3-D) world.

The aforementioned information gap coincides with the well-known *information gap* existing between (sub-symbolic, sensory, quantitative, objective, varying) sensations and (symbolic, semantic, qualitative, subjective, stable) percepts, traditionally investigated in both philosophy and psychophysical studies of perception (Matsuyama & Shang-Shouq Hwang, 1990) (see Part I Section 2.2.2).

In functional terms, biological vision combines preattentive (low-level) visual perception with an attentive (high-level) vision mechanism (Gouras, 1991; Kandel, 1991; Mason & Kandel, 1991).

*Fig. 2 (diagram labels recovered from the figure): (3-D) World; Imaging sensor; (2-D) Image; Image understanding system (inducer, e.g., human visual system); Information gap (between physical and semantic information); Physical information/continuous variables: sensory, quantitative, sub-symbolic (non-semantic), objective, but varying sensations; Semantic information/categorical variables: qualitative, symbolic (semantic), subjective, abstract, vague, but stable percepts: (3-D) object-models (concepts); Context-insensitive (color); Context-sensitive (e.g., texture, geometry, morphology); 1. Sub-symbolic (2-D) image features: points and regions/region boundaries; 2. Color-based (2-D) semi-concept; 3. Plausible symbolic structural description(s) of the (3-D) scene.*


Finally, it is worth mentioning that, according to Marr, "vision goes symbolic almost immediately, right at the level of zero-crossing (primal sketch)... without loss of information" (Marr, 1982) (p. 343). In practice, Marr suggests the following.

(a) Vision goes symbolic within the preattentive vision phase.

(b) The primal sketch is a preliminary semantic map whose symbolic labels belong to a finite and discrete set of 3-D object-classes or concepts in the real (3-D) world.

It is also noteworthy that, in contradiction with his own intuition about the functional properties of a biological vision system, the CV system implemented by Marr accomplishes neither of the two aforementioned goals (a) and (b). For example, the Marr pre-attentive vision module consists of a contour detector (zero-crossing) whose output is a sub-symbolic primal sketch. This is not at all surprising. It reflects, in general, the customary distinction between a model and the algorithm used to identify it (Baraldi et al., 2010a; Baraldi, 2011a) (also refer to Part I Section 2.6) and, in particular, the seminal nature of the conceptual work by Marr, which was cut short by his early death.

#### **2.4 A few comments about the inherent ill-posedness of inductive MAL from either labeled or unlabeled data**

Inductive machine learning from either labeled or unlabeled data (see Fig. 1) has been central to MAL research from the beginning. In particular, "induction amounts to forming generalizations from particular true facts. This is an inherently difficult (ill-posed) problem and its solution requires a priori knowledge in addition to data" (Cherkassky & Mulier, 2006) (p. 39), to make the ill-posed inductive learning-from-data problem better posed (see Part I Section 2.1). Unfortunately, although it is acknowledged by a significant portion of the existing literature, the inherent ill-posedness of inductive MAL from either labeled or unlabeled data appears to be ignored or neglected by the majority of scientists and practitioners involved with MAL common practice.

#### **2.4.1 Inherently ill-posed unlabeled data learning**

Unlabeled (unsupervised) data learning is the ability to find discrete patterns or sub-symbolic labeled data structures in an input stream of unlabeled data vectors. Well-known examples of discrete sub-symbolic data structures distinguishable in a stream of unlabeled data vectors are: (a) discrete sub-symbolic clusters (e.g., cluster 1, cluster 2, etc.) in a finite unlabeled data set belonging to a multi-dimensional measurement space and (b) discrete sub-symbolic (2-D) image segments (e.g., segment 1, segment 2, etc.) found in a 2-D one-band (e.g., panchromatic) or multi-band (chromatic) image domain (see Fig. 1).

Inherently ill-posed unlabeled data clustering and image segmentation are further discussed below.

#### **2.4.1.1 Inherently ill-posed unlabeled data clustering**

Since the goal of clustering is to group the data at hand rather than to provide an accurate characterization of unobserved (future) samples generated from the same probability distribution, the task of clustering may fall outside the framework of predictive learning (Cherkassky & Mulier, 2006). In spite of this, clustering analysis often employs unsupervised data learning approaches originally developed for vector quantization, which is a predictive learning problem, see Fig. 1 (Cherkassky & Mulier, 2006). A well-known example is the k-means unsupervised data learning algorithm, which belongs to the family of crisp competitive minimum-distance-to-means algorithms (Baraldi & Blonda, 1999a; Baraldi & Blonda, 1999b).

Unlabeled data clustering is an inherently ill-posed data mapping problem. In fact, the goal of clustering is to separate a finite unlabeled dataset at hand into a finite and discrete set of "natural", hidden data structures on the basis of an often subjectively chosen measure of similarity/dissimilarity, i.e., a similarity measure chosen subjectively based on its ability to create "interesting" clusters (Backer & Jain, 1981; Baraldi & Alpaydin, 2002a; Baraldi & Alpaydin, 2002b; Cherkassky & Mulier, 2006; Fritzke, 1997). Thus, the subjective (ill-posed) nature of the non-predictive data clustering problem precludes an absolute judgment as to the relative effectiveness of all clustering techniques (Backer & Jain, 1981). In spite of this, the inherent ill-posedness of unlabeled data clustering problems is not clearly stated in the existing literature where, as a consequence, dozens of papers proposing alternative clustering algorithms are published every year (perhaps in search of a "final" best clustering algorithm which cannot exist…) (Xu & Wunsch II, 2005).

Crisp (hard) competitive minimum-distance-to-means algorithms, such as the k-means data quantization approach, try to minimize a sum-of-squares error function (Cherkassky & Mulier, 2006; Bishop, 1995). To reduce the risk of being trapped in a local minimum of the error function, soft-to-hard rather than hard competitive clustering algorithms have been conceived (Baraldi & Blonda, 1999a; Baraldi & Blonda, 1999b). In addition, it is well known that both crisp and fuzzy k-means data clustering algorithms cannot perform well with non-convex types of data, i.e., they are effective if and only if data clusters are hyperspherical (Duda et al., 2001). When a k-means unsupervised data learning algorithm capable of automatically defining the number of clusters is applied to a non-convex data cluster, say, a data cluster shaped like a banana, it splits that cluster into several hyperspheres. These hyperspheres should then be linked to map the banana-like data cluster. To perform non-convex unlabeled data mapping, topologically preserving data clustering algorithms have been developed (Baraldi & Alpaydin, 2002a; Baraldi & Alpaydin, 2002b; Fritzke, 1997; Martinetz & Schulten, 1994).
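The behaviour of a crisp minimum-distance-to-means quantizer can be illustrated with a minimal sketch. It is not taken from the cited references: the function names (`assign`, `update`, `kmeans`), the random initialization and the two synthetic Gaussian blobs are assumptions chosen only for illustration. The sketch records the sum-of-squares error at every iteration. That error never increases, but the value finally reached depends on the initial centers and on the user-defined number of clusters `k`.

```python
import random


def assign(points, centers):
    """Crisp (hard) assignment: each point goes to its nearest center."""
    labels = []
    for point in points:
        dists = [sum((p - c) ** 2 for p, c in zip(point, center)) for center in centers]
        labels.append(dists.index(min(dists)))
    return labels


def update(points, labels, centers):
    """Move each center to the mean of its members; empty clusters stay put."""
    new_centers = []
    for k, center in enumerate(centers):
        members = [pt for pt, lab in zip(points, labels) if lab == k]
        if members:
            dim = len(center)
            new_centers.append(
                tuple(sum(m[d] for m in members) / len(members) for d in range(dim))
            )
        else:
            new_centers.append(center)
    return new_centers


def sse(points, centers, labels):
    """Sum-of-squares error between points and their assigned centers."""
    return sum(
        sum((p - c) ** 2 for p, c in zip(point, centers[label]))
        for point, label in zip(points, labels)
    )


def kmeans(points, k, seed=0, max_iter=100):
    """Batch k-means; k must be supplied by the user."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # assumption: initialize from random data points
    labels = assign(points, centers)
    history = []
    for _ in range(max_iter):
        history.append(sse(points, centers, labels))
        centers = update(points, labels, centers)
        new_labels = assign(points, centers)
        if new_labels == labels:  # converged, possibly to a local minimum
            break
        labels = new_labels
    history.append(sse(points, centers, labels))
    return centers, labels, history


# Synthetic, hypothetical data: two roughly hyperspherical blobs.
data_rng = random.Random(1)
blob_a = [(data_rng.gauss(0, 0.5), data_rng.gauss(0, 0.5)) for _ in range(50)]
blob_b = [(data_rng.gauss(5, 0.5), data_rng.gauss(5, 0.5)) for _ in range(50)]
points = blob_a + blob_b

centers, labels, history = kmeans(points, k=2)
print("sum-of-squares error per iteration:", history)
```

Changing `seed` or `k` in the call above is a simple way to observe how the outcome depends on choices that the data alone do not dictate.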


In terms of degree of automation, which decreases monotonically with the number of system free parameters to be user-defined, it is noteworthy that, to make the inherently ill-posed unsupervised data clustering problem better posed, every unsupervised data clustering algorithm requires at least one free parameter to be user-defined or fixed by the application developer based on heuristics. For example, it appears paradoxical that the well-known k-means vector quantizer, typically employed for unlabeled data clustering (refer to previous paragraphs), requires the user to pre-define the unknown number of unlabeled data clusters to be found in the finite unlabeled data set at hand.

In terms of computation time, unlabeled data clustering (either batch or on-line learning) is iterative (sub-optimal) in nature. Therefore, it is time-consuming compared with prior knowledge-based one-pass data mapping algorithms (e.g., pattern-matching techniques).

In terms of effectiveness and robustness to changes in the input dataset, on-line (stochastic, sequential) learning unlabeled data clustering algorithms are typically subject to local minima, e.g., they are sensitive to the order of presentation of the input data sequence. To enhance their robustness to changes in the order of presentation of the input sequence, semi-batch unlabeled data clustering algorithms have been developed (Wilson & Martinez, 2000).
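The sensitivity to presentation order can be shown with a small sketch of an on-line (sequential) winner-take-all update. It is an illustration under stated assumptions, not an algorithm from the cited references: the function name `online_kmeans`, the fixed learning rate of 0.5, the initial centers and the four one-dimensional samples are all hypothetical.

```python
def online_kmeans(stream, initial_centers, rate=0.5):
    """Sequential winner-take-all update: each sample moves only its nearest center."""
    centers = [list(c) for c in initial_centers]
    for x in stream:
        dists = [sum((xi - ci) ** 2 for xi, ci in zip(x, c)) for c in centers]
        winner = dists.index(min(dists))
        centers[winner] = [ci + rate * (xi - ci) for xi, ci in zip(x, centers[winner])]
    return [tuple(c) for c in centers]


# Hypothetical 1-D samples, presented in two different orders.
samples = [(0.0,), (1.0,), (9.0,), (10.0,)]
start = [(4.0,), (6.0,)]

result_forward = online_kmeans(samples, start)
result_reverse = online_kmeans(list(reversed(samples)), start)

print("forward order :", result_forward)
print("reverse order :", result_reverse)
```

The same samples and the same initial centers lead to different final centers when only the order changes, which is the kind of instability that semi-batch schemes aim to reduce.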

#### **2.4.1.2 Inherently ill-posed (2-D) image region extraction/contour detection**

In the literature, a so-called Low-Level Vision Expert (LLVE) (Matsuyama & Shang-Shouq Hwang, 1990) includes a battery of low-level sub-symbolic (non-semantic) general-purpose domain-independent inductive-learning (fine-to-coarse, bottom-up, driven-without-knowledge, see Part I Section 2.1) inherently ill-posed image processing (unlabeled data-driven) algorithms working at the signal level. This set of low-level image processing algorithms may comprise (Matsuyama & Shang-Shouq Hwang, 1990): edge-preserving noise filtering (Acton & Landis, 1997; Perona & Malik, 1990), either intensity- or color-based region/edge detection (Baraldi & Parmiggiani, 1996a; Canny, 1986), texture-based region/edge detection (Jain & Healey, 1998), region growing (Baraldi & Parmiggiani, 1996b), region extraction from non-closed contours (Baraldi & Parmiggiani, 1995), etc.

In a (2-D) image domain, **region extraction is the dual problem of edge detection** and they are both inherently ill-posed visual tasks. In the rest of this paper, for simplicity's sake, in line with (Matsuyama & Shang-Shouq Hwang, 1990), all the aforementioned image processing operators are called "**segmentation**" algorithms. As output, an image segmentation algorithm generates *image features*, namely *points* and *regions* (also called segments, [2-D] objects, parcels or blobs (Carson et al., 1997; Lindeberg, 1993; Yang & Wang, 2007), also refer to Part I Section 2.3) or, vice versa, *region boundaries*, i.e., *edges*, provided with no semantic meaning. In general, a sub-symbolic image segment is: (1) made of connected pixels considered homogeneous in color and/or texture based on: (i) a subjective measure of similarity/dissimilarity and (ii) a subjective decision rule (e.g., thresholding), and (2) provided with a non-semantic label equivalent to a numerical segment-based identifier (integer value).
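The definition of a sub-symbolic image segment given above can be made concrete with a minimal sketch. It is only an illustration, not one of the cited algorithms: the function name `segment`, the 4-connectivity, the absolute grey-level difference between neighboring pixels used as dissimilarity measure, and the tiny test image are all assumptions. Each segment is a set of connected pixels that pass a thresholding rule, and its label is a plain integer identifier.

```python
from collections import deque


def segment(image, threshold):
    """Region growing on a grey-level image given as a list of rows.

    Two 4-connected neighbors join the same segment when their absolute
    grey-level difference is <= threshold (a subjective decision rule).
    Returns a label map of integer identifiers and the number of segments.
    """
    rows, cols = len(image), len(image[0])
    labels = [[0] * cols for _ in range(rows)]  # 0 means "not yet labeled"
    next_id = 0
    for r in range(rows):
        for c in range(cols):
            if labels[r][c]:
                continue
            next_id += 1
            labels[r][c] = next_id
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (
                        0 <= ny < rows
                        and 0 <= nx < cols
                        and not labels[ny][nx]
                        and abs(image[ny][nx] - image[y][x]) <= threshold
                    ):
                        labels[ny][nx] = next_id
                        queue.append((ny, nx))
    return labels, next_id


# Hypothetical 3 x 4 grey-level image.
image = [
    [10, 11, 12, 50],
    [10, 12, 14, 52],
    [11, 13, 30, 51],
]

for threshold in (1, 2, 40):
    label_map, count = segment(image, threshold)
    print(f"threshold={threshold}: {count} segment(s)")
```

The same image yields a different partition for each threshold, and nothing in the pixel values indicates which partition is the correct one. The integer labels carry no meaning beyond identifying a segment.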

The inherent ill-posedness of any image segmentation algorithm is due to both systematic and accidental errors. The so-called *intrinsic insufficiency* of image segments is due to occlusion problems and dimensionality reduction (Matsuyama & Shang-Shouq Hwang, 1990) (refer to Part I Section 2.3). In addition, image segments are always affected by a so-called *artificial insufficiency* (Matsuyama & Shang-Shouq Hwang, 1990) due to the image segmentation algorithm at hand. This latter source of segmentation errors is related to the well-known *uncertainty principle* **according to which, for any contextual (neighborhood) property, we cannot simultaneously measure that property while obtaining accurate localization** (Corcoran & Winstanley, 2007; Petrou & Sevilla, 2006).

In practical contexts, the inherent ill-posedness of any image segmentation algorithm implies the following.

(a) In real-world image segmentation problems (other than toy problems), it is inevitable for erroneous segments to be detected while genuine segments are omitted (Matsuyama & Shang-Shouq Hwang, 1990) (p. 18).

(b) Any image segmentation algorithm must rely on free segmentation parameters to be user-defined based on subjective (heuristic, empirical) criteria on a site-specific basis (see Part I Section 2.1). As a consequence, any image segmentation algorithm can be considered difficult to use, i.e., its degree of automation is low, while its robustness to changes in the input data set and to changes in input parameters is also low.

To overcome these shortcomings, many researchers in the field of cognitive psychology believe that object segmentation cannot be achieved in a completely bottom-up manner, which is tantamount to saying that segmentation and classification are strongly coupled (Corcoran & Winstanley, 2007; Corcoran et al., 2010; Vecera & Farah, 1997). In particular, Vecera and Farah showed that the process of human visual segmentation can be strongly influenced by top-down human (subjective) factors such as prior knowledge of the image at hand in addition to desires and expectations of an external observer (Vecera & Farah, 1997).

To date, the inherent ill-posedness of any image region/boundary detection algorithm is acknowledged by a relevant portion of the CV and RS communities (Burr & Morrone, 1992; Corcoran & Winstanley, 2007; Corcoran et al., 2010; Delves et al., 1992; Hay & Castilla, 2006; Matsuyama & Shang-Shouq Hwang, 1990; Petrou & Sevilla, 2006; Vecera & Farah, 1997). For example, Castilla *et al*. observe that (Castilla et al., 2008): "Image understanding is a complex cognitive process for which we may still lack key concepts. In particular, most image segmentation methods have been developed heuristically without a deeper examination of the semantic implications of the segmentation process." Well-known image segmentation algorithms, including eCognition® by Definiens AG (Definiens Imaging GmbH, 2004), "... are conceptually inconsistent with the object-oriented approach (OOA)... an underlying hypothesis of any segmentation method is that there is a correspondence between radiometric similarity in the image and semantic similarity in the imaged landscape. Thus, it is expected that image objects (segments) coincide with landscape objects (patches)." Unfortunately, the Size-Constrained Region Merging (SCRM) algorithm proposed by Castilla *et al*. themselves is no exception to their own criticism, since its "correspondence between radiometric similarity and semantic similarity is not straightforward" (Castilla et al., 2008).

To summarize, according to Castilla *et al*., the conceptual framework of OBIA requires generation of symbolic image segments as output. This is the same claim made by cognitive psychology (see above) (Corcoran & Winstanley, 2007; Corcoran et al., 2010; Vecera & Farah, 1997). This also agrees with Marr's statement: "vision goes symbolic almost immediately, right at the level of zero-crossing (primal sketch)... without loss of information" (Marr, 1982) (p. 343), refer to Part I Section 2.3. As a consequence, if this conjecture holds, then existing commercial image segmentation algorithms, which are claimed to underpin the success of GEOBIA (Definiens Imaging GmbH, 2004; Esch et al., 2008), are actually at odds with the true conceptual framework of GEOBIA, which requires detection of semantic image segments (e.g., landscape objects or patches).

Unfortunately, in spite of the aforementioned contributions found in existing literature, most members of the CV and RS communities, including Diamant (Diamant, 2005; Diamant, 2008; Diamant, 2010a; Diamant, 2010b) (refer to Part I Section 2.5), appear to ignore the inherently ill-posed (subjective) nature of the image segmentation (region extraction/contour detection) problem. As a consequence, literally dozens of "novel" segmentation (region extraction/contour detection) algorithms are published each year (Zamperoni, 1996). For example, due to the availability of a commercial GEOBIA software developed by a German company (Definiens Imaging GmbH, 2004; Esch et al., 2008), OBIA approaches are currently considered the state-of-the-art in both scientific and commercial RS image mapping applications (Castilla et al., 2008; Hay & Castilla, 2006).

In commercial GEOBIA systems, to reduce the number of empirical segmentation parameters (Esch et al., 2008), a multi-scale (hierarchical) iterative segmentation first stage is employed (Definiens Imaging GmbH, 2004). As output, a hierarchical segmentation algorithm generates multi-scale segmentation solutions in the hope that the target image will appear correctly segmented at some scale. However, quantitative multi-scale assessment of segmentation quality indices requires ground truth data at each scale which are impossible or impractical to obtain in RS common practice (Corcoran & Winstanley, 2007). Therefore, the "best" segmentation map must be selected by the user on an *a posteriori* basis from the available set of multi-scale segmentation solutions according to heuristic, subjective and/or qualitative criteria analogous to those employed in the selection of prior segmentation parameters. In practice, exploitation of a hierarchical segmentation algorithm does not make a driven-without-knowledge segmentation first stage easier to use. In addition, hierarchical segmentation algorithms are computationally intensive and require large memory occupation.

The conclusion is that, to date, in spite of its commercial success, GEOBIA remains affected by a lack of general methodological consensus and research (Hay & Castilla, 2006). Scientific disagreement on the conceptual framework of GEOBIA finds its origin in the well-known information gap existing between physical information (sensations) and semantic information (percepts) (Matsuyama & Shang-Shouq Hwang, 1990) (see Part I Section 2.2.2 and Part I Section 2.3). Since GEOBIA appears unable to generate semantic image segments (e.g., landscape objects) in the pre-attentive vision phase, it appears unsuitable for filling the information gap between raster sub-symbolic imagery and vector symbolic geospatial information (typically dealt with by geographic information systems, GIS).

#### **2.4.2 Labeled data learning for classification and function approximation**

Labeled (supervised) data learning approaches deal with either classification or function approximation (regression) problems whose output variables are discrete semantic and continuous non-semantic respectively, see Fig. 1 (Alpaydin, 2010; Bishop, 1995; Cherkassky & Mulier, 2006; Mather, 1994; Mitchell, 1997).

In classification problems where the available training data set is assumed to be fully reliable (which may not always be the case (Bruzzone & Persello, 2009)), the goal of a classifier capable of learning from labeled data is to achieve a perfect fit of the training data set (to reduce to zero the training error) and, at the same time, make good semantic predictions for new (previously unobserved) inputs (to reduce to zero the testing error). An adaptive classifier can be trained in various ways, namely, on-line (sequential learning (Bishop, 1995), stochastic learning (Cherkassky & Mulier, 2006), when a large or infinite input data sequence is available and/or real-time adaptation is required), batch (it requires the storage of a complete and finite training data set (Bishop, 1995)) and semi-batch (Wilson & Martinez, 2000). In addition, there are many statistical classifiers. The most widely used statistical classifiers are the plug-in parametric maximum likelihood (ML) classifier, the nonparametric Multi-Layer Perceptron (MLP) and Radial Basis Function (RBF) networks, kernel methods (also called memory-based, which require the storage of a complete data set (Mitchell, 1997)) such as the SVM and the k-nearest neighbor (K-NN) algorithm, the naive Bayes classifier, adaptive (statistical) decision-trees such as the Classification And Regression Tree (CART), adaptive rule-based systems, mixture of experts (Jordan & Jacobs, 1994), etc. (Alpaydin, 2010; Bishop, 1995; Cherkassky & Mulier, 2006; Duda et al., 2001; Mitchell, 1997).

Classifier performance depends greatly on the characteristics of the labeled data set to be classified (Baraldi et al., 2006b). In other words, there is no single classifier that works best on all given problems; this is also referred to as the "no free lunch" theorem. In practical contexts, classification model selection, i.e., determining a suitable classifier for a given problem, is still more an art than a science.

In reinforcement learning the agent is rewarded for good responses and punished for bad ones. These can be analyzed in terms of decision theory, using concepts such as utility (Cherkassky & Mulier, 2006).

Function regression (curve fitting) takes a finite set of numerical continuous input-output pair samples and attempts to discover an unknown continuous (smooth) deterministic function which, together with added Gaussian noise, would generate those target outputs from the inputs (Bishop, 1995). The goal of function approximation is not to learn an exact representation (interpolation) of the training data, but rather to build a statistical model of the physical process that generates the training labeled data. This statistical model ought to be capable of the best trade-off between: (a) achieving a good fit of the training data (to keep the bias term of a sum-of-squares error function low) and (b) obtaining a reasonably smooth function that is not over-fitted to the training data (to keep the variance term of a sum-of-squares error function low). This is important if the self-organizing (adaptive) function approximation system is to exhibit good generalization, i.e., to make good numerical predictions for new (previously unobserved) inputs (Bishop, 1995).
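The trade-off described above can be illustrated with a minimal sketch. It assumes a hypothetical generating function f(x) = 2x and a deterministic alternating perturbation of ±1 that stands in for the Gaussian noise of the text, so that the outcome is reproducible. An exact interpolator reproduces the noisy training targets, whereas a least-squares straight line accepts a non-zero training error and ends up closer to the noise-free function. All names (`fit_line`, `sse`, the sample values) are illustrative and do not come from the cited literature.

```python
def fit_line(xs, ys):
    """Closed-form least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def sse(predicted, reference):
    """Sum-of-squares error between two equally long sequences."""
    return sum((p - r) ** 2 for p, r in zip(predicted, reference))


# Hypothetical training set: noise-free function plus alternating +/-1 "noise".
xs = [0, 1, 2, 3, 4, 5]
truth = [2.0 * x for x in xs]
noise = [1, -1, 1, -1, 1, -1]
ys = [t + e for t, e in zip(truth, noise)]

# An exact interpolator returns the training targets at the training inputs.
interp_pred = list(ys)

# A smooth model (straight line) does not reproduce the noise.
slope, intercept = fit_line(xs, ys)
line_pred = [slope * x + intercept for x in xs]

print("training SSE, interpolator:", sse(interp_pred, ys))
print("training SSE, line        :", round(sse(line_pred, ys), 3))
print("SSE vs noise-free function, interpolator:", sse(interp_pred, truth))
print("SSE vs noise-free function, line        :", round(sse(line_pred, truth), 3))
```

In this toy setting the interpolator has zero training error but inherits the whole perturbation, while the line trades a larger training error for a smaller distance from the underlying function.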

To summarize, to properly deal with discrete semantic or continuous non-semantic output values, labeled (supervised) data learning systems feature different functional hypotheses and properties. For example:

- They adopt different cost functions, namely, the cross-entropy error function for adaptive classifiers versus the sum-of-squares error for function approximation approaches (Bishop, 1995) (p. 230).

- When the training labeled data set is assumed to be fully reliable, the goal of adaptive classifiers is to reduce to zero both training and testing errors (e.g., if the training error is equal to zero then a classifier is called consistent (Baraldi & Alpaydin, 2002b; Mitchell, 1997)). Vice versa, reducing to zero the bias term in function regression is not recommended because it would imply over-fitting to the training data assumed to be inherently affected by Gaussian noise (which is not the case for exact interpolators) (Bishop, 1995).
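The two cost functions named in the first item can be written down in a few lines. The sketch below assumes 1-of-c coded class targets and class-probability outputs for the cross-entropy, and the conventional factor 1/2 for the sum-of-squares error; the function names and the sample values are illustrative only.

```python
import math


def sum_of_squares_error(targets, outputs):
    """Sum-of-squares error (with the conventional factor 1/2)."""
    return 0.5 * sum((t - y) ** 2 for t, y in zip(targets, outputs))


def cross_entropy_error(targets, outputs, eps=1e-12):
    """Cross-entropy for 1-of-c targets and class-probability outputs.

    eps guards against log(0); it is an implementation convenience.
    """
    return -sum(t * math.log(max(y, eps)) for t, y in zip(targets, outputs))


# One sample, three classes, true class at index 0 (hypothetical values).
target = [1.0, 0.0, 0.0]
confident = [0.9, 0.05, 0.05]
hesitant = [0.4, 0.3, 0.3]

print("cross-entropy, confident:", round(cross_entropy_error(target, confident), 4))
print("cross-entropy, hesitant :", round(cross_entropy_error(target, hesitant), 4))
print("sum-of-squares, confident:", round(sum_of_squares_error(target, confident), 4))
```

The cross-entropy depends only on the probability assigned to the true class, which is one reason it suits discrete semantic outputs, whereas the sum-of-squares error penalizes every numerical deviation from the target.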

#### **2.5 Diamant's image segmentation and contour detection algorithms as proofs of his concepts**

As proofs of his concepts (see Part I Section 2.2.3) Diamant presents an image segmentation algorithm and a contour detection algorithm which are summarized below.

#### **2.5.1 Multi-scale image segmentation algorithm**


In (Diamant, 2005), a multi-scale image segmentation algorithm is presented and applied to a toy problem, namely, a panchromatic (one-band) image of 640 × 480 pixels in size. The proposed segmentation algorithm is as follows.

1. Low-pass (smoothing) dyadic (sub-sampling by a factor of 2) image decomposition (down-scaling). Image decomposition levels are identified with integer numbers l = 0,..., L, L+1, where level 0 identifies the input image at full spatial resolution. Value L > 0 is set to 4, thus the maximum down-scale level is L+1 = 5. A simple dyadic multi-scale panchromatic (one-band) image decomposition and averaging operator is applied as follows.

$$\mathbf{g}^{l+1}(\mathbf{x}, \mathbf{y}) = \left[ \mathbf{g}^{l}(2\mathbf{x}, 2\mathbf{y}) + \mathbf{g}^{l}(2\mathbf{x} + 1, 2\mathbf{y}) + \mathbf{g}^{l}(2\mathbf{x} + 1, 2\mathbf{y} + 1) + \mathbf{g}^{l}(2\mathbf{x}, 2\mathbf{y} + 1) \right] / 4, \quad l = 0, \dots, L \ge 0, \tag{1-1}$$

where $\mathbf{g}^{l+1}(\mathbf{x}, \mathbf{y})$ is the gray-level value of a (down-scaled parent) pixel at the (x,y) coordinate position in a higher (l+1)-level image while $\mathbf{g}^{l}(2\mathbf{x}, 2\mathbf{y})$ and its three nearest neighbors listed in Eq. (1-1) are the corresponding (up-scaled children) pixels within an image array at the lower level l.

2. Single-scale image segmentation algorithm run at the top (coarsest) (L+1)-level of the decomposition pyramid. Diamant claims that since the image size at the top level of the pyramid is significantly reduced and a severe data averaging is attained, any well-known segmentation methodology would suffice. Diamant's proprietary segmentation technique first outlines image boundaries (contours) (see Part I Section 2.4.1.2). Second, contiguous pixels of "similar" appearance (based on an unknown similarity measure and decision rule) within non-closed contours are aggregated in spatially connected segments (this is apparently a region growing from non-closed contours approach, e.g., refer to (Baraldi & Parmiggiani, 1995)). Third, the segment-based mean intensity image, called characteristic intensity, is computed (this is a piecewise constant image approximation of the input image generated by replacing every pixel with the mean value of the segment where that pixel is located).

3. (Coarse-to-fine spatial resolution) mean image and segmentation map up-scaling. At each level l = L + 1, ..., 1, with step -1, the mean image and the segmentation map are expanded to the size of the image at the nearest lower level (l-1) (at finer spatial resolution). The expansion rule is simple and the same for both up-scaling operations: the value of each parent pixel at level l is assigned to its four children at level (l-1). Diamant claims that since image regions feature a low inter-segment intensity variability, the majority of newly assigned pixels are determined in a sufficiently correct manner. Only pixels lying on object boundaries or seeds of newly emerging objects can significantly deviate from their up-scaled assigned value. Taking the corresponding l-level of the down-scaled image as a reference, these pixels can easily (!?) be detected and subjected to a refinement cycle. Here they are allowed to adjust themselves to the ''proper'' nearest neighbors, which certainly belong to one of the previously labeled regions or to the newly emerging ones. Unlike the lossless image decomposition/reconstruction procedure provided by Burt and Adelson's Gaussian/Laplacian pyramid (Burt & Adelson, 1983), in the Diamant case the exact reconstruction of an image is not required. In Diamant's opinion "only (?!) in special cases - medical, scientific, military, fine-art, and a couple (!?) of other applications the reconstruction fidelity of the original image can be critically important" (Diamant, 2005), which is to say it is critical in all quantitative rather than qualitative CV applications! For example, RS image understanding applications require small, but genuine image details, say, roads, to be well preserved, which is tantamount to saying that RS image applications are among the "couple (!?) of other applications" where high fidelity in multi-scale encoding (decomposition)/decoding (reconstruction) is required.
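Steps 1 and 3 above can be sketched in plain Python, with an image held as a list of rows so that `img[y][x]` corresponds to g(x,y). The function `downscale` implements the 2 × 2 averaging of Eq. (1-1) and `upscale` implements the parent-to-four-children expansion rule. The helper `deviating_pixels` and its tolerance `tol` are assumptions introduced here for illustration only: the source does not specify how pixels that deviate from their up-scaled value are detected. The single-scale segmentation of step 2 is proprietary and is not reproduced.

```python
def downscale(img):
    """Eq. (1-1): each parent pixel is the mean of its 2x2 children."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [[(img[2 * y][2 * x] + img[2 * y][2 * x + 1]
              + img[2 * y + 1][2 * x + 1] + img[2 * y + 1][2 * x]) / 4.0
             for x in range(w)]
            for y in range(h)]


def upscale(img):
    """Step 3 expansion: each parent value is copied to its four children."""
    out = []
    for row in img:
        wide = [v for v in row for _ in (0, 1)]
        out.append(wide)
        out.append(list(wide))
    return out


def build_pyramid(img, top_level):
    """Levels 0..top_level, level 0 being the full-resolution image."""
    levels = [img]
    for _ in range(top_level):
        levels.append(downscale(levels[-1]))
    return levels


def deviating_pixels(expanded, reference, tol):
    """Hypothetical detector: (x, y) where |expanded - reference| > tol."""
    return [(x, y)
            for y, row in enumerate(reference)
            for x, v in enumerate(row)
            if abs(expanded[y][x] - v) > tol]


# Toy 4x4 image: two flat regions plus one small bright detail at (1, 3).
image = [[10, 10, 50, 50],
         [10, 10, 50, 50],
         [10, 10, 50, 50],
         [10, 90, 50, 50]]

coarse = downscale(image)
expanded = upscale(coarse)
print(coarse)
print(deviating_pixels(expanded, image, tol=15))
```

In this toy case the single bright pixel is averaged into its parent, and after expansion all four children of that parent are flagged, which illustrates why small but genuine image details are at risk in a lossy decomposition.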

A critical analysis of the Diamant image segmentation algorithm can be found in Part II Section 3.1.

#### **2.5.2 Single-scale image contour detection algorithm**

In (Diamant, 2005) Diamant presents a single-scale image contour detection algorithm and applies it to a toy problem, namely, a panchromatic image 256 × 256 pixels in size. This contour detector provides a measure of local information, Iloc(x,y), as a product of two terms.

$$\mathbf{I}_{\text{loc}}(\mathbf{x}, \mathbf{y}) = \mathbf{I}_{\text{int}}(\mathbf{x}, \mathbf{y}) \times \mathbf{I}_{\text{top}}(\mathbf{x}, \mathbf{y}) \tag{1-2}$$

where (x,y) are the central pixel coordinates in a (2-D) image array, factor Iint(x,y) is the intensity change component and factor Itop(x,y) is considered a measure of topological confidence (uncertainty). In Eq. (1-2) term Iint(x,y) is estimated as follows.

$$\mathbf{I}_{\text{int}}(\mathbf{x}, \mathbf{y}) = \frac{1}{8} \sum_{n=1}^{8} \left| \mathbf{g}_{\text{c}}(\mathbf{x}, \mathbf{y}) - \mathbf{g}_{n}(\mathbf{x}, \mathbf{y}) \right| \ge 0. \tag{1-3}$$

Thus, in Eq. (1-2) the first term Iint(x,y) is estimated as the mean absolute difference between the central pixel gray value, gc(x,y), and the gray levels of its 8-adjacency neighbors, gn(x, y), n = 1, ..., 8.

In Eq. (1-2) the second term Itop(x,y) is computed in two steps. Firstly, an expression for a pixel's interrelationship with its surrounding is defined as follows.


$$\text{status}(\mathbf{x}, \mathbf{y}) = 8\,\mathbf{g}_{\text{c}}(\mathbf{x}, \mathbf{y}) - \sum_{n=1}^{8} \mathbf{g}_{n}(\mathbf{x}, \mathbf{y}). \tag{1-4}$$

It is worth noting that status(x, y) is equivalent to a contrast value computed by an isotropic *Mexican-hat* operator centered on pixel (x, y). The shortest status(x, y) description (encoding) would be in a binary form, for example, 0 if status is negative, and 1 otherwise. Status(x, y) is evaluated for every pixel (x, y) in an image and mapped into a binary status map of the same size as the input image. Secondly, the spatial (topological) interactions of a pixel with its 8-adjacency neighbors can be estimated using the binary status map:

$$\mathbf{I}_{\text{top}}(\mathbf{x}, \mathbf{y}) = \mathbf{p}\,(1 - \mathbf{p}) = (\mathbf{m}/8)\left[(8 - \mathbf{m})/8\right], \quad \mathbf{m} \in \{0, \dots, 8\}, \tag{1-5}$$

where p is the probability that the central pixel and its surrounding ones share the same status, such that m ∈ {0, ..., 8} is the number of 8-adjacency pixels that share the same status with the central pixel in the 2-D array position (x, y). The Itop(x, y) value is computed for every pixel (x, y) and saved in a dedicated image of the same size as the input image.
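Eqs. (1-2) to (1-5) can be combined in a short sketch. The image is a list of rows, so `img[y][x]` corresponds to g(x,y). Border handling is not described in the source; the sketch assumes that out-of-range neighbor coordinates are clamped to the nearest valid pixel, which is an illustrative choice only. Function names are hypothetical.

```python
OFFSETS = [(-1, -1), (0, -1), (1, -1), (-1, 0), (1, 0), (-1, 1), (0, 1), (1, 1)]


def neighbors(grid, x, y):
    """Values of the 8-adjacency neighbors; coordinates clamped at borders
    (an assumption, the source does not specify border handling)."""
    h, w = len(grid), len(grid[0])
    return [grid[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
            for dx, dy in OFFSETS]


def intensity_change(img, x, y):
    """Eq. (1-3): mean absolute difference to the 8 neighbors."""
    center = img[y][x]
    return sum(abs(center - g) for g in neighbors(img, x, y)) / 8.0


def status(img, x, y):
    """Eq. (1-4): 8 * g_c minus the sum of the 8 neighbors."""
    return 8 * img[y][x] - sum(neighbors(img, x, y))


def binary_status_map(img):
    """0 where status is negative, 1 otherwise."""
    return [[0 if status(img, x, y) < 0 else 1 for x in range(len(img[0]))]
            for y in range(len(img))]


def topological_confidence(status_map, x, y):
    """Eq. (1-5): p(1 - p) with p = m / 8."""
    m = sum(1 for s in neighbors(status_map, x, y) if s == status_map[y][x])
    return (m / 8.0) * ((8 - m) / 8.0)


def local_information(img, x, y, status_map=None):
    """Eq. (1-2): I_loc = I_int * I_top."""
    if status_map is None:
        status_map = binary_status_map(img)
    return intensity_change(img, x, y) * topological_confidence(status_map, x, y)


# Toy image with a vertical step edge between columns 2 and 3.
step = [[10, 10, 10, 50, 50, 50] for _ in range(5)]
smap = binary_status_map(step)
print(smap[2])
print([local_information(step, x, 2, smap) for x in range(6)])
```

On this step edge the local information is non-zero only for the two columns adjacent to the edge, and the binary status map separates the darker side (status 0) from the brighter side (status 1) of the edge.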

Diamant considers peaks (local extrema) in Iloc(x,y) = Eq. (1-2) = Iint(x,y) × Itop(x,y) = Eq. (1-3) × Eq. (1-5) as signs of a visible edge present at a given location. However, establishing a proper threshold for local extrema has always been a hard and sophisticated matter. To overcome this difficulty, Diamant proposes to gather a cumulative histogram of Iloc values. At first, a number of equal intervals (bins) is selected and a histogram (first-order statistic) of the Iloc image is constructed in sequence for every histogram bin as follows: if the pixel-based Iloc value is greater than or equal to the bin's lower bound, then this bin counter is increased by one. As a result, the first bin represents the cardinality of all Iloc values > 0. It is now explicitly visible what part of the whole "image information content" is carried by Iloc values equal to or greater than a particular bin lower bound. This can be used as a (subjective!) threshold for appropriate image point assignment (marking). In such a way, a set of different information content-related thresholds can be established, which can address diversified task-related requirements. For example, the most prominent image points are marked in dark gray, carrying more than 50% of the whole information content. Less important image parts can be marked in half-gray, carrying between 50 and 70% of information content, and the lowest importance image parts are marked in light gray, carrying 70 to 85% residuals of the information content. The proposed image point marking technique can be effectively used to create more enhanced low-level information content descriptors. For example, based on the status image generated from Eq. (1-4), an edge-localization image can be displayed where dark-gray is assigned to the lower intensity sides of the edges and light-gray to the higher intensity edge sides (Diamant, 2005).
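The cumulative histogram described above can be sketched as follows. The counting rule (a bin counter is increased whenever a value is greater than or equal to the bin's lower bound) follows the text. The helper `threshold_for_fraction` is a hypothetical interpretation added for illustration: it returns the largest bin lower bound that is still reached by at least a given fraction of the pixels counted in the first bin. The source does not state how "information content" percentages are mapped onto thresholds, so this mapping is an assumption.

```python
def cumulative_histogram(values, n_bins):
    """Counter of bin i = number of values >= lower bound of bin i."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    bounds = [lo + i * width for i in range(n_bins)]
    counts = [sum(1 for v in values if v >= b) for b in bounds]
    return bounds, counts


def threshold_for_fraction(bounds, counts, fraction):
    """Hypothetical rule: largest lower bound reached by >= fraction of pixels."""
    total = counts[0]
    eligible = [b for b, c in zip(bounds, counts) if c >= fraction * total]
    return max(eligible)


# Toy list of I_loc values (flattened image), chosen for illustration.
iloc_values = [0.0, 0.0, 0.0, 0.0, 1.0, 2.0, 3.0, 4.0]
bounds, counts = cumulative_histogram(iloc_values, n_bins=4)
print(bounds, counts)
print(threshold_for_fraction(bounds, counts, 0.5))
```

Whatever mapping is adopted, the threshold remains a user-selected (subjective) quantity, as remarked in the text.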

A critical analysis of the Diamant image contour detection algorithm can be found in Part II Section 3.3.

#### **2.6 Four levels of understanding of an RS-IUS**

It is important to remember that there are four levels of analysis (understanding) of any information processing device, including RS-IUSs. They are listed below (Baraldi et al., 2010b; Baraldi, 2011a; Marr, 1982).


i. Computational theory (system architecture). According to Marr, the linchpin of success in attempting to solve the CV problem is that of addressing the computational theory rather than algorithms or implementations (Marr, 1982). In other words, if the vision device architecture is inadequate, even sophisticated algorithms can produce low-quality outputs. On the contrary, improvement in the vision system architecture might achieve twice the benefit with half the effort (which is an adaptation of the original words by Wang (Fangju Wang, 1990)). For example, a two-stage stratified hierarchical hybrid RS-IUS architecture (see Part II Fig. 3) has been proposed in recent literature (Baraldi et al., 2006a; Baraldi et al., 2010a; Baraldi et al., 2010b; Baraldi et al., 2010c; Baraldi, 2011a; Baraldi, 2011b), as an alternative to the current state-of-the-art two-stage GEOBIA architecture, hereafter referred to as two-stage segment-based hybrid RS-IUS architecture (see Part II Fig. 2).

ii. Knowledge/information representation. According to Wang, "if knowledge representation is poor, even sophisticated algorithms can produce inferior outputs. On the contrary, improvement in representation might achieve twice the benefit with half the effort" (Fangju Wang, 1990). For example, in (Baraldi et al., 2010c; Baraldi, 2011b) a crisp-to-fuzzy SIAM™ transition has been accomplished to model class mixtures.

iii. Algorithm design. This level deals with the design of the algorithm selected to fill each of the data processing modules comprised in the system architecture (refer to point (i) above). According to (Page-Jones, 1988), structured system design is "everything but code".

iv. Implementation. This level deals with the source code generation for every algorithm designed at point (iii) above.

#### **2.7 Quality Assurance Framework for EO (QA4EO)**

Delivered by the Working Group on Calibration and Validation (WGCV) of the Committee on Earth Observation Satellites (CEOS), the space arm of the Group on Earth Observations (GEO) (GEO, 2005; GEO, 2008b), the QA4EO guidelines (GEO/CEOSS, 2008) consider mandatory the following actions: (i) calibration and validation (Cal/Val) activities from sensor build to end-of-life and (ii) every sensor-derived data product must be provided with metrological/statistically-based quality indicators (QIs) featuring a degree of uncertainty in measurement. Unfortunately, in RS common practice, these international guidelines are often ignored by scientists, practitioners and whole institutions (Baraldi, 2009).

#### **2.7.1 Calibration and validation (Cal/Val) activities from sensor build to end-of-life**

QA4EO considers mandatory an appropriate coordinated program of Cal/Val activities throughout all stages of a spaceborne mission, from sensor build to end-of-life (GEO/CEOSS, 2008). This ensures the harmonization and interoperability of multi-source observational data and derived products required by international programs such as the ongoing GEOSS and GMES projects (GEO, 2008b; GEO, 2005) (refer to Part I Section 1).

In spite of the QA4EO recommendations and although it is regarded as common knowledge in the RS community, *radiometric calibration*, i.e., the transformation of dimensionless digital numbers (DNs) into a physical unit of measure related to a community-agreed radiometric scale, is often neglected in literature and surprisingly ignored by scientists, practitioners and institutions involved with RS common practice including large-scale spaceborne image mosaicking and mapping (Baraldi et al., 2006a; Baraldi, 2009; Baraldi et al., 2010a; Baraldi et al., 2010b; Baraldi, 2011a).
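As an illustration of what radiometric calibration involves in practice, the sketch below converts DNs into top-of-atmosphere (TOA) radiance with a linear gain/offset model and then into TOA reflectance using the widely used relation ρ = π L d² / (ESUN cos θ), where θ is the solar zenith angle. The linear model and all numerical values are assumptions chosen for illustration; actual gain, offset and band-averaged solar irradiance (ESUN) come from the sensor metadata and are not given in this text.

```python
import math


def dn_to_radiance(dn, gain, offset):
    """Linear calibration model (assumed): radiance = gain * DN + offset."""
    return gain * dn + offset


def radiance_to_toa_reflectance(radiance, esun, sun_elevation_deg,
                                earth_sun_distance_au=1.0):
    """TOA reflectance from at-sensor radiance.

    esun: band-averaged exo-atmospheric solar irradiance (same spectral
    units as radiance); sun elevation in degrees; distance in AU.
    """
    zenith = math.radians(90.0 - sun_elevation_deg)
    return (math.pi * radiance * earth_sun_distance_au ** 2
            / (esun * math.cos(zenith)))


# Hypothetical calibration coefficients and acquisition geometry.
gain, offset, esun = 0.5, 1.0, 1500.0
radiance = dn_to_radiance(100, gain, offset)
toarf = radiance_to_toa_reflectance(radiance, esun, sun_elevation_deg=45.0)
print(radiance, round(toarf, 4))
```

Once expressed in reflectance, pixel values acquired by different sensors, or by the same sensor at different dates and sun angles, become comparable on a common radiometric scale, which is the harmonization that the QA4EO guidelines call for.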

A relevant extension of the QA4EO recommendation for radiometric calibration of multisource EO data is the following.

"Radiometric calibration not only ensures the harmonisation and interoperability of multisource observational data according to the QA4EO guidelines, but is a necessary, although insufficient, condition for automating the quantitative analysis of EO data" (Baraldi et al., 2006a; Baraldi et al., 2010a; Baraldi et al., 2010b; Baraldi, 2011a) in RS data understanding problems other than toy problems at small data scale and coarse semantic granularity. By definition, a data processing system is *automatic* when it requires no user-defined parameter to run, therefore its user-friendliness cannot be surpassed (refer to Part I Section 2.8).

This necessary condition for automatic EO data understanding agrees with common sense, summarized by the expression: "garbage in means garbage out". In the terminology of MAL and CV, the radiometric calibration constraint augments the degree of prior knowledge of a RS-IUS required to complement the intrinsic insufficiency (ill-posedness) of (2-D) image features, i.e., radiometric calibration makes the inherently ill-posed CV problem better posed (Baraldi et al., 2010a; Baraldi, 2011a; Matsuyama & Shang-Shouq Hwang, 1990).

To summarize, in disagreement with the QA4EO guidelines, most existing scientific and commercial RS-IUSs, such as those listed in Table 1, do not require RS images to be radiometrically calibrated and validated. As a consequence, according to the aforementioned necessary condition for automating the quantitative analysis of EO data, these RS-IUSs are semi-automatic and/or site-specific (since one scene may represent, say, apples, while any other scene, even if contiguous or overlapping, may represent, say, oranges), refer to Table 1. Secondly, Table 1 shows that unlike SIAM™, the ERDAS Atmospheric Correction for satellite imagery (ATCOR3) (Richter, 2006) requires as input an MS image radiometrically calibrated into surface reflectance values exclusively. This implies that the ERDAS ATCOR3 software considers mandatory the inherently ill-posed and difficult-to-solve MS image atmospheric correction pre-processing stage which requires user intervention to make it better posed (Baraldi, 2011a). Thus, unlike SIAM™, the ERDAS ATCOR3 satisfies the necessary condition for automating the quantitative analysis of EO data, but is insufficient to provide a RS image classification problem with an automatic workflow requiring no user-defined empirical parameter to be based on heuristic criteria.

#### **2.7.2 Quality Indicators (QIs) with a degree of uncertainty**

In addition to considering mandatory an appropriate coordinated program of Cal/Val activities throughout all stages of a spaceborne mission, from sensor build to end-of-life (see Section 2.7.1), the QA4EO guidelines require that every sensor-derived data product generated across a satellite-based measurement system's processing chain be provided with metrological/statistically-based QIs featuring a degree of uncertainty in measurement (GEO/CEOSS, 2008). Unfortunately, in RS common practice, as well as in existing literature, these international guidelines are often ignored by scientists, practitioners and whole institutions (Baraldi, 2009). For example, most works published in RS literature assess and compare spaceborne image classification algorithms in terms of mapping accuracy exclusively, which corresponds to only one of several operational QIs of a RS-IUS (refer to Part I Section 2.8). Moreover, these classification accuracy estimates are rarely provided with a degree of uncertainty in measurement. This violates well-known laws of sample statistics (Congalton & Green, 1999; Foody, 2002; Jain et al., 2000), together with common sense envisaged under the international guidelines of the QA4EO (GEO/CEOSS, 2008).

Table 1. Existing commercial RS-IUSs and their degree of match with the international QA4EO guidelines.

| **Commercial RS-IUSs** | **Sub-symbolic (asemantic) versus symbolic (semantic) information primitives, namely, pixels / (2-D) objects (regions, segments) / strata** | **Radiometric calibration (RAD. CAL.) requirement according to the international QA4EO guidelines** |
|---|---|---|
| PCI Geomatics GeomaticaX | Sub-symbolic pixels | NO RAD. CAL. ⇒ semi-automatic and site-specific |
| eCognition Server by Definiens AG | Unsupervised data learning sub-symbolic objects | NO RAD. CAL. ⇒ semi-automatic and site-specific |
| Pixel- and Segment-based versions of the Environment for Visualizing Images (ENVI) by ITT VIS | Either sub-symbolic pixels or unsupervised data learning sub-symbolic objects | NO RAD. CAL. ⇒ semi-automatic and site-specific |
| ERDAS IMAGINE Objective | Supervised data learning symbolic objects | NO RAD. CAL. ⇒ semi-automatic and site-specific |
| ERDAS Atmospheric Correction-3 (ATCOR3) (Richter, 2006) | Sub-symbolic pixels | Consistent with the QA4EO recommendations: surface reflectance, SURF ⇒ inherently ill-posed atmospheric correction first stage ⇒ semi-automatic and site-specific. |
| Novel two-stage stratified hierarchical RS-IUS employing SIAM™ as its preliminary classification first stage | Prior knowledge-based symbolic pixels / symbolic objects / symbolic strata | Consistent with the QA4EO recommendations: top-of-atmosphere (TOA) reflectance (TOARF) or surface reflectance (SURF) values, with TOARF ⊇ SURF; atmospheric correction is optional. Automatic and robust to changes in RS optical imagery acquired across time, space and sensors. |
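The missing "degree of uncertainty in measurement" for a classification accuracy estimate can be made concrete with a confidence interval. The sketch below computes a Wilson score interval for overall accuracy. It assumes that the reference samples were drawn by simple random sampling and are independent, which is an assumption of this example rather than a statement from the cited works; the function name and sample counts are illustrative.

```python
import math
from typing import Tuple


def wilson_interval(n_correct: int, n_total: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a proportion (z = 1.96 gives roughly 95% confidence)."""
    if n_total <= 0 or not 0 <= n_correct <= n_total:
        raise ValueError("need 0 <= n_correct <= n_total and n_total > 0")
    p = n_correct / n_total
    denom = 1.0 + z * z / n_total
    center = (p + z * z / (2.0 * n_total)) / denom
    half_width = z * math.sqrt(p * (1.0 - p) / n_total + z * z / (4.0 * n_total ** 2)) / denom
    return center - half_width, center + half_width


# Same point estimate (85% overall accuracy), very different uncertainty.
print(wilson_interval(85, 100))
print(wilson_interval(850, 1000))
```

Two maps reporting the same overall accuracy from 100 and from 1000 reference samples are therefore not equally well characterized, which is why an accuracy figure without its interval is hard to compare.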

It is well known, but often forgotten in common practice that any evaluation measure is inherently non-injective (Baraldi, 2011a). For example, in classification map accuracy assessment and comparison, different classification maps may produce the same confusion matrix while different confusion matrices may generate the same confusion matrix accuracy measure, such as overall accuracy. These observations suggest that *no single universally acceptable measure of quality, but instead a variety of quality indices, should be employed in practice* (Congalton & Green, 1999; Foody, 2002). To date, this general conclusion is neither obvious nor community-agreed. For example, this conclusion implies that when a test image and a reference (original) image pair is given, common attempts to identify a unique (universal) reliable image quality index, such as the relative dimensionless global error ERGAS proposed in (Wald et al., 1997), the universal image quality index Q (Wang & Bovik, 2002), the global image quality measure Q4 (Alparone et al., 2004), and the quality index with no reference QNR (Alparone et al., 2006), are inherently undermined as contradictions in terms.
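The non-injectivity argument can be checked with a toy example. In the sketch below, two different confusion matrices (invented numbers, rows taken as reference classes and columns as mapped classes) yield the same overall accuracy while their per-class producer's accuracies differ.

```python
def overall_accuracy(cm):
    """Fraction of samples on the main diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def producers_accuracy(cm):
    """Per-class accuracy with respect to the reference (row) totals."""
    return [cm[i][i] / sum(cm[i]) for i in range(len(cm))]


# Two hypothetical two-class confusion matrices (rows: reference, columns: map).
cm_a = [[40, 10],
        [10, 40]]
cm_b = [[50, 0],
        [20, 30]]

print(overall_accuracy(cm_a), overall_accuracy(cm_b))      # identical summary value
print(producers_accuracy(cm_a), producers_accuracy(cm_b))  # different per-class behaviour
```

A single summary index hides the fact that the second map misses 40% of one class, which is the practical reason for reporting several quality indices side by side.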

In recent years the issue of uncertainty in spatial data has become increasingly recognized by the RS and geographic information systems (GIS) communities (Friedl et al., 2001). Spatial uncertainty analysis investigates sources of inaccuracy in geospatial data acquisition and understanding, as well as error propagation through a RS (2-D) image processing chain. For example, post-classification change detection between two classification maps of overall accuracy OA1 ∈ [0, 1] and OA2 ∈ [0, 1], respectively, features a change detection OA (COA) such that COA ≤ (OA1 × OA2) (Lunetta & Elvidge, 1999). In particular, Friedl *et al*. identify three primary sources of errors in spatial information generated from RS imagery (Friedl et al., 2001).
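As a worked example of error propagation, the sketch below evaluates the product OA1 × OA2 for two single-date maps. Reading the relation attributed to Lunetta & Elvidge (1999) as an upper bound on the change detection overall accuracy is an interpretation assumed here, and the function name and input values are hypothetical.

```python
def change_detection_oa_bound(oa1: float, oa2: float) -> float:
    """Product of two single-date overall accuracies, each expected in [0, 1]."""
    for oa in (oa1, oa2):
        if not 0.0 <= oa <= 1.0:
            raise ValueError("overall accuracy must lie in [0, 1]")
    return oa1 * oa2


# Two maps that look individually acceptable (90% and 85%) ...
print(change_detection_oa_bound(0.90, 0.85))  # ... give a product of about 0.765
```

Under this reading, two maps with accuracies that look acceptable in isolation can still produce a change map whose accuracy falls below the 85% level often used as a target in land cover mapping.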


#### **2.8 Operational Quality Indicators (QIs) of an RS-IUS**

In operational contexts a RS-IUS is defined as a low performer if at least one among several operational QIs scores low. Typical operational qualities of a RS-IUS encompass the following (Baraldi et al., 2010a; Baraldi et al., 2010b; Baraldi, 2011a).

i. Degree of automation.

ii. Accuracy, e.g., mapping accuracy.

iii. Efficiency, e.g., computation time, memory occupation.

iv. Economy (costs). Related to manpower and computing power. For example, open source solutions are welcome to reduce costs of software licenses. Supervised data learning approaches (e.g., SVMs, OBIA systems, etc.) require reference training samples which are typically scene-specific, expensive, tedious, difficult or impossible to collect.

v. Robustness to changes in the input data set, e.g., changes due to noise in the data.

vi. Robustness to changes in input parameters, if any exist.

vii. Maintainability / scalability / re-usability to keep up with changes in users' needs and sensor properties.

viii. Timeliness, defined as the time span between data acquisition and product delivery to the end user. It increases monotonically with manpower, e.g., the manpower required to collect site-specific training samples.

*The aforementioned list of operational QIs is neither irrelevant nor obvious*. For example, a low score in operational QIs may explain why the literally hundreds of so-called novel low-level (sub-symbolic) and high-level (symbolic) image processing algorithms presented each year in scientific literature typically have a negligible impact on commercial RS image processing software (Zamperoni, 1996). This conjecture is consistent with the fact that most works published in RS literature assess and compare spaceborne image classification algorithms in terms of mapping accuracy exclusively, which corresponds to the sole operational performance indicator (ii) listed above. Moreover, these classification accuracy estimates are rarely provided with a degree of uncertainty in measurement. This violates well-known laws of sample statistics (Congalton & Green, 1999; Foody, 2002; Jain et al., 2000), together with common sense envisaged under the international guidelines of the QA4EO (see Part I Section 2.7.2) (GEO/CEOSS, 2008).

#### **3. Conclusions**

The goal of this work is to revise, integrate and enrich previous analyses found in related papers about recent developments in the design and implementation of an operational automatic multi-sensor multi-resolution near real-time two-stage hybrid stratified hierarchical RS-IUS (Baraldi et al., 2006a; Baraldi et al., 2010a; Baraldi et al., 2010b; Baraldi, 2011a).

For publication reasons this work is split into Part I and Part II. In Part I Section 2, related works, concepts and definitions are revised to provide this paper with a significant survey value and make it self-contained. In Part II Section 2, the survey of past works is completed. The original contribution of this work can be found in Part II Section 3 to Part II Section 7.

#### **4. Acknowledgments**

This material is partly based upon work supported by the National Aeronautics and Space Administration under Grant/Contract/Agreement No. NNX07AV19G issued through the Earth Science Division of the Science Mission Directorate. The research leading to these results has also received funding from the European Union Seventh Framework Programme FP7/2007-2013 under grant agreement n° 263435. This author wishes to thank the Editorial Board of InTech for its competence and willingness to help.

#### **5. References**


Baraldi, A. (2009). Impact of radiometric calibration and specifications of spaceborne optical imaging sensors on the development of operational automatic remote sensing image understanding systems. *IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing*, Vol. 2, No. 2, pp. 104-134.

Baraldi, A.; Durieux, L.; Simonetti, D.; Conchedda, G.; Holecz, F. & Blonda, P. (2010a). Automatic spectral rule-based preliminary classification of radiometrically calibrated SPOT-4/-5/IRS, AVHRR/MSG, AATSR, IKONOS/QuickBird/OrbView/GeoEye and DMC/SPOT-1/-2 imagery – Part I: System design and implementation. *IEEE Trans. Geosci. Remote Sensing*, Vol. 48, No. 3, pp. 1299-1325.

Baraldi, A.; Durieux, L.; Simonetti, D.; Conchedda, G.; Holecz, F. & Blonda, P. (2010b). Automatic spectral rule-based preliminary classification of radiometrically calibrated SPOT-4/-5/IRS, AVHRR/MSG, AATSR, IKONOS/QuickBird/OrbView/GeoEye and DMC/SPOT-1/-2 imagery – Part II: Classification accuracy assessment. *IEEE Trans. Geosci. Remote Sensing*, Vol. 48, No. 3, pp. 1326-1354.

Baraldi, A.; Wassenaar, T. & Kay, S. (2010c). Operational performance of an automatic preliminary spectral rule-based decision-tree classifier of spaceborne very high resolution optical images. *IEEE Trans. Geosci. Remote Sensing*, Vol. 48, No. 9, pp. 3482-3502.

Baraldi, A. (2011a). Beyond Geographic Object-Based and Object-Oriented Image Analysis (GEOBIA/GEOOIA): Levels of understanding and degrees of novelty of an operational automatic two-stage stratified hierarchical hybrid remote sensing image understanding system, Remote Sens., submitted for consideration for publication, rs-10905, 2011.

Baraldi, A. (2011b). Fuzzification of a crisp near real-time operational automatic spectral rule-based decision-tree preliminary classifier of multi-source multi-spectral remotely-sensed images, *IEEE Trans. Geosci. Remote Sensing*, accepted for publication, July 2011.

Bishop, C. M. (1995). *Neural Networks for Pattern Recognition*. Clarendon Press, Oxford, United Kingdom.

Bruzzone, L. & Carlin, L. (2006). A multilevel context-based system for classification of very high spatial resolution images. *IEEE Trans. Geosci. Remote Sens.*, Vol. 44, No. 9, pp. 2587–2600.

Bruzzone, L. & Persello, C. (2009). A novel context-sensitive semisupervised SVM classifier robust to mislabeled training samples. *IEEE Trans. Geosci. Remote Sensing*, Vol. 47, No. 7, pp. 2142-2154.

Burr, D. C. & Morrone, M. C. (1992). A nonlinear model of feature detection, In: *Nonlinear Vision: Determination of Neural Receptive Fields, Functions, and Networks*, R. B. Pinter & N. Bahram, (Eds.), pp. 309-327, CRC Press, Boca Raton, Florida.

Burt, P. & Adelson, E. (1983). The Laplacian pyramid as a compact image code. *IEEE Trans. Communications*, Vol. COM-31, No. 4, pp. 532-540.

Canny, J. (1986). A computational approach to edge detection. *IEEE Trans. Pattern Anal. Machine Intell.*, Vol. 8, pp. 679-714.

Carson, C.; Belongie, S.; Greenspan, H. & Malik, J. (1997). Region-Based Image Querying, *Proc. Int'l Workshop Content-Based Access of Image and Video libraries*, San Juan, Puerto Rico, June 20, 1997.

*Implications for Remote Sensing and GIS Applications*, C.T. Hunsaker, M.F. Goodchild, M.A. Friedl & T.J. Case, (Eds), pp. 258–283, Springer, New York.

Fritzke, B. (1997). *Some competitive learning methods* (Draft document), 17.04.2011, Available from: http://www.neuroinformatik.ruhr-unibochum.de/ini/VDM/research/gsn/DemoGNG.

GEO/CEOSS. (2008). A Quality Assurance Framework for Earth Observation, Version 2.0, 17.04.2011, Available from: http://calvalportal.ceos.org/CalValPortal/showQA4EO.do?section=qa4eoIntro

Global Monitoring for Environment and Security (GMES) (2011). 17.04.2011, Available from: http://www.gmes.info

Gouras, P. (1991). Color vision, In: *Principles of Neural Science*, E. Kandel & J. Schwartz, (Eds.), pp. 467-479, Appleton and Lange, Norwalk, Connecticut.

Group on Earth Observations (GEO). (2005). The Global Earth Observation System of Systems (GEOSS) 10-Year Implementation Plan, 17.04.2011, Available from: http://www.earthobservations.org/docs/10-Year%20Implementation%20Plan.pdf

Group on Earth Observations (GEO). (2008a). GEO announces free and unrestricted access to full Landsat archive, 17.04.2011, Available from: www.fabricadebani.ro/userfiles/GEO\_press\_release.doc

Group on Earth Observations (GEO). (2008b). GEO 2007-2009 Work Plan: Toward Convergence, 17.04.2011, Available from: http://earthobservations.org

Gutman, G. *et al*., (Eds.). (2004). *Land Change Science*, Kluwer Academic Publishers, Dordrecht, The Netherlands.

Hay, G. J. & Castilla, G. (2006). Object-based image analysis: Strengths, weaknesses, opportunities and threats (SWOT), *Proc. 1st Int. Conf. Object-based Image Analysis* (OBIA), 2006. 17.04.2011, Available from: www.commission4.isprs.org/obia06/Papers/01\_Opening%20Session/OBIA2006\_Hay\_Castilla.pdf

Hudelot, C.; Atif, J. & Bloch, I. (2008). Fuzzy spatial relation ontology for image interpretation. *Fuzzy Sets and Systems Archive*, Vol. 159, No. 15, pp. 1929-1951.

Jain, A. K.; Duin, R. & Mao, J. (2000). Statistical pattern recognition: A review. *IEEE Trans. Pattern. Anal. Machine Intell.*, Vol. 22, No. 1, pp. 4-37.

Jain, A. & Healey, G. (1998). A multiscale representation including opponent color features for texture recognition. *IEEE Trans. Image Proc.*, Vol. 7, No. 1, pp. 124-128.

Jordan, M. & Jacobs, R. (1994). Hierarchical mixtures of experts and the EM algorithm. *Neural Computation*, Vol. 6, pp. 181–214.

Kandel, E. R. (1991). Perception of motion, depth and form, In: *Principles of Neural Science*, E. Kandel & J. Schwartz, (Eds.), pp. 441-466, Appleton and Lange, Norwalk, Connecticut.

Lawley, J. (2003). Self-organising complex-adaptive systems, In: *Large Group Metaphor Process* (at the Findhorn Community), 17.04.2011, Available from: http://www.cleanlanguage.co.uk/articles/articles/216/1/Self-Organising-Systems-Findhorn/Page1.html

Legg, S. & Hutter, M. (2007). Universal intelligence: A definition of machine intelligence. 17.04.2011, Available from: http://arxiv.org/abs/0706.3639.

Marr, D. (1982). *Vision*. Freeman and Co., New York.

http://www.geog.umontreal.ca/donnees/geo6333/atcor23\_manual.pdf


Sjahputera, O.; Davis, C.H.; Claywell, B.; Hudson, N.J.; Keller, J.M.; Vincent, M.G.; Li, Y.; Klaric, M. & Shyu, C.R. (2008). GeoCDX: An automated change detection and exploitation system for high resolution satellite imagery, *Proc. Int. Geoscience and Remote Sensing Symposium* (IGARSS), Boston (MA), July 6-11, 2008, paper no. FR4.101.1.

T.F. Cootes and C.J. Taylor, Imaging Science and Biomedical Engineering, Draft Report, University of Manchester, Manchester, March 8, 2004, Available from: http://www.isbe.man.ac.uk/~bim/Models/app\_models.pdf.

Tapsall, B.; Milenov, P. & Tasdemir, K. (2010). Analysis of RapidEye imagery for annual land cover mapping as an aid to European Union (EU) Common agricultural policy, W. Wagner, B. Székely, (Eds.): *ISPRS TC VII Symposium – 100 Years ISPRS*, Vienna, Austria, July 5–7, 2010, IAPRS, Vol. XXXVIII, Part 7B. 17.04.2011, Available from: http://mars.jrc.it/mars/content/download/1648/8982/file/P4-5\_Milenov\_Tapsall\_BG\_RapidEye\_Project\_JRC\_ver2.pdf

USGS & NASA. (2011). Web-enabled Landsat data (WELD) Project. 17.04.2011, Available from: http://landsat.usgs.gov/WELD.php

Vecera, S. P. & Farah, M. J. (1997). Is visual image segmentation a bottom-up or an interactive process?. *Perception & Psychophysics*, Vol. 59, pp. 1280–1296.

Wald, L.; Ranchin, T. & Mangolini, M. (1997). Fusion of satellite images of different spatial resolutions: Assessing the quality of resulting images. *Photogramm. Eng. Remote Sens.*, Vol. 63, No. 6, pp. 691-699.

Wang, Z. & Bovik, A. C. (2002). A universal image quality index. *IEEE Signal Proc. Letters*, Vol. 9, No. 3, pp. 81-84.

Wilson, H. R. & Bergen, J. R. (1979). A four mechanism model for threshold spatial vision. *Vision Res*., Vol. 19, pp. 19-32.

Wilson, D. R. & Martinez, T. R. (2000). The Inefficiency of batch training for large training sets. *IEEE-INNS-ENNS International Joint Conference on Neural Networks* (IJCNN'00), 2000, Vol. 2, pp. 2113.

Xu, R. & Wunsch II, D. (2005). Survey of clustering algorithms. *IEEE Trans. Neural Netw*., Vol. 16, No. 3, pp. 645-678.

Yang, J. & Wang, R. S. (2007). Classified road detection from satellite images based on perceptual organization. *Int. J. Remote Sensing*, Vol. 28, No. 20, pp. 4653-4669.

Zamperoni, P. (1996). Plus ça va, moins ça va. *Pattern Recognition Letters*, Vol. 17, No. 7, (1996), pp. 671-677.

Zins, C. (2007). Conceptual approaches for defining data, information, and knowledge. *Journal of the American Society for Information Science and Technology*, Vol. 58, No. 4, pp. 479-493.

**Vision Goes Symbolic Without Loss of Information Within the Preattentive Vision Phase: The Need to Shift the Learning Paradigm from Machine-Learning (from Examples) to Machine-Teaching (by Rules) at the First Stage of a Two-Stage Hybrid Remote Sensing Image Understanding System, Part II: Novel Developments and Conclusions** 

> Andrea Baraldi *Department of Geography, University of Maryland, College Park, Maryland, USA*

#### **1. Introduction**

The goal of this work is to revise, integrate and enrich previous analyses found in related papers about recent developments in the design and implementation of an operational automatic multi-sensor multi-resolution near real-time two-stage hybrid stratified hierarchical remote sensing (RS) image understanding system (RS-IUS) (Baraldi et al., 2006; Baraldi et al., 2010a; Baraldi et al., 2010b; Baraldi, 2011a).

For publication reasons this work consists of two companion papers, Part I and Part II respectively. In Part I related papers, concepts and definitions are revised from existing literature to provide this work with a significant survey value and make it self-contained. The survey of past works is completed in Part II Section 2, where differences at the architectural level between different families of existing RS-IUSs, namely, multi-agent hybrid RS-IUSs, two-stage segment-based RS-IUSs and two-stage stratified hierarchical hybrid RS-IUSs, are highlighted.

The original contribution of Part II is to propose novel definitions of: objective continuous sub-symbolic sensory data; continuous physical information; subjective discrete semi-symbolic data structure; discrete semantic-square (semantic<sup>2</sup>) information, which is naturally generated from the simultaneous combination of three components, namely, (I) an objective continuous sensory data set, (II) an external subjective supervisor (observer) and (III) his/her own subjective prior ontology, equivalent to a model of the (3-D) world existing before looking at the objective sensory data at hand; and prior knowledge base.

In practical contexts the aforementioned original definitions imply the following.

a. It is impossible to *extract* semantic<sup>2</sup> information from objective continuous sensory data because the latter, *per se*, are provided with no semantics at all.

b. It is possible to *correlate* discrete semantic<sup>2</sup> information to objective continuous sensory data. Unfortunately, correlation between continuous sensory data and a finite and discrete set of categorical variables, corresponding to independent random variables generating separable data structures (data aggregations, data clusters, data objects), is low in real-world RS image mapping problems at large data scale or fine semantic granularity, other than toy problems at small data scale and coarse semantic granularity.

Some practical conclusions of potential interest to the RS, computer vision (CV), artificial intelligence (AI) and machine learning (MAL) communities stem from these speculations. Firstly, in operational contexts (e.g., RS image classification problems at national/continental/global scale), other than toy problems (e.g., RS image mapping at coarse spatial resolution and local/regional scale), inductive classifiers capable of learning from a finite labeled data set are considered structurally inadequate to correlate (rather than extract, see this text above) discrete semantic<sup>2</sup> information with objective sensory data provided, *per se*, with no semantics at all.

Secondly, to increase the operational quality indicators (QIs) of existing two-stage hybrid RS-IUSs (namely, degree of automation, accuracy, efficiency, robustness to changes in input parameters, robustness to changes in the input data set, scalability, timeliness and economy), any first-stage inductive MAL-from-examples approach should be replaced by a deductive Machine Teaching (MAT)-by-rules capable of generating a preliminary classification first stage where small, but genuine image details are well preserved (Baraldi et al., 2006; Baraldi et al., 2010a; Baraldi et al., 2010b; Baraldi, 2011a).

Thirdly, in RS-IUSs, MAL-from-data algorithms, either labeled (supervised) or unlabeled (unsupervised), either context-insensitive (e.g., pixel-based) or context-sensitive (e.g., 2-D object-based), should be adapted to work on a driven-by-knowledge stratified (semantic masked, layered) basis and moved to the second stage of a novel two-stage stratified hierarchical hybrid RS-IUS architecture recently proposed in RS literature (Baraldi et al., 2006a; Baraldi et al., 2010a; Baraldi et al., 2010b; Baraldi et al., 2010c; Baraldi, 2011a; Baraldi, 2011b).

As a proof of these concepts, the operational automatic multi-sensor multi-resolution near real-time Satellite Image Automatic Mapper™ (SIAM™), recently presented in RS literature<sup>1</sup> (Baraldi et al., 2006; Baraldi et al., 2010a; Baraldi et al., 2010b; Baraldi et al., 2010c; Baraldi, 2011a; Baraldi, 2011b), is adopted as first stage.

The rest of Part II of this work is organized as follows. Part II Section 3 discusses theoretical inconsistencies and algorithmic drawbacks found in Diamant's works (discussed in Part I Section 2.2 and Part I Section 2.5). Revised/novel definitions of objective continuous sensory data, continuous physical information, discrete semantic<sup>2</sup> information and prior knowledge are provided in Part II Section 4. In Part II Section 5 practical consequences of the novel definitions provided in Part II Section 4 are considered for CV, AI and MAL applications. Part II Section 6 presents the operational automatic multi-sensor multi-resolution near real-time SIAM™ as a proof of the original concepts proposed in this work. Conclusions are reported in Part II Section 7.

<sup>1</sup> SIAM™ - Patent pending - © Andrea Baraldi University of Maryland.

### **2. Related works (continued): Taxonomy of hybrid RS-IUS architectures**

As reported in Part I Section 2.1, there is a new trend of research and development in both CV (Cootes & Taylor, 2004) and RS literature (Matsuyama & Shang-Shouq Hwang, 1990; Shunlin Liang, 2004) to outperform existing scientific and commercial image understanding systems. This novel trend focuses on the development of quantitative hybrid models for retrieving sub-symbolic continuous variables (e.g., LAI) and symbolic categorical discrete variables (e.g., land cover composition) from multi-spectral (MS) imagery. By definition, hybrid models combine both statistical and physical models to take advantage of the unique features of each and overcome their shortcomings (see Part I Section 2.1). The study of hybrid quantitative models is also called AI systems integration. In this section, the taxonomy of hybrid RS-IUSs is summarized in line with (Baraldi et al., 2010a). It consists of:

i. multi-agent hybrid RS-IUSs,

ii. two-stage segment-based RS-IUSs, and

iii. two-stage stratified hierarchical hybrid RS-IUSs.


#### **2.1 Multi-agent hybrid RS-IUSs**

In existing literature multi-agent hybrid RS-IUSs provide application-specific combinations of inductive and deductive inference mechanisms (Matsuyama & Shang-Shouq Hwang, 1990). A traditional multi-agent hybrid RS-IUS architecture comprises the following modules (see Fig. 1).


The combination of top-down with bottom-up inference strategies achieves two operational advantages: (a) provides better conditions for an otherwise ill-posed driven-without-knowledge segmentation first stage (refer to Part I Section 2.3) and (b) allows restriction of intensive processing to a small portion of the image data (Matsuyama & Shang-Shouq Hwang, 1990), analogously to a focus of visual attention in pre-attentive biological vision (Mason & Kandel, 1991; Gouras, 1991; Kandel, 1991). The high-level processing second stage comprises (Matsuyama & Shang-Shouq Hwang, 1990): (I) a Spatial Reasoning Expert (SRE) whose aim is to trigger the instantiation, within a candidate local area, of plausible generic (3-D) object models found in the available world model, e.g., house, and (II) a SOMSE (refer to this text above) which uses domain-dependent knowledge about specific applications to: (i) prune the search space of specialized (3-D) object models (e.g., rectangular house, L-shaped house, etc.) linked by A-KIND-OF relations to the generic target (3-D) object model (e.g., house) provided by SRE; (ii) transform the 3-D appearance properties of the specialized (3-D) object model into a selected set of 2-D appearance properties based on the imaging sensor model; (iii) transform a target spatial relation in fuzzy terms (e.g., in front of) provided by SRE into a local area based on a trial-and-error heuristic search with no concrete theoretical basis and (iv) provide a consistency examination between quantitative absolute image features collected by LLVE in a local area and the target 2-D appearance constraints. In other words, the 2-D appearance properties must be satisfied by image features extracted by LLVE from a local area. Since the image structure in a local area is very simple compared with that of the entire image, image feature extraction performed by an object model-driven and locational constrained LLVE can be very efficient and reliable compared with that performed by the same LLVE run image-wide at the first stage (Matsuyama & Shang-Shouq Hwang, 1990) (p. 41).
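As an illustration only, SOMSE steps (i) and (iv) described above can be sketched as a small data structure plus a consistency check. This is a minimal sketch, not the implementation of Matsuyama & Shang-Shouq Hwang: the class `ObjectModel`, its fields, the feature names (`corner_count`, `elongation`) and all numeric ranges are hypothetical and chosen only to make the example runnable.

```python
from dataclasses import dataclass, field


@dataclass
class ObjectModel:
    """A (3-D) object model of the world model (hypothetical structure)."""
    name: str
    # 2-D appearance constraints: feature name -> (min, max) allowed range.
    appearance_2d: dict = field(default_factory=dict)
    # Specialized models linked to this model by A-KIND-OF relations.
    specializations: list = field(default_factory=list)


def is_consistent(model, features):
    """Step (iv): every 2-D appearance constraint must be satisfied by the
    image features measured in the local area."""
    for key, (low, high) in model.appearance_2d.items():
        if key not in features or not (low <= features[key] <= high):
            return False
    return True


def select_specialized_models(generic, features):
    """Step (i): prune the specializations of a generic model, keeping only
    those whose 2-D constraints match the measured features."""
    return [m.name for m in generic.specializations if is_consistent(m, features)]


# Generic model "house" with two specializations (illustrative values only).
house = ObjectModel(
    name="house",
    specializations=[
        ObjectModel("rectangular house", {"corner_count": (4, 4), "elongation": (1.0, 3.0)}),
        ObjectModel("L-shaped house", {"corner_count": (6, 6), "elongation": (1.0, 3.0)}),
    ],
)

# Hypothetical features extracted by the LLVE from one candidate local area.
local_features = {"corner_count": 6, "elongation": 1.8}
print(select_specialized_models(house, local_features))
```

In this sketch the search space is pruned simply by rejecting every specialized model with at least one violated constraint; a real system would also need steps (ii) and (iii), which depend on the imaging sensor model and are not modeled here.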

Fig. 1. Multi-agent hybrid systems for RS image understanding (derived from Figure 2.1 in (Matsuyama & Shang-Shouq Hwang, 1990), p. 36).

Fig. 1 block labels: **(3-D) World model** (1. 3-D object model appearance properties; 2. Generalization / specialization hierarchy based on A-KIND-OF relations; 3. Hierarchy based on PART-OF relations; 4. Ontology of fuzzy spatial relations between different classes of objects), **Spatial Reasoning Expert (SRE)**, **Specialized Object Model Selection Expert (SOMSE)**, **Low-Level Vision Expert (LLVE)**, **(2-D) Image**. Other labels: Query / Answer (twice), 1÷4, 1, 2, 3-D scene features, 2-D image features.


Table 1. SIAM™ system of systems. List of spaceborne optical imaging sensors eligible for use as input.

Multi-agent hybrid systems typically suffer from two main limitations.

- In addition to the intrinsic insufficiency of image features, e.g., due to occlusion and dimensionality reduction (refer to Part I Section 2.3), these systems are affected by the so-called artificial insufficiency caused by the inherent ill-posedness of the image segmentation problem (Matsuyama & Shang-Shouq Hwang, 1990) (see Part I Section 2.4.1.2). This means that in RS common practice any first-stage image segmentation algorithm is simultaneously affected by both omission and commission segmentation errors. Although the inherent ill-posedness of image segmentation is acknowledged by a reasonable portion of existing literature (Burr & Morrone, 1992; Corcoran et al., 2010; Corcoran & Winstanley, 2007; Delves et al., 1992; Hay & Castilla, 2006; Matsuyama & Shang-Shouq Hwang, 1990; Petrou & Sevilla, 2006; Vecera & Farah, 1997), this is often forgotten by a large segment of the RS community where literally dozens of "novel" segmentation algorithms are published each year (Zamperoni, 1996) (refer to Part I Section 2.4.1.2).
- Semantic nets lack flexibility and scalability to cope with changes in sensor characteristics and users' changing needs, i.e., they are unsuitable for commercial RS image processing software toolboxes and remain limited to scientific applications.

To overcome these limitations, an alternative two-stage stratified hierarchical hybrid RS-IUS architecture, such as that shown in Fig. 3, was proposed in recent literature (Baraldi et al., 2006; Baraldi et al., 2010a; Baraldi et al., 2010b; Baraldi, 2011a; Baraldi, 2011b; Baraldi et al., 2010c).

#### **2.2 Two-stage segment-based RS-IUSs**

Two-stage segment-based RS-IUSs comprise an inductive driven-without-knowledge image segmentation first stage and a second-stage object-based classifier, see Fig. 2. The latter can be implemented based on deductive or inductive inference mechanisms, say, as a prior knowledge-based non-adaptive decision-tree or a supervised data learning classifier (e.g., a Support Vector Machine, SVM (Bruzzone & Carlin, 2006)).

Due to the availability of commercial GEOBIA software developed by a German company (Definiens Imaging GmbH, 2004; Esch et al., 2008), two-stage segment-based RS-IUSs have recently gained widespread popularity and are currently considered the state-of-the-art in both scientific and commercial RS image mapping application domains (Mather, 1994; Pekkarinen, Reithmaier & Strobl, 2009). In practice, under the guise of 'flexibility' current commercial 2-D object-based software provides overly complicated options to choose from (Hay & Castilla, 2006). This means that with their increasing diffusion commercial two-stage segment-based RS-IUSs show an increasing lack of productivity (Tapsall et al., 2010), consensus and research (Castilla et al., 2008; Hay & Castilla, 2006) (refer to Part I Section 2.4.1.2).

Fig. 2. Two-stage segment-based hybrid RS-IUS architecture adopted, for example, by the eCognition commercial software toolbox (Definiens Imaging GmbH, 2004). Preliminary image simplification is pursued by means of an (ill-posed hierarchical) image segmentation approach which generates as output a segmented (discrete) map, either single-scale or multi-scale. Worthy of note is that first-stage output sub-symbolic informational primitives, namely, labeled segments (2-D objects, parcels), e.g., segment 1, segment 2, etc., are provided with no semantic meaning.

#### **2.3 Two-stage stratified hierarchical hybrid RS-IUS employing SIAM™ as its preliminary classification first stage**

Accounting for the customary distinction between a model and the algorithm used to identify it (Baraldi et al., 2010a; Baraldi, 2011a), an original two-stage stratified hierarchical hybrid RS-IUS architecture (see Fig. 3) was identified starting from several RS-IUS implementations proposed by Shackelford and Davis in recent years (Shackelford & Davis, 2003a; Shackelford & Davis, 2003b). This novel RS-IUS architecture comprises the following phases (Baraldi et al., 2006; Baraldi et al., 2010a; Baraldi et al., 2010b; Baraldi et al., 2010c; Baraldi, 2011a; Baraldi, 2011b).

a. A radiometric calibration pre-processing stage, where DNs are transformed into top-of-atmosphere reflectance (TOARF) or surface reflectance (SURF) values, with TOARF ⊇ SURF, the latter being an ideal (atmospheric noise-free) case of the former. This radiometric calibration constraint not only ensures the harmonization and interoperability of multi-source observational data in line with the Quality Assurance Framework for EO (QA4EO) guidelines (GEO/CEOSS, 2008), but is also considered a necessary, although not sufficient, condition for input Earth observation (EO) imagery to be automatically interpreted (see Part I Section 2.7.1). It is worth mentioning that an RS-IUS suitable for mapping TOARF values into surface categories makes the inherently ill-posed (therefore, difficult to solve) atmospheric correction problem an optional MS image pre-processing stage, unlike competing classification approaches employing surface reflectance spectra, such as the ERDAS ATCOR3 (Richter, 2006) (see Part I Section 2.7.1).

b. A first-stage application-independent per-pixel (non-contextual) top-down (prior knowledge-based, see Part I Section 2.1) preliminary classifier in the Marr sense (Marr, 1982).

c. A second-stage battery of stratified hierarchical context-sensitive application-dependent modules for class-specific feature extraction and classification.

Fig. 3. Novel hybrid two-stage stratified hierarchical RS-IUS architecture. This data flow diagram (DFD) shows processing blocks as rectangles and sensor derived data products as circles. In this example, a SPOT-5 MS image is adopted as input. The panchromatic (PAN) image can be generated from the MS image. The MS image is input to the preliminary classification first stage and, if useful, to second-stage class-specific classification modules. The PAN image is exclusively employed as input to second-stage stratified class-specific context-sensitive classification modules, where color information is dealt with by stratification. For example, stratified texture detection is computed in the PAN image domain, which reduces computation time.

Fig. 3 labels: SPOT-5 XS 10 m resolution or below Input Image; SPOT-5 PAN 10 m resolution or below Input Image; Radiometric calibration of DNs into TOA reflectance values; Preliminary spectral rule-based classification (SRC); Grass/Tree Mask; Water/Shadow Mask; Road/Building/Barren land Mask; Rest of the world (e.g., Snow Mask, Cloud Mask, Smoke plume Mask); Stratum 1 Two-class Fuzzy Rule-based Classifier (Semantic net); Stratum 2 Fuzzy Rule-based Classifier (Semantic net); Stratum 3 Two-class Fuzzy Rule-based Classifier (Semantic net); Stratum 4 Three-class Fuzzy Rule-based Classifier (Semantic net); Stratum 4 (better-posed) Segmentation; Morpho. Top-hat open and close, Contrast Texture Feature; Morpho. Top-hat open and close, Contrast Texture Feature, Length-Width Local Feature; Defuzzification (Crisp 1-of-7 class label). Numeric labels: 1 to 8.

In (Baraldi et al., 2006; Baraldi et al., 2010a; Baraldi et al., 2010b; Baraldi et al., 2010c; Baraldi, 2011a; Baraldi, 2011b), the abovementioned first-stage pixel-based preliminary classifier was designed and implemented as an original operational automatic near-real-time per-pixel multi-source multi-resolution application-independent SIAM™. To employ as input a radiometrically calibrated MS image acquired by almost any of the ongoing or future planned satellite optical missions, SIAM™ is designed as an integrated system of systems. It comprises a "master" 7-band Landsat-like SIAM™ (L-SIAM™) together with five downscaled ("slave", derived) versions of L-SIAM™ whose input is a MS image featuring a spectral resolution that overlaps with, but is inferior to, Landsat's. To summarize, SIAM™ combines six sub-systems (refer to Table 1).
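The radiometric calibration pre-processing stage (phase a above) can be illustrated with the standard two-step conversion commonly documented for Landsat-like sensors: DN to at-sensor spectral radiance through a linear gain and offset, then radiance to TOARF by normalizing for solar irradiance, sun angle and Earth-Sun distance. The chapter does not spell out this formula, so the following is a minimal sketch under that assumption; the function name and all numeric values in the example call are hypothetical and are not calibration coefficients of any real sensor.

```python
import math


def dn_to_toa_reflectance(dn, gain, offset, esun, sun_elevation_deg,
                          earth_sun_distance_au=1.0):
    """Convert a digital number (DN) into top-of-atmosphere reflectance.

    radiance    = gain * dn + offset
    reflectance = pi * radiance * d^2 / (esun * cos(solar_zenith))
    """
    if not 0.0 < sun_elevation_deg <= 90.0:
        raise ValueError("sun elevation must be in (0, 90] degrees")
    radiance = gain * dn + offset
    solar_zenith = math.radians(90.0 - sun_elevation_deg)
    return (math.pi * radiance * earth_sun_distance_au ** 2
            / (esun * math.cos(solar_zenith)))


# Illustrative call with made-up band coefficients (assumptions, not real values).
toarf = dn_to_toa_reflectance(dn=120, gain=0.8, offset=-2.0, esun=1500.0,
                              sun_elevation_deg=55.0,
                              earth_sun_distance_au=1.01)
print(round(toarf, 4))
```

Because the result is a physical, unitless quantity, images acquired by different sensors become comparable on the same scale, which is the harmonization property that phase (a) relies on.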


Table 2. Preliminary classification map legend adopted by L-SIAM™ at fine semantic granularity. Pseudo-colors of the 95 spectral categories are gathered based on their spectral end member (e.g., bare soil or built-up) or parent spectral category (e.g., "high" LAI vegetation types). The pseudo-color of a spectral category is chosen so as to mimic the natural colors of pixels belonging to that spectral category.

Table 3. Preliminary classification map legend adopted by I-SIAM™ at fine semantic granularity. Pseudo-colors of the 52 spectral categories are gathered based on their spectral end member (e.g., bare soil or built-up) or parent spectral category (e.g., "high" LAI vegetation types). The pseudo-color of a spectral category is chosen so as to mimic the natural colors of pixels belonging to that spectral category.

Fig. 4 to Fig. 6 show qualitatively that, in disagreement with a common opinion in the RS community where GEOBIA is considered indispensable for spaceborne VHR image understanding (Bruzzone & Carlin, 2006; Bruzzone & Persello, 2009; Persello & Bruzzone, 2010), the pixel-based SIAM™ is very successful in the automatic mapping of RS imagery, including VHR images (Baraldi et al., 2006; Baraldi et al., 2010a; Baraldi et al., 2010b; Baraldi et al., 2010c; Baraldi, 2011a; Baraldi, 2011b). This means that SIAM™ is not affected by the well-known salt-and-pepper classification noise effect which traditionally affects ordinary pixel-based classifiers (e.g., maximum-likelihood classifiers (Cherkassky and Mulier, 2006)), which is tantamount to saying that SIAM™ is successful in modeling the within-spectral-category variance.

Fig. 4(a). Web-Enabled Landsat Data (WELD) Project (USGS & NASA, 2011). This is a joint NASA and USGS project providing seamless consistent mosaics of fused Landsat-7 Enhanced TM Plus (ETM+) and MODIS data radiometrically calibrated into top-of-atmosphere reflectance (TOARF) and surface reflectance. These mosaics are made freely available to the user community. Each consists of 663 fixed location tiles. Spatial resolution: 30 m. Area coverage: Continental USA and Alaska. Period coverage: 7-year. Product time coverage: weekly, monthly, seasonal and annual composites.


Fig. 4(b). Including the map of Alaska at the top right. Preliminary classification map automatically generated by L-SIAM™ from the 2008 annual WELD mosaic shown in Fig. 4(a). Output spectral categories are depicted in pseudo colors. Map legend: refer to Table 2. To generate this map at national scale L-SIAM™ was run overnight by L. Boschetti (Univ. of Maryland) in Dec. 2010. To the best of this author's knowledge, this is the first example of such a high-level product automatically generated at both the NASA and USGS.

Fig. 5(a). 4-band GMES-IMAGE2006 Coverage 1 mosaic, consisting of approximately two thousand 4-band IRS-P6 LISS-III, SPOT-4, and SPOT-5 images, mostly acquired during the year 2006, depicted in false colors: Red – Band 4 (Short Wave InfraRed, SWIR), Green – Band 3 (Near IR, NIR), Blue – Band 1 (Visible Green). Down-scaled spatial resolution: 25 m.

Fig. 5(b). Preliminary classification map automatically generated by S-SIAM™ from the mosaic shown in Fig. 5(a). Output spectral categories are depicted in pseudo colors. A map legend similar to Table 2 is adopted: water and shadow areas are in blue, clouds in white, snow and ice in light blue, vegetation types in different shades of green, rangeland types in different shades of light green, barren land types in different shades of brown and grey. To the best of this author's knowledge, this is the first example of such a high-level product automatically generated at the European Commission – Joint Research Center (EC-JRC).


Fig. 6(a). QuickBird-2 image, 2.4 m spatial resolution, acquisition date 2010-03-16, radiometrically calibrated into TOARF values, depicted in false colors (R: 3, G: 4, B: 1). Default image histogram stretching: ENVI linear stretching 2%.

Fig. 6(b). Automatic Q-SIAM™ preliminary mapping of the QB-2 image shown in Fig. 6(a). Spectral categories are depicted in pseudo colors. Map legend: see Table 3. It is noteworthy that, within the Q-SIAM™ mutually exclusive and completely exhaustive classification scheme, cloud detection is *per se* an interesting operational product with relevant commercial applications and, to the best of these authors' knowledge, without alternative solutions in either commercial or scientific RS-IUSs.

Fig. 7(a). Zoomed area of a Landsat 7 ETM+ image of Virginia, USA (path: 16, row: 34, acquisition date: 2002-09-13), depicted in false colors (R: band ETM5, G: band ETM4, B: band ETM1), 30 m resolution, calibrated into TOARF values.

Fig. 7(b). 2nd-stage stratified vegetated land cover classification map generated in series with the L-SIAM™ first stage from Fig. 7(a). This 2nd-stage map consists of 19 vegetated/non-vegetated land cover classes, depicted in pseudo-colors, including: crop field or grassland, broad-leaf forest, needle-leaf forest and non-vegetated pixels (in black). Input features are: spectral layers generated by L-SIAM™, (achromatic) brightness and multi-scale isotropic texture features extracted from the brightness image.


To the best of this author's knowledge no unifying automatic multi-sensor multi-resolution near real-time RS image classification platform alternative to SIAM™ can be found in existing literature. This is tantamount to saying that SIAM™ provides the first operational example of an automatic multi-sensor multi-resolution near real-time EO system of systems envisaged under on-going international research programs such as the Global EO System of Systems (GEOSS) conceived by the Group on Earth Observations (GEO) (GEO, 2005; GEO, 2008a) and the Global Monitoring for the Environment and Security (GMES), which is an initiative led by the European Union (EU) in partnership with the European Space Agency (ESA) (ESA, 2008; GMES, 2011) (see Part I Section 1).

Fig. 7 shows an example of an automatic 2nd-stage stratified rule-based vegetated land cover classification system in series with the L-SIAM™ first stage. The two-stage automatic classifier employing L-SIAM™ as preliminary classification first stage (refer to Fig. 3) is input with a 7-band Landsat image radiometrically calibrated into TOARF values, shown in Fig. 7(a). The 2nd-stage stratified rule-based vegetated land cover classification system in series with the L-SIAM™ first stage employs as input features: spectral-based layers (strata, generated by L-SIAM™ at first stage), (achromatic) brightness and multi-scale isotropic texture extracted from the brightness image. The 2nd-stage classifier provides as output a classification map consisting of 19 vegetated/non-vegetated land cover classes, depicted in pseudo-colors, including: crop field or grassland, broad-leaf forest, needle-leaf forest and non-vegetated pixels (in black), see Fig. 7(b).
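The data flow described above (a per-pixel prior knowledge-based first stage producing strata, followed by a second stage that analyzes brightness texture only inside a given stratum) can be sketched as follows. This is a toy illustration and not SIAM™: the NDVI-style rule, every threshold, the 3×3 brightness-range texture proxy and the simplified class name "forest" are assumptions introduced here only to make the stratification idea concrete.

```python
def first_stage(pixel):
    """Per-pixel, non-contextual rule on (red, nir) reflectance values.
    Thresholds are hypothetical."""
    red, nir = pixel
    total = red + nir
    ndvi = (nir - red) / total if total > 0 else 0.0
    if ndvi > 0.4:
        return "vegetation"
    if ndvi < 0.0:
        return "water_or_shadow"
    return "other"


def stratified_range(brightness, strata, row, col, stratum):
    """Texture proxy: brightness range over the 3x3 neighbors that belong
    to the same stratum as the center pixel."""
    values = [
        brightness[r][c]
        for r in range(max(0, row - 1), min(len(brightness), row + 2))
        for c in range(max(0, col - 1), min(len(brightness[0]), col + 2))
        if strata[r][c] == stratum
    ]
    return max(values) - min(values)


def two_stage_classify(ms_image, texture_threshold=0.05):
    """First stage builds strata; second stage refines the vegetation
    stratum using texture computed on the brightness image."""
    strata = [[first_stage(p) for p in line] for line in ms_image]
    brightness = [[sum(p) / len(p) for p in line] for line in ms_image]
    result = []
    for r, line in enumerate(strata):
        out = []
        for c, stratum in enumerate(line):
            if stratum != "vegetation":
                out.append("non-vegetated")
            elif stratified_range(brightness, strata, r, c, stratum) > texture_threshold:
                out.append("forest")
            else:
                out.append("crop field or grassland")
        result.append(out)
    return result


# Tiny synthetic (red, nir) image in reflectance units (made-up values).
ms_image = [
    [(0.05, 0.45), (0.05, 0.45), (0.04, 0.30), (0.08, 0.60)],
    [(0.05, 0.45), (0.05, 0.45), (0.10, 0.05), (0.30, 0.35)],
]
for line in two_stage_classify(ms_image):
    print(line)
```

The design point illustrated is that texture is never computed across stratum boundaries, so the second stage works on a better-conditioned, smaller problem than an image-wide texture analysis would.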

#### **3. Inconsistencies and limitations of the Diamant computational theory and algorithms**

An original analysis of the Diamant definitions reported in Part I Section 2.2.3 and Diamant's image segmentation and contour detection algorithms summarized in Part I Section 2.5 is provided below.

#### **3.1 Comments on the Diamant definitions of data, information and knowledge**

According to this author, the Diamant definitions reported in Part I Section 2.2.3 are affected by three major drawbacks.

i. Diamant states that "information elicitation (extraction) does not require incorporation of any high-level knowledge" (Diamant, 2010a; Diamant, 2010b), which is tantamount to saying that detection of non-semantic primary data structures (data objects), e.g., (2-D) image segments, in an unlabeled data set, e.g., a (2-D) image, does not require incorporation of any high-level (prior) knowledge. Based on this statement it is possible to conclude that despite his theoretical anti-conformism, namely, his willingness to replace the MAL-from-examples paradigm with the MAT-by-rules approach, Diamant is a conformist in practice. In fact, the Diamant image contour detection and image segmentation algorithms (see Part I Section 2.5) fit existing CV system architectures well established in literature, such as, respectively, the Marr CV system architecture, conceived in the 1980s and comprising a zero-crossings (contour detection) primal sketch, and RS-IUSs where an image segmentation first stage is adopted in agreement with the GEOBIA approach (see Part I Section 2.4.1.2). In other words, there is a clear contradiction in terms between the Diamant claim of replacing the MAL-from-examples with a MAT-by-rules paradigm and his practical proofs of concept, consisting of image segmentation and contour detection algorithms 100% consistent with the same MAL-from-examples paradigm he intends to overcome.


ii. If the Diamant CV system coincides with a Marr CV system or a GEOBIA approach (refer to paragraph (i) above), then, in practical contexts, its operational QIs (see Part I Section 2.8) are expected to score as low as Marr's or OBIA's (refer to Part I Section 1, Part I Section 2.4.1.2 and Part II Section 2). At the level of understanding of an information processing system known as computational theory (system architecture, see Part I Section 2.6), GEOBIA scores low in operational contexts because, according to the present author, it goes symbolic as late as possible, namely, at the output of its second and last stage (see Fig. 2). This is in contrast with an important intuition by Marr stating that "vision goes symbolic almost immediately, right at the level of zero-crossings (first-stage primal sketch)… without loss of information" (Marr, 1982) (p. 343) (see Part I Section 2.3).

iii. To recover from the gap existing between Diamant's theoretical anti-conformism, but practical conformism (refer to paragraphs (i) and (ii) above), it is sufficient to observe that statements such as "information elicitation (aggregation) does not require incorporation of any high-level knowledge" (Diamant, 2010a; Diamant, 2010b), are in clear contradiction with a relevant section of existing literature (see Part I Section 2.4.1.2). In particular, Diamant considers primary data structures, equivalent to non-semantic data objects (e.g., image segments), as "natural data structures which reflect some similarities among neighboring elements in the data. Therefore, defining them is certainly a well-grounded procedure that does not raise any objection, because objective (physical) laws underpin such a procedure" (Diamant, 2010a) (see Part I Section 2.2.3.2). In other words, "physical information, being a natural property of the data, can be extracted instantly from the data, and any special rules for such task accomplishment are not needed" (Diamant, 2010a). Unfortunately, no well-grounded (well-posed) inductive learning-from-unlabeled-data approach exists (see Part I Section 2.1). For example, both unlabeled data clustering and (2-D) image segmentation algorithms are inherently ill-posed (see Part I Section 2.4.1). By adopting the Diamant terminology it is possible to state that detection of "discernable" data structures is not at all a physical problem of objective nature: it is rather a typical semantic problem of a qualitative (subjective) nature, where prior knowledge (provided by an external supervisor) must come into play to make the inherently ill-posed inductive learning-from-data problem better posed, although subjective (see Part I Section 2.1). This is tantamount to saying that the conceptual foundation of GEOBIA, i.e., the relationship between inherently ill-posed sub-symbolic (2-D) image segments and symbolic (3-D) landscape objects, remains affected by a lack of general consensus and research (Hay & Castilla, 2006) (see Part I Section 2.4.1.2).

To conclude, Diamant appears to have totally misunderstood one of two facts about the MAL-from-examples paradigm. These two facts hold true for MAL from unlabeled data and MAL from labeled data algorithms, respectively, as described below.

a. MAL from unlabeled (unsupervised) data (see Part I Section 2.1 and Part I Section 2.4.1). Any machine learning from unlabeled data approach (e.g., unlabeled data clustering, image segmentation) is inherently ill-posed and requires prior knowledge to become better posed. It means that any attempt to extract non-semantic primary data structures (data objects), e.g., image segments and unlabeled data clusters, from an unlabeled data set (e.g., an image) without incorporation of high-level knowledge provided by an external supervisor is a fatal misconception, committed by Diamant himself, stemming from the fallacies (inherent ill-posedness) of the MAL-from-examples paradigm.

b. MAL from labeled (supervised) data (see Part I Section 2.1 and Part I Section 2.4.2). It is true that, in Diamant's words, "knowledge about the rules that underpin (semantic) secondary (data) structures formation (from primary data structures considered as nonsemantic and driven-without-knowledge) is a property of human observers (or their artificial counterparts) and not an inherent property of the data... (therefore) attempts to extract semantics from data are a fatal misconception stemming from the fallacies of the data-processing paradigm..." (Diamant, 2010a). This quote implies that no semantic information can be extracted from objective sensory data, but a correlation function can be established between semantic concepts and objective data for toy data understanding problems exclusively (refer to Part I Section 1 and Part I Section 2.1).

#### **3.2 Comments on the Diamant image segmentation algorithm**

In practical terms, the image segmentation algorithm proposed by Diamant can be subjected to the following criticisms.

	- Degree of automation. The following questions remain unanswered. How many free parameters of the image segmentation algorithm must be user-defined? Do these user-defined parameters have a physical meaning? What is their range of variation?
	- Robustness to changes in input parameters to be user-defined.
	- Robustness to changes in the input data set acquired across time, space and sensors. In his paper (Diamant, 2005) Diamant applies his image segmentation algorithm to a single toy problem whose input data set consists of a panchromatic image 640×480 pixels in size. What about color images? What about satellite imagery? What about synthetic images of known visual properties?
	- Scalability. For example, does this image segmentation algorithm apply to data sets of different spatial scales, e.g., mosaics of hundreds of satellite images to generate classification maps at global scale where small but genuine image details (e.g., one pixel-wide roads) must be well preserved? I am afraid it does not... Does it apply to different sensors and users?
	- Efficiency in computation time and memory occupation.
	- Accuracy in terms of spatial quality of the segment boundaries (Baraldi et al., 2005; Persello & Bruzzone, 2010).

The conclusion is that based on existing literature the overall quality of the Diamant image segmentation algorithm remains unknown, which is often the case with the dozens of alternative image segmentation algorithms published in RS and CV literature each year (refer to Part I Section 2.4.1.2). Perhaps it is also due to these implementation shortcomings that so many researchers and practitioners ignored or criticized Diamant's methodological speculations.

	- The Diamant image segmentation algorithm is not quantitatively compared (see Part I Section 2.8) against at least one alternative approach in a test image set consisting of both real and synthetic images (Baraldi et al., 2010c).
	- The image segmentation algorithm proposed in (Diamant, 2005) is not technically sound.
	- In (Diamant, 2005) Diamant writes "segmentation/classification" and then "spatially connected regional groups (of pixels)" as "clusters" rather than segments, blobs or regions (see Part I Section 2.3). It is well known that (2-D) image segmentation, labeled (supervised) data classification and unlabeled (unsupervised) data clustering are completely different inductive learning-from-data problems (see Part I Section 2.4). Mixing these terms is a relevant conceptual mistake.
	- It is well known that image region extraction is the dual task of edge detection, in fact they are both inherently ill-posed inductive learning-from-unlabeled data problems (see Part I Section 2.4.1.2). In (Diamant, 2005), quite surprisingly Diamant acknowledges the ill-posedness of edge detection, but appears to ignore the inherent ill-posedness (subjective nature) of image region extraction acknowledged by a relevant portion of existing literature (see Part I Section 2.4.1.2). In fact, he states: "the efficiency of (my own) unsupervised top-down directed region-based (learning from unlabeled data) image segmentation is hard to disprove today" (Diamant, 2005). For example, by replacing pixels belonging to the same segment with their segment-based mean value (often called mean image), Diamant's image segmentation algorithm provides as output a piecewise constant approximation of the input image. Of course, researchers and practitioners interested in texture segmentation would find the Diamant piecewise constant image segmentation of little utility. In fact, the Diamant image segmentation algorithm incorporates no texture model. In practice, it detects texture elements (textons) rather than textures (made of textons) in the image. This accounts for the subjective nature of the image segmentation problem which is apparently ignored by Diamant.
	- Breaking points and failure modes of the implemented algorithm are not documented in the paper.
	- Conclusions are not properly supported by results contained in the manuscript. Indeed claims such as "the efficiency of (my own) unsupervised top-down directed region-based (learning from unlabeled data) image segmentation is hard to disprove today" (Diamant, 2005) are completely unjustified in both theoretical and practical terms (see previous comments).

To summarize, the Diamant image segmentation algorithm appears as "yet another image segmentation algorithm" (Baraldi et al., 2010a) based on heuristics whose superiority against alternative approaches is completely unproved. In other words, the image segmentation algorithm proposed by Diamant cannot be considered as adequate proof of his concepts (see Part I Section 2.2.3.2).

#### **3.3 Comments on the Diamant contour detector**

In practical terms, the contour detection algorithm proposed by Diamant can be subjected to the following criticisms.

	- Correlation between Iint = Eq. (1-3) and status = Eq. (1-4) can be relevant, i.e., Iloc = Eq. (1-2) = Eq. (1-3) × Eq. (1-5) is the product of two correlated contrast values where one-of-two is absolute valued.
	- Term Iint = Eq. (1-3) is not consistent with the psychophysical phenomenon of the Mach bands: where a luminance (radiance, intensity) ramp meets a plateau, there are spikes of brightness (perceived luminance), whereas there are none in the luminance profile. This is the sole case of continuity in the luminance profile capable of generating spikes of brightness (Baraldi & Parmiggiani, 1996a).
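
The Mach band stimulus mentioned in the last point can be sketched numerically. The snippet below is only an illustration under stated assumptions: it reproduces neither Diamant's Eq. (1-3) nor the model discussed in (Baraldi & Parmiggiani, 1996a). The profile values and the function names `ramp_plateau_profile`, `first_difference` and `center_surround` are hypothetical, and the centre-minus-neighbours operator is a crude stand-in, assumed here, for a lateral-inhibition response. It shows that a plateau-ramp-plateau luminance profile contains no step (first differences never exceed the ramp slope), while the toy operator responds only at the two knees where the ramp meets the plateaus.

```python
def ramp_plateau_profile(low=10.0, high=50.0, plateau=5, ramp=8):
    """Hypothetical 1-D luminance profile: low plateau, linear ramp, high plateau."""
    step = (high - low) / ramp
    rising = [low + step * k for k in range(1, ramp)]
    return [low] * plateau + rising + [high] * plateau


def first_difference(signal):
    """Contrast between neighbouring samples (bounded by the ramp slope)."""
    return [b - a for a, b in zip(signal, signal[1:])]


def center_surround(signal):
    """Twice the difference between a sample and the mean of its two neighbours.

    Assumed toy operator, used only to mimic a centre-surround response.
    """
    return [2 * signal[i] - signal[i - 1] - signal[i + 1]
            for i in range(1, len(signal) - 1)]


profile = ramp_plateau_profile()
slopes = first_difference(profile)
response = center_surround(profile)

# Indices into `profile` where the toy operator gives a non-zero response.
knees = [i + 1 for i, value in enumerate(response) if abs(value) > 1e-9]

print("largest first difference:", max(slopes))
print("knee positions:", knees)
print("responses at knees:", [response[i - 1] for i in knees])
```

With these hypothetical values the operator is silent along the plateaus and along the ramp, negative at the foot of the ramp (position 4) and positive at its top (position 12), which is where a dark and a bright Mach band are perceived.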

To summarize, the Diamant contour detector appears to be neither new nor biologically plausible. It can be considered as "yet another contour detector" (Baraldi et al., 2010a) based on heuristics whose superiority against alternative approaches is completely unproved. In other words, the contour detector proposed by Diamant cannot be considered as adequate proof of his concepts (see Part I Section 2.2.3.2).

#### **4. Revised/novel definitions of objective continuous sub-symbolic sensory data, continuous physical information, subjective discrete semi-symbolic data structure, discrete semantic-square (semantic<sup>2</sup>) information and prior knowledge base**

As a revision of Diamant's works (Diamant, 2005; Diamant, 2008; Diamant, 2010a; Diamant, 2010b), a new set of definitions is proposed below for: (i) sub-symbolic objective primary data element in an objective sensory data set, (ii) semi-symbolic subjective secondary data structure, (iii) objective physical information, (iv) subjective semantic-square (semantic<sup>2</sup>) information and (v) subjective prior knowledge base (ontology or model of the 3-D world) provided by an external subjective supervisor (human, God or equivalent machine).

#### **4.1 Levels of aggregation of objective continuous sub-symbolic sensory data**

There are five fine-to-coarse possible levels of aggregation of objective continuous sub-symbolic sensory data. These levels of aggregation are either sub-symbolic (non-semantic), semi-symbolic or symbolic. Semi-concepts are defined as stable concepts (percepts, classes of 3-D objects in the world) whose semantic meaning is adopted at the bottom level (layer 0) of an ontology (see Part I Section 2.2.2). The semantic information of semi-concepts (e.g., in an RS image, land cover semi-concepts are spectral categories such as *water or shadow*, *snow or ice*, *bare soil or built-up*, *vegetation,* etc.) is superior to that of objective data, whose semantic information is null, but equal or inferior (i.e., not superior) to that of concepts belonging to higher levels of abstraction (aggregation) in the ontology at hand (e.g., in an RS image classification taxonomy such as the International Geosphere-Biosphere Programme (IGBP) land cover classification scheme (FAO, 2000), target (3-D) land cover classes are *water bodies*, *snow or ice*, *barren*, *urban and built-up*, *needle-leaf forest, broad-leaf forest, mixed forest, shrubland, grassland, cropland,* etc.) (Baraldi et al., 2006; Baraldi et al., 2010a; Baraldi et al., 2010b; Baraldi et al., 2010c; Baraldi, 2011a; Baraldi, 2011b). An ontology is a hierarchical abstract representation (model) of the (3-D) world. Well-known examples of RS data classification taxonomies are the aforementioned IGBP land cover classification scheme (FAO, 2000), the Co-ordination of Information on the Environment (CORINE) (European Commission Joint Research Center, 2005), the U.S. Geological Survey (USGS) classification hierarchy (Lillesand & Kiefer, 1994) and the Food and Agriculture Organization of the United Nations (FAO) Land Cover Classification System (LCCS) (Di Gregorio & Jansen, 2000; Herold et al., 2006). An ontology can be modeled as a semantic network consisting of a hierarchical class taxonomy, represented as an inverted tree whose leaves are at the bottom layer 0, plus relationships between classes as arcs between nodes (refer to Part I Section 2.2.2).

The five fine-to-coarse possible levels of aggregation of objective sub-symbolic sensory data are listed below.


1. **An unlabeled objective continuous (quantitative) sub-symbolic (non-semantic) sensory** *scalar data element***.** For example, a one-band pixel value in an image, a character in a vocabulary, etc. This is a scalar (simple, atomic, elementary, primitive) fact (measurement, sign, symbol, character, element) resulting from an observation (examination, inspection, monitoring, measurement) of the (3-D) world.

2. **An unlabeled objective continuous sub-symbolic** *primary data vector* **/** *primary data n-tuple* **/** *primary data element*, where *n* ≥ 1 is the vector dimensionality. Each primary data n-tuple consists of *n* ≥ 1 scalar data elements, e.g., a multi-spectral pixel value in an image, a word in a dictionary, etc. In the rest of this paper, if an unlabeled objective data set consisting of primary data elements is discrete and finite (e.g., an image as a 2-D data array), then its cardinality is identified as *p* (e.g., an image consists of *p* pixels). In this case primary data elements may be identified by integer numbers, e.g., a pixel is identified by a (row, column) coordinate pair in a (2-D) image domain. A set of sub-symbolic primary data elements (e.g., an image) can be described according to a given mathematical vocabulary/language. For example, a 2-D array of pixels (image) can be encoded as a 2-D spatial frequency function by means of a 2-D fast Fourier transform (FFT).

3. **A finite set** (e.g., a (2-D) image array) **of** *p* **unlabeled objective continuous sub-symbolic primary data elements** (e.g., pixels), with *p* ∈ {1, ∞}**.** To be described in physical terms, a set of objective sub-symbolic primary data elements requires a mathematical vocabulary/language, e.g., a 2-D FFT of a (2-D) image. This is related to the concept of continuous physical information in an objective sensory data set (refer to this text below).

4. **A labeled subjective discrete semi-symbolic** *secondary data structure* **/** *secondary data object***.** It consists of one or more primary data elements of a given objective data set grouped together (based on any possible subjective aggregation criterion) and labeled as one semi-symbolic secondary data structure. Each label belongs to a discrete and finite set of semi-concepts. The semantic meaning of semi-concepts (e.g., *vegetation*) is superior to zero (zero being the semantic meaning of unlabeled primary data elements) and not superior (i.e., equal or inferior) to that of concepts in the real (3-D) world. A discrete and finite quantitative data set consisting of *p* unlabeled objective primary data elements (e.g., a multi-spectral image consisting of *p* pixels, refer to point 3. above) always consists of a discrete and finite set of semi-symbolic secondary data structures whose cardinality is identified hereafter as *s*, such that inequality (*s* ≤ *p*) always holds. It is noteworthy that if equality (*s* == *p*) holds, this does not correspond to a trivial case since secondary data structures are semi-symbolic while primary data elements are sub-symbolic. To the best of this author's knowledge, it is at the level of subjective semi-symbolic secondary data structures that the view of the present author starts diverging from *all* existing CV algorithms and implementations, including GEOBIA-based RS-IUSs and Diamant's image segmentation and contour detection algorithms. This degree of novelty is consistent with well-known evidence collected in CV and MAL domains. For example:


In practice, the following definition holds.

Discrete semi-symbolic secondary data structure = Continuous sub-symbolic primary data element(s) + discrete semi-symbolic label belonging to a discrete and finite set of semi-concepts (e.g., in RS image understanding, possible semi-concepts are spectral categories equivalent to land cover class sets consisting of one or more land cover classes; examples of spectral categories are *vegetation*, *water or shadow*, *bare soil or built-* *up*, etc. (Baraldi et al., 2006; Baraldi et al., 2010a; Baraldi et al., 2010b; Baraldi, 2011a; Baraldi, 2011b; Baraldi et al., 2010c)).

This also means that the set of discrete semi-symbolic secondary data structures incorporates the continuous objective sensory data set.

5. **A finite set of** (*s* ≤ *p*) **labeled secondary subjective semi-symbolic data structures, which include the objective sensory data set** (refer to point 4. above), with *s* ∈ {1, *p*}. In this author's terminology, it is called *preliminary classification map* or *primal sketch*. These terms are:

	- in line with the CV system proposed by Marr at the level of computational theory (see Part I Section 2.6) when he states: "vision goes symbolic almost immediately, right at the level of zero-crossings (primal sketch)… without loss of information" (Marr, 1982) (p. 343) (refer to Part I Section 2.3);
	- in contrast with the CV system proposed by Marr at the level of algorithm design and implementation (see Part I Section 2.5), where the term primal sketch identifies the non-symbolic output of a zero-crossings algorithm, which is an instance of the unlabeled data learning class of image edge detectors/region extractors (Marr, 1982).

It is noteworthy that in a (2-D) preliminary classification map domain, a labeled semi-symbolic segment may be defined as a spatially connected set of secondary semi-symbolic data structures featuring the same label, say, connected pixels featuring label *vegetation*. Therefore, in a (2-D) preliminary classification map domain, semi-symbolic pixels belong to semi-symbolic image segments which belong to semi-symbolic image strata (layers) defined as image-wide sets of semi-symbolic segments featuring the same semi-symbolic label. In other words, in the preliminary classification map domain, three spatial types co-exist: **semi-symbolic pixels in semi-symbolic image segments in semi-symbolic image strata**. This would end the bad-faith antagonism between unlabeled pixels versus labeled non-symbolic segments (e.g., segment 1, segment 2, etc.) which affects traditional pixel-based versus object-based RS-IUSs and CV systems (refer to Part I Table 1). A labeled subjective semi-symbolic quantitative data set can be described (encoded) according to a given pair of one mathematical and one natural vocabulary/language capable of accounting for both the quantitative and semantic (qualitative, subjective) nature of labeled subjective semi-symbolic secondary data structures (refer to point 4. above).

#### **4.2 Continuous physical information**

**Continuous physical (quantitative, objective, sensory) information. This is a hierarchical (i.e., multi-scale, including one-scale as a special case) description (representation), namely, down-scale encoding (decomposition), up-scale decoding (reconstruction) or one-scale transcoding (from one data format to another at the same hierarchical level), of the physical objective data set based on a given mathematical non-natural vocabulary/language.** This hierarchical description/representation of the objective sensory data set can be either lossless or lossy, depending on the exact/non-exact reconstruction (decoding) of the original data set from its representation (encoding). For example, an FFT of a time-signal is a one-scale transcodification of the signal from the time to the frequency domain. A well-known example of down-scale encoding/up-scale decoding is the Gaussian-Laplacian image pyramid (Burt & Adelson, 1983). It means that physical information stems from the combination of an objective data set with a mathematical non-natural vocabulary/language. To summarize the concept of physical information, we can write the following definition.

Continuous objective data set + (arbitrary) multi-scale down-scale encoding, up-scale decoding or one-scale transcoding/description/data format = hierarchical physical information encompassing down-scale/fine-to-coarse resolution/compression/encoding, up-scale/coarse-to-fine resolution/decompression/decoding, and/or one-scale transcodification (from one data format to another at the same hierarchical level), either lossless or lossy.

#### **4.3 Discrete semantic-square information**

> **Discrete semantic-square (semantic<sup>2</sup>)** (where semantic is a synonym of categorical, symbolic, subjective, abstract, qualitative, vague, but persistent, stable, see Part I Section 2.1) **information (concepts, percepts) stems from the semantic<sup>2</sup> labeling of an objective data set performed by an external subjective supervisor** (human, God or equivalent machine) **provided with a subjective hierarchical prior knowledge base** (ontology or model of the (3-D) world, equivalent to an inverted tree with leaves at the bottom level 0, see Part I Section 2.2.2). **Semantic<sup>2</sup> labeling occurs when a subjective supervisor (first source of subjectivity), provided with his/her own subjective ontology (second source of subjectivity), observes and scrutinizes the objective data set, consisting of** *p* **sub-symbolic primary data elements (refer to point 3. in Section 4.1), to achieve the following.**

This definition of semantic<sup>2</sup> labeling disagrees at the level of the aforementioned point a. with the traditional definition of semantic labeling provided by MAL, which encompasses existing CV systems (e.g., Diamant's (Diamant, 2005)) and RS-IUSs (e.g., (Definiens Imaging GmbH, 2004; Matsuyama & Shang-Shouq Hwang, 1990)). In fact, point a. above states that **semantic<sup>2</sup> information stems naturally (automatically, instantaneously) from the simultaneous interaction of three necessary and sufficient components.** These terms are:

i. An objective sensory data set.

ii. An external subjective supervisor (human or equivalent machine), acting as the first source of subjectivity in the labeling (mapping) process, who observes the objective data set and interprets/scrutinizes the objective data set to match (label) data with his/her own ontology.

iii. A subjective hierarchical (multi-scale) prior ontology which exists before looking at the data. Since it deals with semantic information, a prior knowledge base is subjective by definition (since subjective and semantic are synonyms, refer to Part I Section 2.1). In practice, this ontology acts as the second source of subjectivity in the labeling (mapping) process. According to Diamant this hierarchical ontology is equivalent to a narrative story or tale which requires a natural language to comprise, in a top-down representation: the story title, index, sections, paragraphs, sentences and words. It is graphically represented and implemented as a semantic net or inverted tree whose leaves are at the bottom level 0 where physical information is incorporated (refer to this text above and Part I Section 2.2.2).

The aforementioned points i.-iii. imply that **objective sensory data**, *per se*, **do not possess any semantic<sup>2</sup> information, but physical information exclusively.** Rather, **semantic<sup>2</sup> information incorporates objective data as one-of-three components**. This also means that nobody should disagree with Diamant when he repeats over and over that sensory data do not possess semantic information, therefore semantic information cannot be *extracted* from sensory data (Diamant, 2010a). However, Diamant's statement should not be considered original at all because it has been well acknowledged in philosophy for hundreds of years, as well as in psychophysical studies of perception (Matsuyama & Shang-Shouq Hwang, 1990) and MAL in the last 50 years (Cherkassky & Mulier, 2006). This concept is summarized below.

	- Philosophy and psychophysical studies of perception. The statement that sensory data do not possess semantic information is tantamount to saying there is an information gap between physical information and semantic information, which is the well-known information gap between (sensory and varying) sensations and (vague, but stable) perceptions. In practice, "we are always seeing objects we have never seen before at the sensation level, while we perceive familiar objects everywhere at the perception level" (Matsuyama & Shang-Shouq Hwang, 1990) (see Part I Section 1 and Part I Section 2.2.2).
	- MAL. In unlabeled data learning algorithms (e.g., unlabeled data clustering), no semantics is detected as output (e.g., unlabeled data cluster 1, unlabeled data cluster 2), see Fig. 1. In labeled data learning algorithms for classification applications (see Part I, Fig. 1), no semantic information is extracted from a finite set of training data pairs consisting of an (objective data vector, subjective discrete label), but a correlation function can be estimated between continuous sensory data and a discrete and finite set of subjective labels (refer to Part I Section 2.1 and Part I Section 2.4.2).

The foregoing comments also mean that Diamant is right, although vague, when he states that "semantics is a property of a human observer" (Diamant, 2010a). **To state this more precisely, since semantic<sup>2</sup> information naturally (automatically, instantaneously) stems from the interaction of three necessary and sufficient components i.-iii. (see above in this text), then semantic<sup>2</sup> information cannot be separated from any of its three components.** For example, let us think of a piano (symbolic data structure) whose objective presence (fact) requires the simultaneous presence of a subjective human actor (or equivalent machine) to generate whatever sound (semantic information). The sound (generated semantic information) is neither in the piano, nor in the piano player, nor in his/her prior knowledge of what a piano is all about, but in the instantaneous combination of these three factors. This also means that **semantic<sup>2</sup> information** quite obviously **changes with the objective data set, the subjective human supervisor and his/her own subjective ontology.** In particular (refer to this text above), **semantic<sup>2</sup> information means there are two subjective actors in the semantic labeling of objective sensory data, namely, the subjective external observer and scrutinizer (or equivalent machine) and his/her own ontology or semantic (abstract) model of the world**. In fact, it is well known that not all humans adopt the same ontology, and two humans who adopt the same ontology do not apply this ontology the same way through time in interpreting a given observation. For example, two players will never generate the same music when playing the same musical score on the same piano. Not even the same player will ever generate the same music when playing the same musical score twice on the same piano. To summarize these concepts we can write the following definition.

Objective sensory data set + subjective supervisor provided, as such, with a subjective prior hierarchical knowledge base (ontology) = hierarchical semantic<sup>2</sup> (subjective<sup>2</sup>) information, which includes physical information at the bottom level 0 of the inverted tree which deals with the semantic granularity of semi-concepts assigned to semi-symbolic secondary data structures.

#### **4.4 Subjective hierarchical (multi-scale) prior knowledge base**

**Subjective hierarchical (multi-scale) prior knowledge base (ontology, model of the (3-D) world) equivalent to a semantic net or inverted tree with leaves at the bottom level 0** where physical information is incorporated. Refer to this text above.

#### **4.5 Intelligence**

**Intelligence** (cognition) is the system's ability to aggregate bottom-up (from-data-to-concepts) and disassemble top-down (from-concepts-to-data) semantic information (which incorporates physical information) across the hierarchical levels of a subjective prior knowledge base.
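The bottom-up aggregation and top-down disassembly named in this definition can be illustrated with a toy inverted tree. The following is only a minimal sketch under stated assumptions: the `Concept` class, the helper functions and the land cover names are hypothetical and are not taken from the chapter or from any existing system.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Concept:
    """One node of a hypothetical prior ontology drawn as an inverted tree."""
    name: str
    children: List["Concept"] = field(default_factory=list)
    parent: Optional["Concept"] = field(default=None, repr=False)

    def add(self, child: "Concept") -> "Concept":
        """Attach a child concept and return it for chaining."""
        child.parent = self
        self.children.append(child)
        return child

    def level(self) -> int:
        """Leaves sit at level 0; a parent is one level above its highest child."""
        if not self.children:
            return 0
        return 1 + max(c.level() for c in self.children)


def aggregate_bottom_up(leaf: Concept) -> List[str]:
    """From data to concepts: walk from a level-0 leaf up to the root."""
    path = []
    node: Optional[Concept] = leaf
    while node is not None:
        path.append(node.name)
        node = node.parent
    return path


def disassemble_top_down(concept: Concept) -> List[str]:
    """From concepts to data: list every level-0 leaf below a concept."""
    if not concept.children:
        return [concept.name]
    leaves: List[str] = []
    for child in concept.children:
        leaves.extend(disassemble_top_down(child))
    return leaves


# Illustrative ontology; the class names are assumptions made for this sketch.
land = Concept("land cover")
veg = land.add(Concept("vegetation"))
forest = veg.add(Concept("forest"))
grass = veg.add(Concept("grassland"))
water = land.add(Concept("water"))

up = aggregate_bottom_up(forest)
down = disassemble_top_down(land)
```

In this sketch the tree exists before any data are seen, which mirrors the requirement that the ontology is retained in memory as prior knowledge; the data only select which leaves are instantiated.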

#### **4.6 Information processing system**

**An information processing system, cognitive system or intelligent system** transforms an input sensory data set into an output instantiation of a story in natural language whose hierarchical structure is provided by an ontology or inverted tree retained in the system's memory before looking at the sensory data.

To summarize, the aforementioned novel definitions sketch a RS-IUS where information goes symbolic during the pre-attentive vision phase to generate a semi-symbolic primal sketch (preliminary classification map). This is in line with the CV system proposed by Marr at the level of computational theory (see Part I Section 2.6) when he states: "vision goes symbolic almost immediately, right at the level of zero-crossings (primal sketch)" (Marr, 1982), p. 343 (see Part I Section 2.3). However, it differs from the CV system proposed by Marr at the level of primal sketch implementation (see Part I Section 2.6) consisting of a subsymbolic zero-crossing algorithm (Marr, 1982). In addition, the novel RS-IUS sketched above differs at the level of both computational theory and algorithm design and implementation from existing CV systems such as GEOBIA systems (Definiens Imaging GmbH, 2004; Esch et al., 2008), including Diamant's (Diamant, 2005; Diamant, 2008; Diamant, 2010a; Diamant, 2010b), where an unlabeled data learning (driven-without-knowledge) algorithm is adopted at the first stage.

#### **5. Practical consequences of the proposed definitions on CV, AI and MAL system design and implementation strategies**

Practical consequences of the definitions proposed in Part II Section 4 on CV, AI and MAL system design and implementation strategies are several, more detailed, better posed and, therefore, far more relevant than Diamant's (Diamant, 2010a). Thus, they should benefit from more favorable consideration by the scientific community.

1. Definitions provided in Part II Section 4 are consistent with the Marr statement: "vision goes symbolic almost immediately, right at the level of zero-crossings (primal sketch)… without loss of information" (Marr, 1982) (p. 343) (refer to Part I Section 2.3). This is tantamount to saying that exploitation of the deductive subjective prior knowledge-based inference paradigm must regard the preattentive visual phase whose output, the so-called *primal sketch* (Marr, 1982), must be as follows:

	- semantic in nature (see Part I Section 2.3), therefore it is called *preliminary classification map*;
	- capable of preserving small, but genuine image details (high spatial frequency image components). This requirement is inconsistent with existing image segmentation algorithms, which are inherently affected by the *uncertainty principle* according to which, for any contextual (neighborhood) property, we cannot simultaneously measure that property while obtaining accurate localization (Corcoran & Winstanley, 2007; Petrou & Sevilla, 2006) (see Part I Section 2.4.1.2).

Although he stated that vision goes symbolic right at the output of the preattentive vision phase, which has to affect the architectural level of understanding of a CV system (see Part I Section 2.6), Marr selected a sub-symbolic edge detection (zero-crossing) algorithm for primal sketch generation (Marr, 1982). By embracing the Marr computational theory rather than his algorithmic solutions, the present author concludes that, as output, **the preattentive visual phase no longer generates sub-symbolic image primitives, namely, non-semantic points and edges or, vice versa, image regions (which is what was implemented by Marr (Marr, 1982)), but semi-symbolic secondary data structures, namely, semi-symbolic pixels in semi-symbolic segments in semi-symbolic strata** (see Part II Section 4) (Baraldi et al., 2006; Baraldi et al., 2010a; Baraldi et al., 2010b; Baraldi et al., 2010c; Baraldi, 2011a; Baraldi, 2011b).

2. **It is impossible to** *extract* **semantic2 information from objective continuous sensory data because the latter,** *per se***, are provided with no semantics at all.** This is the well-known information gap between semantic2 information and physical information (refer to Part I Section 2.2.2 and Part I Section 2.3).

3. **Although it is impossible to** *extract* **semantic2 information from objective continuous sensory data, it is possible to** *correlate* **discrete semantic2 information to objective continuous sensory data.** This conclusion is by no means novel as it is well known in literature. For example, Shunlin Liang summarizes this concept in a few words: statistical pattern recognition systems are based on correlation relationships between objective sensory (e.g., RS) data and either continuous (e.g., LAI) or categorical (e.g., land surface) variables (see Part I Section 2.1) (Shunlin Liang, 2004). **Unfortunately, low or no correlation can be found between continuous sensory data and a finite and discrete set of categorical variables, corresponding to independent random variables generating "distinguishable" data structures (data aggregations, data clusters), in real-world data mapping problems at large data scale or fine semantic granularity, other than toy problems at small data scale and coarse semantic granularity.** This low correlation effect is due to the combination of two factors.


Does this mean the relevant effort spent by the MAL community to develop driven-without-knowledge image segmentation algorithms (Castilla et al., 2008) or, say, self-organizing topology-preserving unlabeled data clustering algorithms (Fritzke, 1997; Martinetz & Schulten, 1994), has been worthless? Fortunately, no. It rather means the following.

	- i. The main application domain of, say, self-organizing topology-preserving unlabeled data clustering algorithms should remain the modeling of stationary and non-stationary distributions, see Part I Fig. 1.
	- ii. When an unlabeled (unsupervised) data learning algorithm, either a driven-without-knowledge image segmentation algorithm or an unlabeled data clustering algorithm (see Part I Section 2.4.1), is adopted as the first stage of a two-stage hybrid cognitive system, CV system or RS-IUS, it should be considered highly inappropriate. In particular:
		- I. It should be replaced by a deductive MAT-by-rules approach where community-agreed prior knowledge is conveyed to generate as output a lossless semi-symbolic product (consisting of semi-concepts). For example, in a RS-IUS, the MAT-by-rules first stage should generate a preliminary classification map (see Part II Section 4) where small, but genuine image details are well preserved (refer to this text above).
		- II. If useful, it should be:
			- a. adapted to work on a driven-by-knowledge stratified (semantic masked) basis and
			- c. next, moved to the second stage of a two-stage stratified hierarchical hybrid cognitive system. For example, a two-stage stratified hierarchical hybrid RS-IUS architecture has been proposed in recent literature, see Fig. 3 (Baraldi et al., 2006; Baraldi et al., 2010a; Baraldi et al., 2010b; Baraldi et al., 2010c; Baraldi, 2011a; Baraldi, 2011b).

5. As an extension of points 2. and 3. above, **labeled (supervised) data learning classifiers (**see Part I Section 2.4.2**) should be considered highly inappropriate** (like using a fork for cutting food; unless the food is particularly soft, it will never work) **in real-world data mapping problems at large data scale or fine semantic granularity** (where within-class variability is monotonically non-decreasing, i.e., it increases or remains equal, with the cardinality of the objective sensory data set), other than toy problems at small data scale and coarse semantic granularity. This conclusion is by no means novel. Rather, it is well known in literature. For example, Shunlin Liang summarizes this concept in a few words: statistical models are usually site-specific (see Part I Section 2.1) (Shunlin Liang, 2004). Does this mean the relevant effort spent by the MAL community to develop supervised data learning classifiers has been worthless? Fortunately, no. It rather means the following.

	- i. The main application domain of supervised data learning algorithms should be considered function regression where input and output variables are continuous non-semantic, see Fig. 1.
	- ii. When a supervised data learning classifier (see Part I Section 2.4.2) is adopted as the first stage of a two-stage hybrid cognitive system, CV system or RS-IUS, it should be considered highly inappropriate. An experimental proof of this concept is that supervised MAL algorithms (say, SVMs), either context-insensitive (e.g., pixel-based) or context-sensitive (Bruzzone & Carlin, 2006; Bruzzone & Persello, 2009; Persello & Bruzzone, 2010), considered successful in terms of operational QIs (refer to Part I Section 2.7.2) at local/regional scale, become impracticable in mapping RS image mosaics consisting of hundreds of images at national/continental/global scale (Chengquan Huang et al., 2008). In these real-world problems the cost, timeliness, quality and availability of adequate reference (training) data sets derived from field sites, existing maps and tabular data are currently considered the most limiting factors on RS data product generation and validation (Gutman et al., 2004). In particular, the first-stage supervised data learning classifier of a two-stage hybrid RS-IUS should be:


#### **6. SIAM™ as a proof of the efficacy of the required shift of learning paradigm from MAL-from-examples to MAT-by-rules at the first stage of two-stage hybrid RS-IUSs**

To the best of this author's knowledge SIAM™ provides the first experimental proof of the efficacy of the required switch of learning paradigm from MAL-from-examples to MAT-by-rules at the first stage of a two-stage hybrid RS-IUS architecture (refer to Part II Section 2.3), see Table 4. SIAM™ is an operational (good-to-go, press-and-go, turnkey) software button (executable). In particular, SIAM™ is automatic, efficient, scalable, accurate and robust to changes in the input data acquired across time, space and sensors. For example, the automatic SIAM™ is consistent and accurate across sensors at the national/continental/global scale (refer to Part II Section 2.3) (Baraldi et al., 2006; Baraldi et al., 2010a; Baraldi et al., 2010b; Baraldi et al., 2010c; Baraldi, 2011a; Baraldi, 2011b), whereas semi-automatic inductive data learning approaches, such as SVMs, must be re-trained (supervised) for every image (Chengquan Huang et al., 2008).

SIAM™ belongs to the family of physical models that follow the physical laws of the real (3-D) world to represent an abstraction of reality (see Part I Section 2.1) (Shunlin Liang, 2004). In particular, SIAM™ follows the physical laws of spaceborne optical imaging devices to provide a two-stage hybrid RS-IUS with a first-stage deductive prior knowledge-based inference mechanism. Unfortunately, it takes a long time for human experts to learn the physical laws of the real (3-D) world and to tune physical models based on human intuition, domain expertise and evidence from data observations (Mather, 1994; Shunlin Liang, 2004). For example, the development of SIAM™ dates back to the year 2002 (Baraldi, 2011a).
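To make the idea of a first-stage deductive, prior knowledge-based mapping more concrete, the sketch below labels calibrated pixels with fixed spectral rules and no training data. It is not the SIAM™ rule set: the band names, the NDVI-based rules, the thresholds and the label names are all assumptions chosen for illustration only.

```python
from typing import Dict, List

Pixel = Dict[str, float]  # hypothetical calibrated reflectances in [0, 1]


def ndvi(red: float, nir: float) -> float:
    """Normalized difference vegetation index; 0.0 when both bands are zero."""
    denom = nir + red
    return 0.0 if denom == 0 else (nir - red) / denom


def preliminary_label(pixel: Pixel) -> str:
    """Map one pixel to a spectral semi-concept using fixed, hand-written rules.

    All thresholds below are illustrative assumptions, not published values.
    """
    red, nir = pixel["red"], pixel["nir"]
    index = ndvi(red, nir)
    if index > 0.4:
        return "vegetation"
    if nir < 0.05 and red < 0.1:
        return "water_or_shadow"
    if index < 0.15 and red > 0.2:
        return "bare_soil_or_built_up"
    return "unknown"


def preliminary_map(image: List[List[Pixel]]) -> List[List[str]]:
    """Label every pixel independently, so no spatial detail is smoothed away."""
    return [[preliminary_label(p) for p in row] for row in image]


# A 2 x 2 toy image with made-up reflectance values.
toy_image = [
    [{"red": 0.05, "nir": 0.45}, {"red": 0.03, "nir": 0.02}],
    [{"red": 0.30, "nir": 0.35}, {"red": 0.15, "nir": 0.25}],
]
toy_map = preliminary_map(toy_image)
```

Because the rules are written in terms of physical quantities, a sketch of this kind only makes sense on radiometrically calibrated multi-spectral input, which is also why a single panchromatic band would not be sufficient for it.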


| Quality Indicators (QIs) | State-of-the-art RS-IUSs | SIAM™ |
|---|---|---|
| **Degree of automation:** (a) number, physical meaning and range of variation of user-defined parameters, (b) collection of the required training data set, if any. | VL, L | VH (fully automatic, it cannot be surpassed) |
| **Effectiveness:** (a) semantic accuracy and (b) spatial accuracy. | M, H, VH | VH |
| **Semantic information level** | Land cover class (e.g., *deciduous forest*) | Spectral semi-concept (e.g., *vegetation*) |
| **Efficiency:** (a) computation time and (b) memory occupation. | VL, L in training (hours per image) | VH (5 min to 30 s per Landsat image on a laptop) |
| **Robustness to changes in input image** | VL (specific training per image) | VH |
| **Robustness to changes in input parameters** | VL | VH (it cannot be surpassed) |
| **Scalability to changes in the sensor's specifications or user's needs.** | VL | VH (it works with any existing spaceborne sensor) |
| **Timeliness** (from data acquisition to high-level product generation, increases with manpower and computing power). | VH (e.g., the collection of reference samples is a difficult and expensive task) | VL, i.e., timeliness is reduced to almost zero |
| **Economy** (inverse of costs increasing with manpower and computing power). | VL, L, high costs in manpower and also computing power | VH, i.e., costs in manpower and computing power are reduced to almost zero |

Table 4. QIs of SIAM™ versus state-of-the-art RS-IUSs (refer to Part I Section 2.8). Legend of fuzzy sets: Very low (VL), Low (L), Medium (M), High (H), Very High (VH). Legend of colors: Red-Bad, Blue-Average, Green-Good

Part I Section 2.2.2 reported the question: is human biology as irrelevant to AI research as bird biology is to aeronautical engineering? Actually, biological vision has always represented a fundamental source of inspiration for the CV community. While SIAM™ considers its degree of biological plausibility as a value added, straightforward imitation of biological vision solutions is not always possible. This is the reason why SIAM™ cannot be considered highly plausible in biological terms although it is very useful in practice. For example, SIAM™ cannot work with panchromatic imagery whereas the human visual system is perfectly able to interpret gray-tone images.

#### **7. Conclusions**

It is well known that semantic information is not in objective sensory data, which is tantamount to saying there is a well-known information gap between semantic2 information and physical information. This conceptual work observes that semantic2 information is naturally (automatically, instantaneously) generated by the simultaneous interaction of a subjective external supervisor who observes and scrutinizes an objective sensory data set based on his/her own subjective prior knowledge base (ontology, model of the 3-D world). Semantic2 information resulting from this interaction takes the intermediate form of semi-symbolic secondary data structures that incorporate physical information at the bottom level (layer 0) of an ontology represented as an inverted tree.

A shift of learning paradigm from MAL-from-examples to MAT-by-rules in the first stage of two-stage hybrid RS-IUSs is recommended. Experimental proof of this concept is provided by the operational automatic SIAM™ recently proposed in RS literature.

The practical conclusion of this conceptual work is twofold.

	- a. replaced by a deductive MAT-by-rules approach where community-agreed prior knowledge is conveyed and,
	- b. if useful, adapted to work on a driven-by-knowledge stratified (semantic masked) basis and moved to the second stage of a two-stage stratified hierarchical hybrid cognitive system. For example, a two-stage stratified hierarchical hybrid RS-IUS architecture has been proposed in recent literature, see Fig. 3 (Baraldi et al., 2006; Baraldi et al., 2010a; Baraldi et al., 2010b; Baraldi et al., 2010c; Baraldi, 2011a; Baraldi, 2011b).
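The two recommendations above (a first stage driven by prior knowledge, followed by inductive learning on a stratified, semantic-masked basis) can be sketched as follows. This is a hypothetical illustration, not the architecture of Fig. 3: the function names, the 1-D two-means clustering and the sample values are assumptions, and the first-stage labels are simply taken as given.

```python
from typing import Dict, List, Sequence


def stratify(labels: Sequence[str]) -> Dict[str, List[int]]:
    """Group pixel indices by their first-stage (preliminary) label."""
    strata: Dict[str, List[int]] = {}
    for i, lab in enumerate(labels):
        strata.setdefault(lab, []).append(i)
    return strata


def two_means_1d(samples: Sequence[float], iterations: int = 20) -> List[int]:
    """Tiny unlabeled clustering (k-means with k = 2) on scalar samples."""
    centers = [min(samples), max(samples)]
    assign = [0] * len(samples)
    for _ in range(iterations):
        assign = [
            0 if abs(s - centers[0]) <= abs(s - centers[1]) else 1
            for s in samples
        ]
        for k in (0, 1):
            members = [s for s, a in zip(samples, assign) if a == k]
            if members:
                centers[k] = sum(members) / len(members)
    return assign


def second_stage(values: Sequence[float],
                 first_stage_labels: Sequence[str],
                 target: str) -> List[str]:
    """Run the clustering only inside one stratum; other pixels are untouched."""
    out = list(first_stage_labels)
    idx = stratify(first_stage_labels).get(target, [])
    if len(idx) >= 2:
        clusters = two_means_1d([values[i] for i in idx])
        for i, c in zip(idx, clusters):
            out[i] = f"{target}/cluster_{c}"
    return out


# Made-up scalar feature per pixel and made-up first-stage labels.
values = [0.30, 0.32, 0.55, 0.57, 0.05, 0.06]
labels = ["vegetation"] * 4 + ["water"] * 2
refined = second_stage(values, labels, "vegetation")
```

The design point of the sketch is that the clustering never sees pixels outside the selected stratum, so its clusters refine a semi-concept that was already assigned by prior knowledge instead of having to discover it from unlabeled data.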

This required shift of the learning paradigm from MAL-from-examples to MAT-by-rules adopted in the first stage of a two-stage hybrid RS-IUS is similar in nature to previous conceptual shifts occurring between deductive coarse-to-fine (from symbolic concepts to sub-symbolic data) AI/MAI and inductive fine-to-coarse (from sub-symbolic data to symbolic concepts) Cybernetics/MAL, see Part I Section 2.2. What is novel about the proposed shift of the learning paradigm from MAL-from-examples to MAT-by-rules at the first stage of a two-stage hybrid RS-IUS is the following.


subjective prior knowledge base (ontology or model of the 3-D world (Matsuyama & Shang-Shouq Hwang, 1990)) provided by an external subjective supervisor (human, God or equivalent machine), refer to Part II Section 4.

- It affects exclusively the inductive learning-from-data first stage of traditional two-stage hybrid CV systems (e.g., Marr's (Marr, 1982), Diamant's (Diamant, 2005; Diamant, 2008; Diamant, 2010a; Diamant, 2010b)) or RS-IUSs, whether this first stage is implemented as an inductive algorithm capable of learning from either unlabeled (unsupervised) or labeled (supervised) data, and whether it is context-insensitive (e.g., pixel-based) or context-sensitive (e.g., (2-D) object-based). If useful, these inductive data learning algorithms may be adapted to run on a driven-by-knowledge stratified (semantic masked, layered) basis and moved to the second stage of a novel two-stage stratified hierarchical hybrid RS-IUS architecture proposed in recent literature, see Fig. 3 (Baraldi et al., 2006; Baraldi et al., 2010a; Baraldi et al., 2010b; Baraldi et al., 2010c; Baraldi, 2011a; Baraldi, 2011b).

- It comes together with a novel two-stage stratified hierarchical hybrid RS-IUS architecture employing a first-stage spectral rule-based preliminary classification algorithm based on prior spectral knowledge, see Fig. 3 (Baraldi et al., 2006; Baraldi et al., 2010a; Baraldi et al., 2010b; Baraldi et al., 2010c; Baraldi, 2011a; Baraldi, 2011b).

- It comes together with an operational (namely, automatic, efficient, accurate, robust, scalable, see Part I Section 2.8) Satellite Image Automatic Mapper™ (SIAM™) implementation (software executable), equivalent to an automatic (good-to-go, press-and-go, turnkey) software button, provided as an experimental proof of the efficacy of the required shift in learning paradigm from MAL-from-examples to MAT-by-rules at the first stage of a two-stage hybrid RS-IUS architecture, see Fig. 3 (Baraldi et al., 2006; Baraldi et al., 2010a; Baraldi et al., 2010b; Baraldi et al., 2010c; Baraldi, 2011a; Baraldi, 2011b).

To summarize, to the best of this author's knowledge this is the first time a novel computational theory (RS-IUS architecture) is supported by operational (good-to-go, press-and-go, turnkey) algorithmic and implementation solutions as proofs of concept. For example, this was not the case of the Marr (Marr, 1982) or the Diamant CV systems (Diamant, 2005; Diamant, 2008; Diamant, 2010a; Diamant, 2010b), whose computational theories (see Part I Section 2.6) are both inconsistent with the algorithmic solutions adopted by their authors. As a consequence, these two CV systems become two more instances of the well-known class of two-stage segment-based hybrid CV systems, also termed GEOBIA systems, traditionally affected by a lack of general consensus and research (Hay & Castilla, 2006; Matsuyama & Shang-Shouq Hwang, 1990).

The proposed conclusions of potential interest to the RS, CV, AI and MAL communities are supported by unquestionable independent sources of evidence listed below.

- Since the late 1950s, the original ambitious goals of AI/MAI and Cybernetics/MAL have been fragmented into "practical" and "manageable" problems equivalent to "a family of relatively disconnected efforts" (Diamant, 2005; Diamant, 2008; Diamant, 2010a; Diamant, 2010b).
- Unlabeled (unsupervised) data learning algorithms, namely, unlabeled data clustering (Backer & Jain, 1981; Baraldi & Alpaydin, 2002a; Baraldi & Alpaydin, 2002b; Cherkassky & Mulier, 2006; Fritzke, 1997) and unlabeled (2-D) image segmentation algorithms (Burr & Morrone, 1992; Corcoran et al., 2010; Corcoran & Winstanley, 2007; Delves et al., 1992; Hay & Castilla, 2006; Matsuyama & Shang-Shouq Hwang, 1990; Petrou & Sevilla, 2006; Vecera & Farah, 1997), are recognized as inherently ill-posed problems, subjective in nature, by a relevant portion of existing literature.
- Labeled (supervised) data learning classifiers are unable to establish correlation relationships between objective sensory (e.g., RS) data and categorical variables (e.g., land cover classes) at large data scale or fine semantic granularity. For example, in (Chengquan Huang et al., 2008) a forest/non-forest one-class SVM battery of classifiers must be re-trained and re-selected for every image in an image mosaic at global scale. Vice versa, labeled data learning classifiers are exclusively suitable for finding correlation relationships between objective sensory data and categorical variables at small data scale and coarse semantic granularity (e.g., in RS data mapping problems at coarse spatial resolution and local/regional scale). In fact, in practical RS data applications where supervised data learning algorithms are employed at large spatial scale, fine spatial resolution or fine semantic granularity (Chengquan Huang et al., 2008), the cost, timeliness, quality and availability of adequate reference (training/testing) datasets derived from field sites, existing maps and tabular data have turned out to be the most limiting factors on RS data product generation and validation (Gutman et al., 2004).

To the best of this author's knowledge, while the proposed practical conclusions of potential interest to the RS, CV, AI and MAL communities are supported by the aforementioned independent sources of evidence, these conclusions are not contradicted by any practical achievement gained by the RS, CV, AI and MAL communities in recent years. Thus, rather than being agreed or disagreed upon, these conclusions ought to be accepted by the scientific community unless proved otherwise, that is, until the increasing rate of collection of RS data of enhanced spatial, spectral and temporal quality no longer outpaces our capability of generating (rather than extracting) semantic2 information from RS data provided, *per se*, with no semantics at all.

#### **8. Acknowledgments**

This material is partly based upon work supported by the National Aeronautics and Space Administration under Grant/Contract/Agreement No. NNX07AV19G issued through the Earth Science Division of the Science Mission Directorate. The research leading to these results has also received funding from the European Union Seventh Framework Programme FP7/2007-2013 under grant agreement n° 263435. This author wishes to thank the Editorial Board of InTech for its competence and willingness to help.

#### **9. References**


FP7/2007-2013 under grant agreement n° 263435. This author wishes to thank the Editorial

Baatz, M.; Hoffmann, C.; Willhauck, G. Progressing from object-based to object-oriented image analysis. In Object-Based Image Analysis–Spatial Concepts for Knowledge-driven Remote Sensing Applications; Blaschke, T., Lang, S., Hay, G.J., Eds.; Springer-Verlag: New York, NY, 2008, Chapter 1.4, pp. 29-42.

Backer, E. & Jain, A. K. (1981). A clustering performance measure based on fuzzy set decomposition. *IEEE Trans. Pattern Anal. Mach. Intell.*, Vol. PAMI-3, No. 1, pp. 66–75.

Baraldi, A. & Parmiggiani, F. (1996). Combined detection of intensity and chromatic contours in color images. *Optical Engineering*, Vol. 35, No. 5, pp. 1413-1439.

Baraldi, A. & Alpaydin, E. (2002a). Constructive feedforward ART clustering networks – Part I. *IEEE Trans. Neural Netw.*, Vol. 13, No. 3, pp. 645–661.

Baraldi, A. & Alpaydin, E. (2002b). Constructive feedforward ART clustering networks – Part II. *IEEE Trans. Neural Netw.*, Vol. 13, No. 3, pp. 662–677.

Baraldi, A.; Bruzzone, L. & Blonda, P. (2005). Quality assessment of classification and cluster maps without ground truth knowledge. *IEEE Trans. Geosci. Remote Sensing*, Vol. 43, No. 4, pp. 857-873.

Baraldi, A.; Puzzolo, V.; Blonda, P.; Bruzzone, L. & Tarantino, C. (2006). Automatic spectral rule-based preliminary mapping of calibrated Landsat TM and ETM+ images, *IEEE Trans. Geosci. Remote Sensing*, Vol. 44, No. 9, pp. 2563-2586.

Baraldi, A.; Durieux, L.; Simonetti, D.; Conchedda, G.; Holecz, F. & Blonda, P. (2010a). Automatic spectral rule-based preliminary classification of radiometrically calibrated SPOT-4/-5/IRS, AVHRR/MSG, AATSR, IKONOS/QuickBird/OrbView/GeoEye and DMC/SPOT-1/-2 imagery – Part I: System design and implementation. *IEEE Trans. Geosci. Remote Sensing*, Vol. 48, No. 3, pp. 1299-1325.

Baraldi, A.; Durieux, L.; Simonetti, D.; Conchedda, G.; Holecz, F. & Blonda, P. (2010b). Automatic spectral rule-based preliminary classification of radiometrically calibrated SPOT-4/-5/IRS, AVHRR/MSG, AATSR, IKONOS/QuickBird/OrbView/GeoEye and DMC/SPOT-1/-2 imagery – Part II: Classification accuracy assessment. *IEEE Trans. Geosci. Remote Sensing*, Vol. 48, No. 3, pp. 1326-1354.

Baraldi, A.; Wassenaar, T. & Kay, S. (2010c). Operational performance of an automatic preliminary spectral rule-based decision-tree classifier of spaceborne very high resolution optical images. *IEEE Trans. Geosci. Remote Sensing*, Vol. 48, No. 9, pp. 3482-3502.

Baraldi, A. (2011a). Levels of understanding and degrees of novelty of a two-stage remote sensing image understanding system employing the Satellite Image Automatic Mapper™ (SIAM™) as its preliminary classification first stage. *IEEE Trans. Geosci. Remote Sensing*, submitted for consideration for publication, TGRS-2011-00241.

Baraldi, A. (2011b). Fuzzification of a crisp near real-time operational automatic spectral rule-based decision-tree preliminary classifier of multi-source multi-spectral remotely-sensed images, *IEEE Trans. Geosci. Remote Sensing*, accepted for publication, July 2011.

http://calvalportal.ceos.org/CalValPortal/showQA4EO.do?section=qa4eoIntro

Diamant, E. (2008). I'm Sorry to Say, But Your Understanding of Image Processing Fundamentals Is Absolutely Wrong, In: *Brain, Vision and AI*, C. Rossi, (Ed.), Chapter 5, pp. 95-110, In-Tech Publishing.

Diamant, E. (2010a). Not only a lack of a right definition: Arguments for a shift in information-processing paradigm. 17.04.2011, Available from: http://arxiv.org/ftp/arxiv/papers/1009/1009.0077.pdf

Diamant, E. (2010b). Machine Learning: When and Where the Horses Went Astray?, In: *Machine Learning*, Yagang Zhang, (Ed.), pp. 1-18, In-Tech Publishing. 17.04.2011, Available from: http://arxiv.org/abs/0911.1386

Di Gregorio, A. & Jansen, L. (2000). *Land Cover Classification System (LCCS): Classification concepts and user manual*, FAO Corporate Document Repository, 17.04.2011, Available from: http://www.fao.org/DOCREP/003/X0596E/X0596e00.htm

Esch, T.; Thiel, M.; Bock, M.; Roth, A. & Dech, S. (2008). Improvement of image segmentation accuracy based on multiscale optimization procedure. *IEEE Geosci. Remote Sensing Letters*, Vol. 5, No. 3 (Jul. 2008), pp. 463-467.

European Space Agency (ESA). (2008). GMES Observing the Earth, 17.04.2011, Available from: http://www.esa.int/esaLP/SEMOBSO4KKF\_LPgmes\_0.html

Forestry Department, Food and Agriculture Organization (FAO) of the United Nations (2000). *FRA 2000 - Forest Cover Mapping & Monitoring with NOAA-AVHRR & Other Coarse Spatial Resolution Sensors*, Rome, 2000.

Fritzke, B. (1997). *Some competitive learning methods* (Draft document), 17.04.2011, Available from: http://www.neuroinformatik.ruhr-unibochum.de/ini/VDM/research/gsn/DemoGNG.

GEO/CEOSS. (2008). A Quality Assurance Framework for Earth Observation, Version 2.0, 17.04.2011, Available from: http://calvalportal.ceos.org/CalValPortal/showQA4EO.do?section=qa4eoIntro

Global Monitoring for Environment and Security (GMES) (2011). 17.04.2011, Available from: http://www.gmes.info

Gouras, P. (1991). Color vision, In: *Principles of Neural Science*, E. Kandel & J. Schwartz, (Eds.), pp. 467-479, Appleton and Lange, Norwalk, Connecticut.

Group on Earth Observations (GEO). (2005). The Global Earth Observation System of Systems (GEOSS) 10-Year Implementation Plan, 17.04.2011, Available from: http://www.earthobservations.org/docs/10-Year%20Implementation%20Plan.pdf

Group on Earth Observations (GEO). (2008). GEO 2007-2009 Work Plan: Toward Convergence, 17.04.2011, Available from: http://earthobservations.org

Gutman, G. *et al*., (Eds.). (2004). *Land Change Science*, Kluwer Academic Publishers, Dordrecht, The Netherlands.

Hay, G. J. & Castilla, G. (2006). Object-based image analysis: Strengths, weaknesses, opportunities and threats (SWOT), *Proc. 1st Int. Conf. Object-based Image Analysis* (OBIA), 2006. 17.04.2011, Available from: www.commission4.isprs.org/obia06/Papers/01\_Opening%20Session/OBIA2006\_Hay\_Castilla.pdf

Herold, M.; Woodcock, C.; Di Gregorio, A.; Mayaux, P.; Belward, A. S.; Latham, J. & Schmullius, C. (2006). A joint initiative for harmonization and validation of land cover datasets. *IEEE Trans. Geosci. Remote Sensing*, Vol. 44, No. 7, pp. 1719–1727.


http://www.geog.umontreal.ca/donnees/geo6333/atcor23\_manual.pdf


USGS & NASA. (2011). Web-enabled Landsat data (WELD) Project. 17.04.2011, Available from: http://landsat.usgs.gov/WELD.php

Vecera, S. P. & Farah, M. J. (1997). Is visual image segmentation a bottom-up or an interactive process?. *Perception & Psychophysics*, Vol. 59, pp. 1280–1296.

Wang, Z. & Bovik, A. C. (2002). A universal image quality index. *IEEE Signal Proc. Letters*, Vol. 9, No. 3, pp. 81-84.

Wilson, H. R. & Bergen, J. R. (1979). A four mechanism model for threshold spatial vision. *Vision Res*., Vol. 19, pp. 19-32.

Yang, J. & Wang, R. S. (2007). Classified road detection from satellite images based on perceptual organization. *Int. J. Remote Sensing*, Vol. 28, No. 20, pp. 4653-4669.

Zamperoni, P. (1996). Plus ça va, moins ça va. *Pattern Recognition Letters*, Vol. 17, No. 7, (1996), pp. 671-677.

### **8-Band Image Data Processing of the Worldview-2 Satellite in a Wide Area of Applications**

Cristina Tarantino, Maria Adamo, Guido Pasquariello, Francesco Lovergine, Palma Blonda and Valeria Tomaselli *National Council of Researches (CNR), Italy* 

#### **1. Introduction**

Recent years have seen advances in remote sensing in many fields, with applications at spatial scales ranging from global to local. As a consequence, the need to observe the Earth with more specialized and sophisticated sensors and data analysis techniques, in order to obtain more accurate information, has increased. On 8 October 2009 a new, second next-generation Worldview-2 satellite was launched by DigitalGlobe: it represents the latest innovation among sensors for the acquisition of remotely sensed imagery. It has advanced agility thanks to control moment gyros (like Worldview-1) and combines an average revisit time of 1.1 days around the globe with a large-scale collection capacity. Moreover, it is the first commercial satellite able to provide panchromatic imagery at 46 cm spatial resolution and 8-band multispectral imagery at 1.84 m spatial resolution. In addition to the standard panchromatic and multispectral BLUE, GREEN, RED and NEAR INFRARED (NIR1) bands, the Worldview-2 sensor has:


In the literature, many studies compare the add-on bands of the Worldview-2 sensor with the traditional bands of the most common commercial satellites, searching for new indexes in different application fields such as bathymetry [1] or vegetation and agriculture ([2], [3]). In [4] the authors analyze the high correlation among some Worldview-2 bands, such as the COASTAL and BLUE bands or the NIR1 and NIR2 bands, which could mean that some of the add-on bands carry redundant and useless information.

The aim of this work is to study how well the whole spectral information offered by the Worldview-2 sensor supports the characterization and classification of some selected land cover targets. Three main land cover targets were recognized: "Water", "Bare lands" and "Vegetated lands". The Worldview-2 image was first used for a finer discrimination of the sub-classes on the ground belonging to these targets, by applying an unsupervised approach with the help of a certified CORINE-like Land Use Map at 1:10,000 scale. A hyperspectral image acquired by the airborne MIVIS sensor was used to analyze the spectral profiles characterizing each distinct sub-class. Then a standard Maximum Likelihood classifier was applied to the Worldview-2 image with different input configurations, as below:

1. the 4 bands (R, G, B, NIR1) common to the standard commercial multispectral sensors at very high spatial resolution;
2. the 4 bands R, G, B, NIR1 adding on, one at a time, the new bands;
3. the new complete configuration with 8 spectral bands.

The accuracy of the classification map was estimated using a set of test fields randomly selected on the ground truth map.

ITT ENVI© and GRASS software were used to analyze and process data.

#### **2. The Worldview-2 data**

The data set analyzed was a Worldview-2 image granted by DigitalGlobe over an area of 100 km², chosen by the authors among the available archive acquisitions. The scene, acquired on 13 June 2010, includes the region known as the "Natural Oasis of Lago Salso", an essentially wet and marshy area located in the south-east of the Capitanata in the Apulia Region, Italy. The Natural Oasis of Lago Salso is characterized by a wetland of considerable importance (one of the most important in southern Italy) as a breeding and stopover site for birds. The area belongs to the Natura 2000 network and lies within the boundaries of the Site of Community Interest (SCI) IT9110005 "Zone umide della Capitanata" and of the Special Protection Area (SPA) IT9110038 "Paludi presso il Golfo di Manfredonia". The Natural Oasis of Lago Salso also falls within the Gargano National Park. The Natural Oasis has an extent of about 1040 ha, of which only 500 ha are wetland "sensu stricto"; the remaining part is covered by cultivated or partially abandoned areas. Agricultural areas cover a wide surface that was occupied by coastal lagoons until the 1950s and was subsequently buried and used for agricultural purposes. The SCI and the SPA have an extent of 14,109 ha and 14,437 ha, respectively. Water bodies are subject to fluctuations of water level over the year, creating ecological gradients due to the variation of salinity and moisture in the soil. Soil salinity gradually increases with soil elevation, reaching a maximum just above mean high sea level (MHSL). Above the MHSL, the salinity tends to decrease due to progressively less frequent flooding. The zonation of the vegetation of salt marshes is typically associated with the tolerance to these ecological gradients.

Figure 1 shows an RGB composition in the visible spectrum of the Worldview-2 image.

Fig. 1. RGB composition in the visible spectrum of Worldview–2 image.

#### **2.1 Preprocessing**

The image was calibrated in order to produce the reflectance image and to obtain the spectral profiles of some targets in the scene to compare with a previously acquired hyperspectral data set. The processing includes the following steps:


$$\rho = \frac{d_{ES}^2 \, \pi \, L\left(\theta, \phi, \lambda\right)}{E_S\left(\lambda\right)\cos\theta_S} \tag{1}$$

where $\theta_S$ is the solar zenith angle, $L$ is the spectral radiance for a given pixel and wavelength, and $E_S$ is the mean solar spectral irradiance. The term $d_{ES}$ is the Earth-Sun distance in astronomical units, which is a function of the acquisition day and time.

Reflectance values belong to the range [0, 1].
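As a minimal sketch of Equation (1), the conversion from radiance to reflectance for a single pixel and band could be written as below. The function name, argument names and numeric inputs are illustrative assumptions, not values taken from the image metadata; radiance and solar irradiance are assumed to be expressed in mutually consistent units.

```python
import math

def toa_reflectance(radiance, esun, sun_zenith_deg, d_es):
    """Equation (1): rho = d_ES^2 * pi * L / (E_S * cos(theta_S)).

    radiance       -- spectral radiance L of the pixel
    esun           -- mean solar spectral irradiance E_S (units consistent with L)
    sun_zenith_deg -- solar zenith angle theta_S, in degrees
    d_es           -- Earth-Sun distance d_ES, in astronomical units
    """
    cos_theta = math.cos(math.radians(sun_zenith_deg))
    if cos_theta <= 0:
        raise ValueError("the sun must be above the horizon")
    return (d_es ** 2) * math.pi * radiance / (esun * cos_theta)

# Hypothetical inputs, chosen only to exercise the formula
rho = toa_reflectance(radiance=80.0, esun=1500.0, sun_zenith_deg=30.0, d_es=1.015)
print(round(rho, 3))
```

In practice the same expression would be applied band by band to the whole image array, with the band-averaged solar irradiance, the solar zenith angle and the acquisition time read from the product metadata.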

#### **3. Selection and characterization of targets**

An existing certified CORINE-like land use map at 1:10,000 scale [5], produced in 2006, was taken as ground truth. It originally showed a set of 40 land use thematic classes; after a first screening, only land cover classes were retained. An unsupervised analysis was used to cluster the EO data into a number of spectrally different signatures. To accomplish this task the "K-Means" algorithm was applied, with the 8 bands of the Worldview-2 image as input. After a few attempts, 20 unlabelled classes were selected (with a maximum of 50 iterations until convergence and a change threshold of 1% to end the iterative process). Comparing the clusters with the ground truth information led to the splitting or merging of certain classes, for a total of 18 land cover classes. As shown in Figure 2, where a bathymetry map of the test site is also represented, the 8-band segmented map reports 3 distinct clusters in correspondence with the ground truth class labeled "Sea". These three signatures could be associated with different depths of the sea.

Fig. 2. Bathymetric map (a) and 8-band segmented map (b).
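The clustering settings quoted above (20 classes, at most 50 iterations, 1% change threshold) can be illustrated with a small pure-Python K-Means sketch. This is not the ENVI implementation used by the authors: the random initialization, the squared Euclidean distance and the reading of the change threshold as "fraction of pixels that changed class in the last iteration" are assumptions, and the input pixels are synthetic.

```python
import random

def kmeans(pixels, n_classes=20, max_iter=50, change_threshold=0.01, seed=0):
    """Cluster pixel spectra. Stops when fewer than `change_threshold`
    of the pixels change class, or after `max_iter` iterations."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(pixels, n_classes)]
    labels = [-1] * len(pixels)
    for _ in range(max_iter):
        changed = 0
        for i, p in enumerate(pixels):
            # Assign the pixel to the nearest center (squared Euclidean distance)
            best = min(range(n_classes),
                       key=lambda k: sum((a - b) ** 2 for a, b in zip(p, centers[k])))
            if best != labels[i]:
                labels[i] = best
                changed += 1
        # Recompute each center as the mean spectrum of its members
        for k in range(n_classes):
            members = [p for p, lab in zip(pixels, labels) if lab == k]
            if members:
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
        if changed / len(pixels) < change_threshold:
            break
    return labels, centers

# Synthetic 8-band "pixels" standing in for the Worldview-2 reflectance image
rng = random.Random(1)
pixels = [[rng.random() for _ in range(8)] for _ in range(500)]
labels, centers = kmeans(pixels)
```

The resulting unlabelled clusters would then be compared with the ground truth map and split or merged by an analyst, as described in the text.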

The 18 selected classes were grouped into 4 main land cover targets of interest, "Water", "Bare lands", "Vegetated lands" and "Artificial", as shown in Table 1. The target "Artificial" was eliminated because it is poorly represented in the scene. A sample of each considered class is shown in Figure 3.



Table 1. Different land cover classes selected for supervised classification.

#### **3.1 Analysis of the targets' spectral profiles**

The analysis of the targets' spectral profiles was carried out using a hyperspectral image from the MIVIS airborne system, acquired on 25 May 2009 at 06:18 UTC. This image could be used because its acquisition period is comparable to that of the Worldview-2 image. MIVIS (Multispectral Infrared and Visible Imaging Spectrometer) is a hyperspectral sensor consisting of 4 spectrometers which acquire radiation coming from the surface in the VNIR (20 bands between 0.411 and 0.819 μm), in the NIR (8 bands between 1.145 and 1.54 μm), in the MIR (64 bands between 1.992 and 2.474 μm) and in the TIR (10 bands between 8.34 and 12.42 μm). The result of the MIVIS pre-processing step is an image with pixels given in radiance values (μW/(cm²·sr·nm)). In order to compare Worldview-2 and MIVIS spectral profiles, the analysis was focused on the 20 bands of the VNIR spectrometer, which overlap the Worldview-2 bands. Details of the VNIR MIVIS bands and their comparison with the Worldview-2 bands are shown in Table 2.

Fig. 3a. The different classes grouped in the target "Water".

Fig. 3b. The different classes grouped in the target "Bare Land".


Fig. 3c. The different classes grouped in the target "Vegetated Land".


Table 2. MIVIS and Worldview-2 spectral details.

Because of its flexible airborne platform for remote sensing, the MIVIS system is able to acquire images with a good spatial resolution. The MIVIS acquisition used for this analysis was made at a height of 1.5 km, so the spatial resolution is 3 m at nadir. In order to compare MIVIS spectra with those produced by Worldview-2, the pixel values of the MIVIS images were also converted into reflectance.
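As a rough check of the relation between flight height and nadir resolution, the usual small-angle approximation links the ground sampling distance (GSD) to the flight height $H$ and the instantaneous field of view (IFOV). The IFOV value below is not stated in the text; it is simply the value implied by the two numbers quoted above.

$$\text{GSD} \approx H \cdot \text{IFOV} \quad\Rightarrow\quad \text{IFOV} \approx \frac{3\ \text{m}}{1500\ \text{m}} = 2\ \text{mrad}$$

Under this approximation, halving the flight height would halve the pixel size at nadir.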

Because the quality of the MIVIS pre-processing calibration is unknown, the comparison between the Worldview-2 and the MIVIS profiles has to be considered in terms of spectral profile trends. In addition, no atmospheric correction was applied to either image [6]. As a consequence, it should be kept in mind that the atmosphere contributes differently to the reflectance measured by the two sensors, owing to the different acquisition days and the different flight heights. Figure 4 shows a subset of the MIVIS acquisition corresponding to the Worldview-2 image. Close to the right edge of the frame a slight sunglint pattern is visible, which presumably influences the reflectance of the sea.

The spectral analysis was carried out by selecting regions which can be considered representative of the 18 classes identified by the unsupervised analysis and the ground truth map. The time interval of about one year between the MIVIS and Worldview-2 acquisitions restricts the selection of target areas to regions not affected by significant changes between the two dates. For example, on some arable land the crops may have changed owing to agricultural practices, or may have been covered with screening covers. Moreover, the two images were acquired in two different spring months, corresponding to different phenological stages of the same crop.

Fig. 4. MIVIS image acquired on 25th May 2009 at 06:18 UTC.


In Figure 5, Worldview-2 (top) and MIVIS (bottom) water target spectral profiles are shown. As can be seen for all the target profiles, there is an atmospheric contribution to the Worldview-2 reflectance. In particular, at the shorter wavelengths of the COASTAL and BLUE bands Rayleigh scattering is the prevailing contribution, while at longer wavelengths, such as the NIR2 band, water vapor absorption is dominant. The gaseous absorption is also visible in the MIVIS profiles. It can be noted that:


Fig. 5. Worldview-2 (top) and MIVIS (bottom) spectral profiles of water classes.


Fig. 6. Natural Oasis of Lago Salso.

In Figure 7, Worldview-2 (top) and MIVIS (bottom) spectral profiles for vegetated land regions are shown. MIVIS spectral profiles of the six vegetated land classes show the typical vegetation trend. The absorption of chlorophyll in the BLUE and RED regions of the spectrum can be observed, together with a peak in the GREEN region, which gives rise to the green color of vegetation. In the NIR the reflectance is much higher than in the visible range due to the cellular structure of the leaves. The slope of the spectral profile between RED and NIR is characteristic of the vegetation species and gives information about plant health [9]. The spectral profiles also show a reduction in band 20 (Table 2) due to atmospheric absorption. Analyzing the MIVIS spectra, some considerations about the vegetated land classes can be made. The class labeled "Arable Land with Vegetation 2" shows a smaller increase of reflectance between RED and NIR, and a reflectance peak in the GREEN range which is less evident than in the other profiles. Considering the particular color of these regions and the presence of an almost regular texture, it is possible that this class corresponds to arable fields covered by a thick net typical of local agricultural practices.
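The RED-to-NIR slope mentioned above can be made concrete with a short sketch. The band-centre wavelengths and the reflectance values are illustrative assumptions, not measurements from the Worldview-2 or MIVIS profiles; the second example merely mimics the flatter profile described for the net-covered fields.

```python
def red_nir_slope(rho_red, rho_nir, wl_red_nm=660.0, wl_nir_nm=833.0):
    """Reflectance change per nanometre between the RED and NIR band centres.

    The default wavelengths are assumed band centres, for illustration only.
    """
    return (rho_nir - rho_red) / (wl_nir_nm - wl_red_nm)

# Hypothetical reflectances: a dense canopy versus a net-covered field
dense = red_nir_slope(rho_red=0.05, rho_nir=0.45)
netted = red_nir_slope(rho_red=0.12, rho_nir=0.25)
print(dense > netted)
```

A steeper slope corresponds to a stronger contrast between chlorophyll absorption in the RED and leaf scattering in the NIR.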

The classes labeled "Vegetated Marshy Areas 1" and "Vegetated Marshy Areas 2" correspond to different kinds of vegetation characterizing the "Natural Oasis of Lago Salso" (Figure 6). The Worldview-2 profiles, due to atmospheric effects, do not show the typical trend of vegetation spectra in the visible range. The absorption peak in the BLUE range is suppressed by the Rayleigh scattering contribution, which decreases with increasing wavelength. By contrast, the range of the spectrum from RED to NIR1 can be useful for vegetation characterization and, except for a few differences (which could be explained by the time interval between the two acquisitions), the MIVIS and Worldview-2 spectral profiles are in reasonable agreement. The class labeled "Forested Area" is mainly composed of the Siponto pine forest (Figure 8), located in a coastal area on the Manfredonia Gulf. The spectral profile obtained by MIVIS and confirmed by Worldview-2 shows a low reflectance in the NIR spectral range, which correlates with a lower vegetation leaf area index (LAI) [10].

In Figure 9, Worldview-2 (top) and MIVIS (bottom) spectral profiles for bare land regions are shown. In this case the trend of the MIVIS spectral profiles is in agreement with that of the Worldview-2 ones, although the better spectral resolution of MIVIS captures finer spectral signatures for every class.

Fig. 7. Worldview-2 (top) and MIVIS (bottom) spectral profiles of vegetated land classes.

Fig. 8. Siponto Pine Forest.


It can be noted that "Arable Land without Vegetation 1", "Arable Land without Vegetation 2" and "Arable Land without Vegetation 3" are characterized by a spectral signature similar in shape but with an increasing average reflectance. This can be explained considering that the reflectance level decreases for soil with increasing moisture [11] and so the different targets could be associated with different moisture content. "Sand", instead, shows a spectral profile which differs from the other ones probably due to the extremely different composition of the soil.

Fig. 9. Worldview-2 (top) and MIVIS (bottom) spectral profiles of Bare Land classes.

#### **4. Processing, results and discussion**

For the supervised analysis, the standard statistical "Maximum Likelihood" (ML) algorithm was used. The 18 classes of Table 1, recognized on the scene with the guidance of the segmentation step and better characterized with the help of the MIVIS data, were selected. Randomly selected training (TR) and test (TE) sets were used, respectively, to train the algorithm and to assess the accuracy of the produced maps. For the accuracy over all the classes, the Overall Accuracy percentage (OA%) (i.e. the number of correctly classified pixels divided by the total number of pixels) was computed, together with an estimate of its confidence interval at the 95% level [12]. For the accuracy of each class, the Mapping Accuracy percentage (MA%) [13], [14] was computed. It is defined as:

$$\text{MA\%} = \frac{\text{pixels}_{\text{correctly classified}}}{\text{pixels}_{\text{correctly classified}} + \text{pixels}_{\text{omission}} + \text{pixels}_{\text{commission}}} \cdot 100 \tag{2}$$

where:

$\text{pixels}_{\text{omission}}$ is the number of pixels assigned to other classes along the row of the confusion matrix relevant to the class considered;

$\text{pixels}_{\text{commission}}$ is the number of pixels assigned to other classes along the column of the confusion matrix relevant to the class considered.
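A minimal sketch of how OA% and the MA% of Equation (2) could be derived from a confusion matrix is given below. The matrix is invented for illustration, rows are assumed to hold the reference (ground truth) class and columns the assigned class, and the confidence interval uses the normal approximation to the binomial proportion, which is an assumption and not necessarily the estimator of [12].

```python
import math

def overall_accuracy(cm, z=1.96):
    """Return OA% and the half-width (in %) of a normal-approximation 95% interval."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    p = correct / total
    half_width = z * math.sqrt(p * (1.0 - p) / total)
    return 100.0 * p, 100.0 * half_width

def mapping_accuracy(cm, k):
    """MA% of class k (Equation 2); rows = reference class, columns = assigned class."""
    correct = cm[k][k]
    omission = sum(cm[k]) - correct                   # off-diagonal pixels along row k
    commission = sum(row[k] for row in cm) - correct  # off-diagonal pixels along column k
    return 100.0 * correct / (correct + omission + commission)

# Hypothetical 3-class confusion matrix (pixel counts)
cm = [[50, 3, 2],
      [4, 40, 6],
      [1, 5, 39]]
oa, ci = overall_accuracy(cm)
ma_first = mapping_accuracy(cm, 0)
print(f"OA% = {oa:.1f} +/- {ci:.1f}, MA% (class 0) = {ma_first:.1f}")
```

Because MA% penalizes both omission and commission errors, it is never higher than either the producer's or the user's accuracy of the same class.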

Following [15], several input configurations of the classifier were tested: first the standard 4 spectral bands of the image, then those 4 bands plus a fifth one chosen among the 4 add-on bands of Worldview-2, in order to analyze the specific contribution of each band.


Finally all the 8 band contributions were considered. The results obtained for all the classes with different input bands to the supervised classifier are shown in table 3.

Table 3. Results in the supervised classification.


In training and testing, with an increase in the number of the bands there is an increase in the OA% because more information was added as input to the classifier to improve discrimination among classes. Observing the generalization ability, in testing, an improvement of 10% was achieved with the use of 8 bands with respect to 4 bands. The asterisk indicates the best value.

The same analysis was carried out for each target ("Water", "Bare land", "Vegetated land") in order to evaluate the contribution that each of the 4 add-on bands could give to the characterization of the specific target. A finer discrimination among the classes is expected. For the target "Water", the MA% values in testing for the different input configurations to the classifier are shown in Table 4.


Table 4. MA% in test for the classes belonging to the target Water.

The best MA% value is obtained with the use of 8 bands (marked with a double asterisk), with an average improvement of 20% with respect to the use of only 4 bands. Analyzing the contribution of each add-on band to the single classes (the best value due to an add-on band is marked with a single asterisk), it emerged that:

- the discrimination of "Sea Water 1" (deep) and "Marsh Water" is improved by the NIR2 band. "Marsh Water" is water with vegetation under and over the surface, which could explain the role of NIR2;
- the discrimination of "Sea Water 2" (medium deep) and "Sea Water 3" (coastal) is improved by the YELLOW band, which appears able to recognize water with hanging deposits;
- the discrimination of "River Water", substantially muddy water, is improved by the COASTAL band, which appears able to recognize a mixture of water and mud.

For the target "Bare land", the MA% values in testing for the different input configurations to the classifier are shown in Table 5.

| **Input Configuration** | **ARABLE LAND WITHOUT VEGETATION 1 (dark brown)** | **ARABLE LAND WITHOUT VEGETATION 2 (light brown)** | **ARABLE LAND WITHOUT VEGETATION 3 (orange)** | **ARABLE LAND WITHOUT VEGETATION 4 (very light brown)** | **SAND** |
|---|---|---|---|---|---|
| **4 BANDS** | 80.84 | 38.62 | 48.72 | 55.20 | 39.10 |
| **5 BANDS WITH COASTAL** | 75.05 | 38.08 | 49.48 | 58.74 | 41.40\* |
| **5 BANDS WITH YELLOW** | 81.31\* | 50.53\* | 48.51 | 56.31 | 39.07 |
| **5 BANDS WITH RED EDGE** | 78.71 | 49.78 | 55.96\* | 60.28 | 40.71 |
| **5 BANDS WITH NIR2** | 80.55 | 50.17 | 55.99\* | 65.01\*\* | 39.16 |
| **8 BANDS** | 86.15\*\* | 57.67\*\* | 61.17\*\* | 64.10 | 44.24\*\* |

Table 5. MA% in test for the classes belonging to the target Bare Land.

For all the different spectral signatures, the best MA% value is obtained with the use of 8 bands (marked with a double asterisk), with an average improvement of 10% with respect to the use of only 4 bands. The class "Arable Land without Vegetation 4" is an exception, which can be justified by a high misclassification with the class "Arable Land without Vegetation 3". Analyzing the contribution of each add-on band to the single classes (the best value due to an add-on band is marked with a single asterisk), it emerged that:

- the discrimination of "Arable Land without Vegetation 1" (dark brown) and "Arable Land without Vegetation 2" (light brown) is improved by the YELLOW band;
- the discrimination of "Arable Land without Vegetation 3" (orange) is improved by the NIR2 band and by the RED EDGE band, whereas "Arable Land without Vegetation 4" (very light brown) is improved only by the NIR2 band;
- the discrimination of "Sand" is improved by the COASTAL band.

The different spectral profiles could be explained by the different pedological composition of the soil or by its different water content.

For the "Vegetated land" target, the MA% test classification values obtained with different input bands are shown in Table 6.

The best MA% value is obtained with 8 bands (marked with a double asterisk in the table), with an average improvement of about 3% for "Forested Area" and "Vegetated Marshy Areas" and of about 30% for "Arable Land with Vegetation 1" and "Arable Land with Vegetation 2" with respect to the use of only 4 bands. For each class, the best result obtained with a specific add-on band is marked with a single asterisk in the table.

Table 6. MA% in test for the classes belonging to the target Vegetated Land.

#### **5. Conclusions**

This paper describes the experimental activity aimed at exploiting the new Worldview-2 sensor, with respect to the effectiveness of the new add-on COASTAL, YELLOW, RED EDGE and NIR2 bands. First, an unsupervised analysis for spectral clustering of the data was applied to discriminate among the different spectral signatures; then a supervised image classification produced a land cover map. Standard/commercial tools were used. In the first step, the clusters in the spectral domain were interpreted with the help of a detailed ground truth map and compared with a hyperspectral data set. This analysis showed that the 8-band sensor is extremely useful to better discriminate different spectral sub-signatures corresponding to the same land cover category. This means that the major capability of the new sensor resides in its capacity to investigate the "ground" diversity underlying the apparent homogeneity of conventional land cover/land use map categorization. From the supervised classification, it was possible to detect changes in the bathymetry for the "Sea Water" classes by using the COASTAL band; moreover, the lowest-wavelength band appears to be significant for the recognition of mixed patterns of water and terrain. The YELLOW band appears significant to detect the presence of hanging deposits or to elicit terrain composition, as characterized by a certain degree of "yellowness". Finally, the RED EDGE and the NIR2 bands seem useful for a better discrimination of ground sites characterized by a mixture of water and vegetation. Moving from the "traditional" 4-band configuration to the new 8-band sensor increased the thematic accuracy by 10%.

#### **6. Acknowledgements**

This work was supported by the project "Flight Risks Mitigation and Nowcasting at Airports" (RIVONA) funded by the Apulia Region, POFESR 2007-2013.

The authors want to thank DigitalGlobe for having offered the opportunity to analyze images from the newest Worldview-2 sensor.

Special acknowledgements to Planetek Italia s.r.l. for supplying the MIVIS data set.

#### **7. References**

[1] Bramante, J. F.; Raju, D. K. & Tsai Min S., Derivation of bathymetry from multispectral imagery in the highly turbid waters of Singapore's south islands: A comparative study, *DigitalGlobe 8-Band Research Challenge 2010*, Available from: http://www.digitalglobe.com/downloads/8bc/8band\_Challenge\_TMSI.pdf.

[2] Ozdemir, I. & Karnieli, A. (2011), Predicting forest structural parameters using the image texture derived from WorldView-2 multispectral imagery in a dryland forest, Israel, *Int. Journal of Appl. Earth Observation and Geoinformation*, Volume 13, Issue 5, Pages 701-710.

[3] Borel, C. C., Vegetative canopy parameter retrieval using 8-band data, *DigitalGlobe 8-Band Research Challenge 2010*, Available from: http://www.digitalglobe.com/downloads/8bc/borel\_8band\_paper\_12\_14\_10.pdf.

[4] Peroni, G.; Gachelin, J.P.; Saint-Pol, M.; Legoff, V.; Fontanot, F. & Sannier C. (2010), New spectral data available for the controls in agriculture (CWRS) and for vegetation monitoring, *Proc. of the 16th GeoCAP Annual Conference*.

[5] GIS Apulia Region, Italy, Available from: http://www.sit.Apulia.it/

[6] Baraldi, A. (2009), Impact of radiometric calibration and specifications of spaceborne optical imaging sensors on the development of operational automatic remote sensing image understanding systems, *IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing*, Vol. 2, No. 2.

[7] Stramski, D.; Wozniak, S. B. & Flatau, P. J. (2004), Optical properties of Asian mineral dust suspended in seawater, *Limnol. Oceanogr*.

[8] Robinson, I.S. (2004), *Measuring the Oceans from Space - The principles and methods of satellite oceanography*, Springer.

[9] Govender, M.; Chetty, K. & Bulcock, H. (2007), A review of hyperspectral remote sensing and its application in vegetation and water resource studies, *Water SA*, 33(2), 1–8.

[10] Schlerf, M.; Atzberger, C. & Hill J. (2005), Remote sensing of forest biophysical variables using HyMap imaging spectrometer data, *Remote Sens. Environ.* 95, 177-194.

[11] Bowers, S.A. & Hanks, A.J., Reflection of radiant energy from soil, *Soil Science*, 100: 130.

[12] Baraldi, A.; Puzzolo, V.; Blonda, P.; Bruzzone, L. & Tarantino C. (2006), Automatic Spectral Rule-based Preliminary Mapping of Calibrated Landsat TM and ETM+ Images, *IEEE Trans. on Geoscience and Remote Sensing*, Vol. 44, No. 9.

[13] Short, N.M., The Remote Sensing Tutorial, NASA, Available from: http://rst.gsfc.nasa.gov.

[14] Congalton, R. & Green K. (1999), *Assessing the Accuracy of Remotely Sensed Data: Principles and Practices*, CRC/Lewis Press, Boca Raton.

[15] Puetz, A.M.; Lee, K. & Olsen R.C. (2009), Worldview-2 data simulation and analysis results, *Proc. of SPIE*, Vol. 7334.

### **Convex Set Approaches for Material Quantification in Hyperspectral Imagery**

Juan C. Valdiviezo-N and Gonzalo Urcid *Optics Department, INAOE, Mexico*

#### **1. Introduction**

Emerging from the combination of optics and spectroscopy, high resolution imaging spectrometers have opened a new perspective for the monitoring, identification and quantification of natural resources on Earth's surface, known today as *hyperspectral remote sensing*. An imaging spectrometer is an instrument that images the energy reflected or scattered by an object in hundreds of spectral bands at different portions of the electromagnetic spectrum. Although these devices were developed for remote sensing purposes, their applications have increased substantially in recent years because of their capabilities in materials identification, and they are now also used in biology, medicine and related areas (Huebshman et al, 2005). In contrast to multispectral devices, where each imaged spectral band covers a wide spectral range, a hyperspectral sensor has a finer spectral resolution, with bands usually narrower than 10 nm; thus, the number of spectral bands captured by the sensors represents an important difference between both technologies. Once the hyperspectral data have been appropriately calibrated, taking into account the illumination factors and the atmospheric effects, the spectral information registered at each pixel of the image allows a direct identification of any imaged object based on its spectrum.

In Earth observation, a hyperspectral sensor usually has a low spatial resolution, owing either to the characteristics of the instrument or to the flight altitude of the aerial platform: the spatial resolution decreases as the distance from the Earth increases. For a sensor with a spatial resolution on the order of meters, the spectral reflectance captured in a single pixel of the image is composed of the mixed reflectance spectra of the different materials or objects present in that physical area. Therefore, the image data are formed by pixels whose spectral information corresponds to a mixture of the spectra of the constituent materials. Many authors have proposed representing these spectral mixtures as a linear combination of the constituent materials' spectra weighted by their corresponding abundances (Boardman, 1993; Keshava, 2003; Winter, 1999). This model, frequently known as the *constrained linear mixing model* (CLMM), has been the basis for several autonomous techniques for the unsupervised identification of constituent materials from hyperspectral imagery, and can be considered a convex set representation.
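To make the linear mixing idea concrete, the following minimal Python sketch builds one mixed pixel as a convex combination of constituent spectra. It assumes the usual CLMM constraints (non-negative abundances that sum to one); the function name, the four-band spectra and the abundance values are invented for illustration and do not come from the chapter.

```python
def mix_spectra(endmembers, abundances, tol=1e-9):
    """Return the mixed spectrum sum_i a_i * e_i under the CLMM constraints.

    endmembers: list of spectra (lists of equal length, one value per band).
    abundances: one fraction per endmember; must be >= 0 and sum to 1.
    """
    if len(endmembers) != len(abundances):
        raise ValueError("one abundance per endmember is required")
    if any(a < 0.0 for a in abundances):
        raise ValueError("abundances must be non-negative")
    if abs(sum(abundances) - 1.0) > tol:
        raise ValueError("abundances must sum to one")
    n_bands = len(endmembers[0])
    return [
        sum(a * e[band] for a, e in zip(abundances, endmembers))
        for band in range(n_bands)
    ]


# Hypothetical reflectance spectra in four bands (illustrative values only).
water = [0.05, 0.04, 0.02, 0.01]
soil = [0.15, 0.20, 0.30, 0.35]
vegetation = [0.04, 0.08, 0.05, 0.50]

# A pixel covered 20% by water, 50% by soil and 30% by vegetation.
pixel = mix_spectra([water, soil, vegetation], [0.2, 0.5, 0.3])
print(pixel)
```

Under these constraints every mixed pixel lies inside the convex hull of the constituent spectra, which is why the model can be read as a convex set representation.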

This chapter presents a general overview of the techniques based on a convex set representation that have been used to identify the constituent materials of a hyperspectral scene. Besides presenting some classical methods used for this purpose, we emphasize a recently published technique that relies on lattice algebra to approximate a minimum convex set. The chapter is organized as follows. Section 2 presents the physical foundations of the hyperspectral imaging process, including data characteristics and their appropriate calibration. Section 3 states the mathematical background needed to understand fundamental concepts such as minimum convex sets, affine independence, and their relation with constituent materials in hyperspectral data. Section 4 describes some classical as well as recent techniques for the autonomous endmember determination process. Section 5 starts with a brief mathematical background on lattice algebra, which is necessary to understand the endmember determination method described afterwards, and then presents two canonical lattice associative memories whose geometrical properties are used to define a convex hull from hyperspectral data. Section 6 provides two application examples that illustrate the autonomous identification of natural resources from two scenes registered, respectively, over the Gulf of Mexico and the Beltsville area in Maryland (USA). There, the endmember identification is carried out using lattice associative memories and another novel method known as *vertex component analysis* (VCA). Finally, the last section gives some pertinent comments and the conclusions of this chapter.

#### **2. Hyperspectral imaging**

The development of more sophisticated imaging technologies in combination with high resolution spectrometers has given rise to a new perspective in remote sensing, in which it is possible to register simultaneously the spatial and spectral information of the energy reflected from Earth's surface. These instruments, known as imaging spectrometer systems, image the solar radiance reflected from or emitted by materials on the surface in hundreds of narrow and contiguous spectral bands, usually in the reflective solar portion of the spectrum (from 0.35 to 2.5 *μ*m). In remote sensing terminology, the region from approximately 0.35 to 1.0 *μ*m is known as the *visible/near infrared* (VNIR) and the range from 1.0 to 2.5 *μ*m is known as the *short wavelength infrared* (SWIR). The resulting hyperspectral data therefore consist of an image cube made up of a number of radiance images that can be used to estimate the reflectance spectra of the scene. Thus, the information contained in a single pixel of a hyperspectral image can be used to compare and identify any object based on its characteristic spectrum, at a specific location of the zone of interest.

#### **2.1 Physical foundations**

There are fundamental matter-energy interaction processes that constitute the basis of the information captured by spectrometer instruments. The electromagnetic radiation coming from the Sun can be modified in its direction, intensity or polarization when reaching the Earth's surface. These changes depend on the physical and chemical constitution of the materials comprising the surface, and can be classified as radiation *transmission, reflection, absorption* or *emission*. When an electromagnetic wave propagating in free space reaches the boundary of a different medium, one part of its energy can be transmitted through the material and the other part can be reflected by the surface. The transmitted portion can be absorbed by some molecules at certain frequencies, increasing the energy of their electrons and changing their energy level. After a short time in the excited state, the electrons return to their original state, producing an emission of energy at lower frequencies. These interaction processes are used in spectroscopy to characterize materials in nature, since they absorb or emit electromagnetic radiation at different wavelengths depending on their physical constitution. Hence, the materials covering the Earth's surface can be identified in hyperspectral data according to the absorption or emission bands present in the spectra recorded by the sensor.

Fig. 1. Two types of scanning systems used to register a hyperspectral scene; the number of spectral bands is determined by the detectors that cover specific wavelength intervals *λ*.

For remote sensing purposes, imaging spectrometers are placed onboard aerial platforms, mainly satellites or airplanes. The fundamental parts of a hyperspectral remote sensing system are: (1) optics to collect light, (2) a mechanism to scan the *instantaneous field of view* (IFOV) of the spectrometer over a scene, and (3) a set of spectrometers. The image acquisition process is as follows. A scanning mirror coupled to the mechanical system, together with the platform motion, is used to collect the reflected energy coming from the surface. The scanning of each line of the image can be realized using different systems. If the optics forms an image of a single point on the ground, such that a line scanner sweeps a long line across the track of the platform motion, the scanner is called a "whiskbroom system". If the optics forms the image of a large slit, such that no scan mechanism other than the platform motion is needed to form an image, the scanner is called a "pushbroom system" (see Fig. 1). Still another kind of system uses a linear variable filter over a two dimensional array of photodetectors (Jensen, 2007). After the energy has been collected, the incoming light is led through a set of spectrometers that split it into many narrow bands of energy by means of a dispersive element, which can be either a grating or a prism. The energy coming from the dispersive elements is recorded by photodetectors, each sensitive to a specific wavelength interval, giving rise to the spectral bands of the image.

#### **2.1.1 Spatial and spectral resolution**

In imaging spectrometers there are two basic characteristics that define the degree of resolution of the system. The *spatial resolution* is a measure of the minimum detail on the surface that can be captured by a given remote sensor. It depends on the characteristics of the sensor itself and on the flight altitude of the aerial platform. In particular, for a grating spectrograph hyperspectral imager, the spatial resolution in the *y* direction is set by the size of the pixels of the *charge-coupled device* (CCD) camera and the microscope system magnification, whereas in the *x* direction it depends on the spectrometer slit width and the microscope system magnification (Huebshman et al, 2005). Let Δ*x* and Δ*y* be, respectively, the *x* and *y* dimensions of the CCD pixels and let the magnification be *M*. Note that the slit width of the spectrometer, *w<sub>s</sub>*, is always going to be larger than Δ*x*. Then, the spatial resolution in the *y* direction is 2Δ*y*/*M*, while in the *x* direction it is 2*w<sub>s</sub>*/*M*. Another common definition of spatial resolution, relating the pixel size and the flight altitude, refers to the physical area on the surface occupied by a single pixel. Clearly, the resolution increases as the altitude of the aerial platform decreases.

On the other hand, the *spectral resolution* refers to the number and bandwidth of the spectral bands that a sensor can register. Spectral resolution depends on the spectrometer components, which include the slit width, the dispersion of the grating or prism, and the pixel size of the sensor device. For example, for a CCD pixel 10 *μ*m on a side, the dispersion at normal operation is determined to be approximately 40 nm per mm or, equivalently, 0.4 nm per pixel.
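As a quick check of the figures quoted above, and reading the pixel size as 10 *μ*m (0.010 mm) along the dispersion direction, the spectral sampling per pixel follows directly from the dispersion:

$$40\ \frac{\text{nm}}{\text{mm}} \times 0.010\ \frac{\text{mm}}{\text{pixel}} = 0.4\ \frac{\text{nm}}{\text{pixel}}$$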

#### **2.2 Reflectance estimation from sensors**

The light intercepted by the entrance aperture of a sensor is the quantity known as radiance. Given that the spectral reflectance is a physical quantity related to material properties, it is necessary to estimate the reflectance spectra<sup>1</sup> from the radiance information captured in hyperspectral data. For this purpose, the background energy level of the Sun must be removed and the scattering and absorbing effects of the atmosphere must be compensated for. There are three main techniques for estimating the spectral reflectance, which can be classified as image based, empirical, or model based approaches. An image based approach uses only data measured by the instrument, and requires that the images include regions of relatively uniform reflectance. Any absorption present in the measured spectra of these regions can then be attributed to one of the mentioned effects, and such effects can therefore be compensated for over the complete image. Dividing each image spectrum by the flat field spectrum converts the scene to relative reflectance. On the other hand, empirical methods employ both remotely sensed data and field measurements of reflectance, denoted by *r*(*λ*), to solve a linear equation of at-sensor radiance, such that,

$$L(\lambda) = b\,r(\lambda) + c, \tag{1}$$

where *L*(*λ*) is the radiance captured by the sensor that varies with wavelength *λ*, and *b*, *c* represent, respectively, multiplicative and additive terms that adjust the sensor radiance.
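The empirical approach of Eq. (1) can be sketched as an ordinary least-squares fit of *b* and *c* for one band, followed by the inversion *r* = (*L* − *c*)/*b*. The snippet below is a minimal illustration under that assumption; the function names and the calibration values are hypothetical, and in practice the fit would be repeated for every spectral band.

```python
def fit_empirical_line(reflectance, radiance):
    """Least-squares estimate of (b, c) in L = b * r + c for a single band."""
    n = len(reflectance)
    if n < 2 or n != len(radiance):
        raise ValueError("need at least two matching (r, L) pairs")
    mean_r = sum(reflectance) / n
    mean_l = sum(radiance) / n
    s_rr = sum((r - mean_r) ** 2 for r in reflectance)
    if s_rr == 0.0:
        raise ValueError("field reflectances must not all be equal")
    s_rl = sum((r - mean_r) * (l - mean_l)
               for r, l in zip(reflectance, radiance))
    b = s_rl / s_rr            # multiplicative term
    c = mean_l - b * mean_r    # additive term
    return b, c


def radiance_to_reflectance(radiance, b, c):
    """Invert Eq. (1) to recover reflectance from at-sensor radiance."""
    return (radiance - c) / b


# Hypothetical calibration targets: field reflectance and at-sensor radiance
# in one band (arbitrary radiance units, illustrative values only).
field_reflectance = [0.05, 0.30, 0.60]
sensor_radiance = [12.0, 37.0, 67.0]

b, c = fit_empirical_line(field_reflectance, sensor_radiance)
print(b, c, radiance_to_reflectance(47.0, b, c))
```

Using one dark and one bright target is the minimum; additional targets make the fit less sensitive to measurement noise.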

Model based approaches seek to represent all the factors involved in the radiance acquired, on a pixel by pixel basis, including atmospheric perturbations. For this purpose, a simulated solar irradiance spectrum is used; the method then estimates the solar radiance for the day and hour of image acquisition, as well as the absorption and scattering effects of the atmosphere. Hence, the solar radiance impinging on the sensor, *L<sub>s</sub>*, as a function of wavelength *λ* can be modeled as

$$L_s(\lambda) = \frac{1}{\pi}\left(E\,r(\lambda) + M_T\right)\tau_\theta + L_p, \tag{2}$$

where *E* is the irradiance on the Earth's surface, *r*(*λ*) is the reflectance of the surface, *M<sub>T</sub>* is the spectral radiant exitance at temperature *T*, *τ<sub>θ</sub>* is the transmissivity of the atmosphere at zenith angle *θ*, and *L<sub>p</sub>* is the spectral path radiance of the atmosphere (Farrand, 2005). Solving Eq. (2) for *r*(*λ*) gives accurate reflectance estimates, since the model includes all the factors contributing to the image acquisition process. Model based approaches are also employed to estimate atmospheric properties directly from the hyperspectral data.

<sup>1</sup> Recall that reflectance is defined as the ratio of the energy reflected from a material to the incident light falling on it.
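Rearranging Eq. (2) for the reflectance gives *r*(*λ*) = [*π*(*L<sub>s</sub>* − *L<sub>p</sub>*)/*τ<sub>θ</sub>* − *M<sub>T</sub>*]/*E*. The short Python sketch below implements the forward model and this inversion for a single wavelength. The function names and numerical values are placeholders chosen only to show the round trip; a real workflow would take *E*, *τ<sub>θ</sub>* and *L<sub>p</sub>* from an atmospheric model.

```python
import math


def at_sensor_radiance(r, E, M_T, tau, L_p):
    """Forward model of Eq. (2) at one wavelength."""
    return (E * r + M_T) * tau / math.pi + L_p


def surface_reflectance(L_s, E, M_T, tau, L_p):
    """Eq. (2) solved for the surface reflectance r."""
    return (math.pi * (L_s - L_p) / tau - M_T) / E


# Placeholder inputs in consistent (arbitrary) units, illustrative only.
E = 1000.0    # irradiance on the surface
M_T = 0.0     # thermal exitance, negligible in the reflective solar range
tau = 0.8     # atmospheric transmissivity
L_p = 5.0     # path radiance

L_s = at_sensor_radiance(0.25, E, M_T, tau, L_p)
print(L_s, surface_reflectance(L_s, E, M_T, tau, L_p))
```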

#### **2.3 Imaging spectrometers**

4 Will-be-set-by-IN-TECH

pixels of the *charge-coupled device* (CCD) camera in the *y* direction and the microscope system magnification. However, in the *x* direction, the resolution depends on the spectrometer slit width and the microscope system magnification (Huebshman et al, 2005). Let Δ*x* and Δ*y* be, respectively, the *x* and *y* dimensions of the CCD pixels, and let the magnification be *M*. Note that the slit width of the spectrometer, $w_s$, is always larger than Δ*x*. Then the spatial resolution in the *y* direction is 2Δ*y*/*M*, while in the *x* direction it is $2w_s/M$. Another common definition of spatial resolution, which relates the pixel size to the flight altitude, refers to the physical area on the surface covered by a single pixel. Clearly, the resolution increases as the altitude of the aerial platform decreases.

On the other hand, the *spectral resolution* refers to the number and bandwidth of the spectral bands that a sensor can register. Spectral resolution depends on the spectrometer components, including the slit width, the dispersion of the grating or prism, and the pixel size of the sensor device. For example, for a CCD with square pixels 10 microns on a side, the dispersion at normal operation is determined to be approximately 40 nm per mm or, equivalently, 0.4 nm per pixel.

#### **2.2 Reflectance estimation from sensors**

The light intercepted by the entrance aperture of a sensor is the quantity known as radiance. Since spectral reflectance is a physical quantity related to material properties, it is necessary to estimate the reflectance spectra<sup>1</sup> from the radiance information captured in hyperspectral data. For this purpose, the background energy level of the Sun must be removed, and the scattering and absorbing effects of the atmosphere must be compensated for. Three main kinds of technique can be used to estimate the spectral reflectance: image based, empirical, and model based approaches. An image based approach uses only data measured by the instrument and requires that the images include regions of relatively uniform reflectance (flat fields). Any absorption present in the measured spectra of these regions is then attributed to one of the effects mentioned above, so that such effects can be compensated for over the complete image. Dividing each image spectrum by the flat field spectrum converts the scene to relative reflectance. Empirical methods, on the other hand, employ both remotely sensed data and field measurements of reflectance, denoted by *r*(*λ*), to solve a linear equation for the at-sensor radiance,

$$L(\lambda) = b\,r(\lambda) + c, \tag{1}$$

where *L*(*λ*) is the radiance captured by the sensor, which varies with wavelength *λ*, and *b* and *c* represent, respectively, multiplicative and additive terms that adjust the sensor radiance.

Model based approaches seek to represent all the factors involved in the radiance acquired on a pixel by pixel basis, including atmospheric perturbations. For this purpose, a simulated solar irradiance spectrum is used; the method then estimates the solar radiance on the day and at the hour of image acquisition, together with the absorption and scattering effects of the atmosphere. Hence, the solar radiance impinging on the sensor, $L_s$, as a function of wavelength *λ* can be modeled as

$$L_s(\lambda) = \frac{1}{\pi}\left(E\,r(\lambda) + M_T\right)\tau_\theta + L_p. \tag{2}$$

<sup>1</sup> Recall that reflectance is defined as the ratio of the energy reflected from a material to the incident light falling on it.
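As an illustration of the empirical approach, Eq. (1) can be solved for *b* and *c* at a given wavelength when at least two calibration targets with field-measured reflectance are visible in the image. The following minimal sketch assumes exactly two targets, a dark one and a bright one; the function names and all numeric values are hypothetical and only illustrate the arithmetic.

```python
def fit_empirical_line(r_dark, L_dark, r_bright, L_bright):
    """Solve L = b*r + c (Eq. 1) for (b, c) at one wavelength from two targets."""
    if r_bright == r_dark:
        raise ValueError("calibration targets need different reflectances")
    b = (L_bright - L_dark) / (r_bright - r_dark)   # multiplicative term
    c = L_dark - b * r_dark                          # additive term
    return b, c


def radiance_to_reflectance(L, b, c):
    """Invert Eq. (1): r = (L - c) / b."""
    return (L - c) / b


# Hypothetical radiance and reflectance values for a single band.
b, c = fit_empirical_line(r_dark=0.05, L_dark=12.0, r_bright=0.60, L_bright=78.0)
r_est = radiance_to_reflectance(45.0, b, c)
print(b, c, r_est)
```

In practice the fit would be repeated for every band, and more than two targets would usually be combined in a least squares fit.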

One of the first hyperspectral instruments placed onboard an aircraft for Earth observation is the *Airborne Visible/Infrared Imaging Spectrometer* (AVIRIS). The sensor was developed at NASA's Jet Propulsion Laboratory and is composed of a whiskbroom scanning mirror and a linear array of 224 silicon and indium-antimonide sensors. The fine spectral resolution of the instrument, around 10 nm, allows the acquisition of 224 spectral bands in the spectral range from 0.4 to 2.5 *μ*m. The sensor has a 30° total field of view and an IFOV of 1.0 mrad. When it is placed onboard the ER-2 aircraft, flying at an altitude of 20 km above ground level, each pixel covers around 400 m² on the ground (20 m × 20 m). However, if the instrument is placed on an aircraft flying at an altitude of 4 km, each pixel covers about 16 m² (4 m × 4 m).
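The two pixel areas quoted for AVIRIS follow directly from the IFOV and the flight altitude. The short check below assumes nadir viewing over flat terrain and the small-angle approximation; the function name is hypothetical.

```python
def ground_pixel_side(altitude_m, ifov_rad):
    """Side of the ground footprint of one pixel at nadir (small-angle approximation)."""
    return altitude_m * ifov_rad


for altitude_m in (20_000.0, 4_000.0):
    side = ground_pixel_side(altitude_m, 1.0e-3)   # IFOV of 1.0 mrad
    print(f"{altitude_m / 1000:.0f} km: {side:.0f} m x {side:.0f} m = {side ** 2:.0f} m^2")
```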

Furthermore, CHRIS is a current European imaging spectrometer, now in its ninth year of operation. The instrument has a spatial resolution of 17 m in up to 62 bands. The data captured by the sensor are used in more than 50 countries to support a wide range of applications, such as land surface and coastal zone monitoring. Other imaging spectrometers in use today are the *hyperspectral digital imagery collection experiment* (HYDICE) and the imaging spectrometers belonging to SpecTir (SpecTir, 2009).

Besides the current hyperspectral sensors, three missions are planned to start operating within the next five years. Italy's space agency, ASI, plans to launch a medium resolution hyperspectral imaging mission, known as Prisma, in 2012. The instrument will combine a hyperspectral sensor with a panchromatic medium resolution camera and will be able to acquire 235 spectral bands in the VNIR and SWIR. The German Aerospace Center (DLR) and the German Research Centre for Geosciences (GFZ) are planning to launch the EnMAP hyperspectral satellite in 2014; the sensor is designed to register Earth's surface in over 200 narrow color bands at the same time. In 2015, NASA plans to launch the *Hyperspectral Infrared Imager*, known as HyspIRI. The HyspIRI mission includes two instruments mounted on a satellite in Low Earth Orbit. The first, an imaging spectrometer, will measure from the visible to the short wavelength infrared at a spectral resolution of 10 nm. The second, a multispectral sensor, will cover from 3 to 12 *μ*m in the mid and thermal infrared. Both instruments have a spatial resolution of 60 m at nadir. HyspIRI will thus acquire 210 spectral bands, whose data will be used to study the world's ecosystems and to provide critical information on, among others, the processes that indicate volcanic eruption, the nutrient and water status of vegetation, and deforestation (Esa, 2010).

#### **3. Mathematical background**

In this section, a general mathematical background is given for the endmember search techniques briefly described in the next section. Many of the techniques developed and used for the unsupervised classification of materials in hyperspectral data are based on the theory of convex sets; hence, it is necessary to define some important concepts, such as minimum convex sets and endmembers, together with their geometrical representation in multidimensional spaces. In the following definitions, we assume that a finite set $X$ of $n$-dimensional vectors with real entries is given. Using column notation, we denote this set as $X = \{\mathbf{x}^1, \dots, \mathbf{x}^k\} \subset \mathbb{R}^n$, where $k$ is the number of vectors.

#### **3.1 Convex sets and affine independence**

In the theory of convex sets, a set of vectors in $\mathbb{R}^n$, also considered as points, is said to be *convex* if the straight line segment joining any two of its points lies within the set (Lay, 2007). Given the finite set $X = \{\mathbf{x}^1, \dots, \mathbf{x}^k\} \subset \mathbb{R}^n$ and a set of scalars $\{a_\xi\} \subset \mathbb{R}$ for $\xi \in K = \{1, \dots, k\}$, a *linear combination* of vectors in $X$ is an expression of the form $\sum_{\xi=1}^{k} a_\xi \mathbf{x}^\xi$. Then $X$ is said to be a linearly independent set if the unique solution to the equation $\sum_{\xi=1}^{k} a_\xi \mathbf{x}^\xi = \mathbf{0}$ is given by $a_\xi = 0$ for all $\xi \in K$. Otherwise, the vectors in $X$ are said to be linearly dependent. Furthermore, from a geometrical point of view, an *affine combination* is a linear combination of vectors in $X$ subject to the condition $\sum_{\xi=1}^{k} a_\xi = 1$. If, in addition to the preceding condition, we require that $a_\xi \ge 0$ for all $\xi \in K$, then the expression is called a *convex combination* of vectors. The set of all convex combinations formed with elements of $X$ is known as the *convex hull* of $X$, denoted as $C(X)$.

The notion of affine independence is of fundamental importance in the theory of convex sets and is defined as follows. Let $K_\eta = K \setminus \{\eta\}$ denote the index set from which index $\eta$ has been deleted. If the set of vector differences $X' = \{\mathbf{x}^\xi - \mathbf{x}^\eta : \xi \in K_\eta\}$ is linearly independent for some $\eta \in K$, it can be shown that $X'$ is a linearly independent set for every $\eta \in K$. Therefore, the set $X = \{\mathbf{x}^1, \dots, \mathbf{x}^k\} \subset \mathbb{R}^n$ is said to be *affinely independent* if and only if the set $X' \subset \mathbb{R}^n$ is linearly independent for some $\eta \in K$ (Gallier, 2001). Equivalently, the vectors $\mathbf{x}^1, \dots, \mathbf{x}^k$ are affinely independent if the unique solution to the simultaneous equations $\sum_{\xi=1}^{k} a_\xi \mathbf{x}^\xi = \mathbf{0}$ and $\sum_{\xi=1}^{k} a_\xi = 0$ is given by $a_\xi = 0$ for all $\xi \in K$. Hence, linear independence implies affine independence. It follows from this definition that any two distinct points are affinely independent, any three non-collinear points are affinely independent, and, in general, any $m$ points in $\mathbb{R}^n$ with $m \le n + 1$ are affinely independent if and only if they do not lie in a common $(m - 2)$-dimensional affine subspace of $\mathbb{R}^n$. The convex hull of a set of affinely independent points forms a simplex, which is the minimum convex set containing its vertices; in $\mathbb{R}^n$ a simplex has at most $n + 1$ vertices. In particular, if $X$ consists of $m + 1$ affinely independent points, then $C(X)$ is an $m$-dimensional simplex or $m$-simplex. Thus, a 0-simplex is simply a point, a 1-simplex is a line segment determined by two affinely independent points, a 2-simplex is a triangle determined by three affinely independent points, while a 3-simplex is a tetrahedron defined by four affinely independent points.
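The definition above translates into a simple numerical test: a set of points is affinely independent exactly when the differences with respect to one of them are linearly independent, i.e., when the matrix of differences has full row rank. The sketch below is only an illustration in plain Python; the function names and the tolerance `tol` are assumptions, and a production implementation would more likely rely on a numerical linear algebra library.

```python
def rank(rows, tol=1e-10):
    """Rank of a matrix (list of row lists) by Gaussian elimination with pivoting."""
    m = [list(map(float, r)) for r in rows]
    rk = 0
    n_cols = len(m[0]) if m else 0
    for col in range(n_cols):
        if rk == len(m):
            break
        # Use the largest remaining entry of this column as pivot.
        pivot = max(range(rk, len(m)), key=lambda i: abs(m[i][col]))
        if abs(m[pivot][col]) < tol:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for i in range(rk + 1, len(m)):
            f = m[i][col] / m[rk][col]
            m[i] = [a - f * b for a, b in zip(m[i], m[rk])]
        rk += 1
    return rk


def affinely_independent(points):
    """True if the differences x^xi - x^1 (xi > 1) are linearly independent."""
    base = points[0]
    diffs = [[a - b for a, b in zip(p, base)] for p in points[1:]]
    return len(diffs) == 0 or rank(diffs) == len(diffs)


print(affinely_independent([(0, 0), (1, 0), (0, 1)]))   # triangle: a 2-simplex
print(affinely_independent([(0, 0), (1, 1), (2, 2)]))   # three collinear points
```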

#### **3.2 The constrained linear mixing model**

As discussed in the previous section, a noticeable characteristic of hyperspectral images is that most of the pixels contain mixtures of the spectra of the constituent materials in the scene. According to the physical interaction of light with matter, it is possible to represent such mixtures using a non-linear model if we consider that photons interact with each molecule separately. However, in this representation the estimation of the proportions of each constituent material can be a difficult task. A more practical representation, known as the *constrained linear mixing model* (CLMM), has been used to represent the spectral mixtures on a per-pixel basis in hyperspectral data; the CLMM has been shown to be a good approximation for the abundance estimation of constituent materials when dealing with spectral mixtures, and it is mathematically expressed by

$$\mathbf{x} = \sum_{i=1}^{p} a_i \mathbf{s}^i + \mathbf{r} = \mathbf{S}\mathbf{a} + \mathbf{r}, \tag{3}$$

$$a_i \ge 0 \;\; \forall i \quad \text{and} \quad \sum_{i=1}^{p} a_i = 1, \tag{4}$$

where $\mathbf{x} \in \mathbb{R}^n$ is a spectral pixel acquired over $n$ bands, $S = (\mathbf{s}^1, \mathbf{s}^2, \dots, \mathbf{s}^p)$ is an $n \times p$ matrix whose columns are the spectra of the constituent materials (also known as endmembers), $\mathbf{a} = (a_1, a_2, \dots, a_p)^t$ is a $p$-dimensional vector of the corresponding fractional abundances present in $\mathbf{x}$, and $\mathbf{r}$ is a noise vector (Keshava, 2003). The CLMM requires that the set $S$ of $p$ endmembers be linearly independent and, in general, that the number of endmembers be much smaller than the dimensionality of the pixel spectra ($p \ll n$).

In a geometrical representation, the CLMM described above can also be thought of as a minimum convex set enclosing most of the hyperspectral data, where the spectra of the *p* pure pixels are the vertices of the corresponding simplex (see Fig. 2). Because of the position that pure pixels occupy, these vertices are technically known as *endmembers*. In this way, any other spectral pixel of the image belongs to this convex set and can be completely represented by those endmembers. This statement is the cornerstone of the geometrical approach so frequently used to extract the spectra of constituent materials from hyperspectral data. Furthermore, the fractional abundances of each endmember can be estimated through the inversion of Eq. (3) subject to the restrictions specified by Eq. (4). This process, known as *spectral unmixing* (or *demixing*), makes it possible to quantify the proportion of each endmember in every image pixel. A simple and direct numerical method is provided, in the unconstrained case, by the least squares estimate

$$\mathbf{a} = \mathbf{S}^{+}\mathbf{x} = (\mathbf{S}^{t}\mathbf{S})^{-1}\mathbf{S}^{t}\mathbf{x},\tag{5}$$

where $S^{+}$ denotes the Moore-Penrose pseudoinverse matrix. This estimate exists when the matrix $S$ has full column rank. The abundances that result from it do not necessarily satisfy the constraints imposed in Eq. (4). Full additivity can be enforced using the method of *Lagrange multipliers*, while the non-negativity condition can be enforced by applying the *non-negative least squares numerical method* (Lawson & Hanson, 1974). It is also possible to employ a hybrid method in order to satisfy both constraints simultaneously.
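The unconstrained estimate of Eq. (5) and the full additivity correction can be sketched in a few lines. The sum-to-one step below uses the standard closed form for equality-constrained least squares, $\mathbf{a} = \mathbf{a}_{LS} - (S^tS)^{-1}\mathbf{1}\,(\mathbf{1}^t\mathbf{a}_{LS} - 1)/(\mathbf{1}^t(S^tS)^{-1}\mathbf{1})$, which is an assumption of this example and is not spelled out in the text. The function names and the toy endmember matrix are hypothetical, and non-negativity is not enforced here.

```python
def solve(A, b):
    """Solve the square system A z = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(map(float, A[i])) + [float(b[i])] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda i: abs(M[i][c]))
        if abs(M[p][c]) < 1e-12:
            raise ValueError("S does not have full column rank")
        M[c], M[p] = M[p], M[c]
        piv = M[c][c]
        M[c] = [v / piv for v in M[c]]
        for i in range(n):
            if i != c:
                f = M[i][c]
                M[i] = [vi - f * vc for vi, vc in zip(M[i], M[c])]
    return [M[i][n] for i in range(n)]


def unmix(S, x):
    """Return (unconstrained, sum-to-one) abundance estimates for pixel x.

    S is given as n rows of p entries, so that its columns are the endmembers.
    """
    n, p = len(S), len(S[0])
    StS = [[sum(S[k][i] * S[k][j] for k in range(n)) for j in range(p)] for i in range(p)]
    Stx = [sum(S[k][i] * x[k] for k in range(n)) for i in range(p)]
    a_ls = solve(StS, Stx)                  # Eq. (5): (S^t S)^{-1} S^t x
    z = solve(StS, [1.0] * p)               # (S^t S)^{-1} 1
    lam = (sum(a_ls) - 1.0) / sum(z)        # Lagrange multiplier term
    a_sto = [a - lam * zi for a, zi in zip(a_ls, z)]
    return a_ls, a_sto


# Toy example: 3 bands, 2 endmembers s^1 = (1, 0, 1)^t and s^2 = (0, 1, 1)^t.
S = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
a_ls, a_sto = unmix(S, [0.35, 0.70, 1.00])   # slightly noisy mixture
print(a_ls, a_sto, sum(a_sto))
```

For a noise-free pixel both estimates coincide with the true abundances; with noise, only the second one is guaranteed to sum to one.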

#### **4. Autonomous methods for endmember determination**

Because the goal in the analysis of hyperspectral data is the quantification of the materials comprising the scene, it is important to determine the endmember spectra, either experimentally or numerically. An experimental identification of these spectra implies the use of another device, such as a spectroradiometer or a spectrometer, to measure directly the reflectance spectra of materials belonging to the area under study; however, this methodology is impractical in many situations because it requires an additional effort to collect samples from the zone of interest. A more practical methodology is to extract as much of this information as possible directly from the image data. In addition, assuming that most pixels in the image are composed of spectral mixtures, the constituent materials are identified as those pixels having the spectrum of only one material. Based on this hypothesis, several authors have proposed and developed different methodologies for the autonomous identification of spectrally pure pixels from the image itself. In this section we review some important techniques that have been applied for this purpose and that adopt the constrained linear mixing model to represent the spectral mixtures at image pixels.

Fig. 2. Left: a 2-simplex whose vertices are three spectrally pure pixels in the image (spectra A, B, and C; axes: reflectance in bands *i* and *j*). Right: a 3-simplex defined by four spectrally pure pixels in the image (spectra A, B, C, and D; axes: reflectance in bands *i*, *j*, and *k*), forming a tetrahedron. Both simplices enclose all the spectral data.

One of the earliest efforts for endmember extraction was proposed by Boardman and is known as the *pixel purity index* (PPI) (Boardman, 1995). The algorithm is based on the geometry of convex sets to extract the vertices of a convex hull. Starting with a dimensionality reduction applied to the original data cube by means of the minimum noise fraction transform, PPI generates a large number of random *n*-dimensional vectors, known as "skewers", through the dataset. Every pixel vector in the input data is projected onto each skewer, and its position along the skewer is recorded. The pixels that correspond to extreme points in the direction of a skewer are identified and placed on a list, and their pixel purity score is incremented. After many repeated projections, those pixels with a score above a certain threshold are marked as candidate "pure" pixels. From the resulting set of candidate spectra, one can manually select those pixels that correspond to pure spectra. It is important to remark that the PPI algorithm was originally conceived as a guide to endmember determination, since it requires comparing the selected spectra with those obtained from a spectral library in order to identify the final set of endmembers.
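The skewer projection step described above can be sketched as follows. This is a minimal illustration, not Boardman's implementation: it assumes the pixels have already been reduced in dimension (the minimum noise fraction transform is omitted), it draws skewers from a Gaussian distribution, and the function name, the number of skewers, and the seed are hypothetical choices.

```python
import random


def pixel_purity_index(pixels, n_skewers=500, seed=0):
    """Count how often each pixel is an extreme of a random projection."""
    rng = random.Random(seed)
    dim = len(pixels[0])
    scores = [0] * len(pixels)
    for _ in range(n_skewers):
        skewer = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        proj = [sum(s * v for s, v in zip(skewer, p)) for p in pixels]
        # Both ends of the projection are extreme points along this skewer.
        scores[proj.index(max(proj))] += 1
        scores[proj.index(min(proj))] += 1
    return scores


# Three pure pixels (triangle vertices) followed by three mixed pixels.
pixels = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0), (1.0, 1.0), (2.0, 1.0), (1.0, 2.0)]
print(pixel_purity_index(pixels))
```

Only the vertices of the convex hull can be extremes of a linear projection, so the mixed pixels keep a score of zero and a threshold on the score separates the candidates.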

The *minimum volume transform* (MVT) algorithm computes the minimum volume simplex enclosing the data (Craig, 1994). This proposal is based on the observation that scatter diagrams of multispectral remote sensing data tend to be triangular or pyramidal for the two- or three-band cases, respectively, radiating away from the *dark-point*, which represents the sensor's response to an unilluminated object. Therefore, a minimum volume transform may be described as a non-orthogonal linear transformation of the multivariate data to new axes passing through the dark-point, whose directions are chosen such that they embrace the data cloud. Thus, the determined MVT can be used to unmix images into new spatial variables showing the proportions of the different cover types present in the remotely sensed scene.

The N-FINDR algorithm is an *iterative simplex volume expansion* procedure which assumes that the volume contained by the simplex whose vertices are the purest pixels is always greater than the volume formed by any other combination of pixels (Winter, 1999). The input for the algorithm is the full data cube, which is reduced in dimension by a subsequent projection. The selection of the vertices starts with a random choice of a set of *q* vectors as endmember candidates, followed by the computation of the volume of the simplex formed by these initial endmembers. The process continues iteratively by replacing every endmember, one at a time, with a pixel in the image and computing the respective volume. Hence, the pixel purity likelihood is evaluated by calculating the volume for every pixel in the place of each endmember. If the replacement results in a volume increase, then the pixel replaces the corresponding endmember. The procedure is repeated until no more endmembers are replaced; the final spectra are then considered as pure pixels and can be used as endmembers to estimate their corresponding abundances. It is important to remark that the accuracy of the method depends on the initial selection of endmembers.
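The volume expansion loop can be illustrated with the following sketch. It assumes that the pixels have already been projected to $q - 1$ dimensions, so that $q$ candidates span a full-dimensional simplex, and it takes the initial candidates as an explicit argument `init` (which could be drawn at random, as in the description above). The function names are hypothetical and the code is not Winter's implementation.

```python
from math import factorial


def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [list(map(float, row)) for row in matrix]
    n, sign, result = len(m), 1.0, 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda i: abs(m[i][c]))
        if m[p][c] == 0.0:
            return 0.0
        if p != c:
            m[c], m[p] = m[p], m[c]
            sign = -sign
        result *= m[c][c]
        for i in range(c + 1, n):
            f = m[i][c] / m[c][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[c])]
    return sign * result


def simplex_volume(vertices):
    """Volume of the simplex spanned by q points in R^(q-1)."""
    base = vertices[0]
    edges = [[a - b for a, b in zip(v, base)] for v in vertices[1:]]
    return abs(det(edges)) / factorial(len(edges))


def nfindr(pixels, init):
    """Replace one candidate at a time while the simplex volume keeps growing."""
    current = list(init)                       # indices of the candidate endmembers
    best = simplex_volume([pixels[i] for i in current])
    changed = True
    while changed:
        changed = False
        for pos in range(len(current)):
            for idx in range(len(pixels)):
                trial = current[:pos] + [idx] + current[pos + 1:]
                vol = simplex_volume([pixels[i] for i in trial])
                if vol > best:                 # keep the swap only if the volume increases
                    current, best, changed = trial, vol, True
    return sorted(current), best


# Three mixed pixels followed by the three pure pixels (triangle vertices).
pixels = [(1, 1), (2, 1), (1, 2), (0, 0), (4, 0), (0, 4)]
print(nfindr(pixels, init=[0, 1, 2]))
```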

The algorithm termed *vertex component analysis* (VCA) is an unsupervised technique that relies on singular value decomposition and principal component analysis as subprocedures and assumes the existence of pure pixels (Nascimento & Bioucas-Dias, 2005). In particular, VCA exploits the facts that endmembers are vertices of a simplex and that the affine transformation of a simplex is also a simplex. The algorithm iteratively projects the data onto a direction orthogonal to the subspace spanned by the endmembers already determined. The new endmember spectrum is the extreme of the projection, and the main loop continues until the specified number of endmembers has been extracted.
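The orthogonal projection idea can be sketched as shown below. This is a simplified illustration and not the published VCA algorithm: the dimensionality reduction by singular value decomposition or principal component analysis is omitted, the number of endmembers is assumed not to exceed the number of bands, and the endmembers are assumed to be linearly independent. The function names and the seed are hypothetical.

```python
import random


def dot(u, v):
    """Euclidean inner product of two vectors."""
    return sum(a * b for a, b in zip(u, v))


def vca_sketch(pixels, n_endmembers, seed=0):
    """Pick extreme pixels along random directions orthogonal to those already found."""
    rng = random.Random(seed)
    dim = len(pixels[0])
    basis = []      # orthonormal basis of the span of the endmembers found so far
    found = []      # indices of the selected pixels
    for _ in range(n_endmembers):
        # Random direction with its components along the current basis removed.
        f = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        for q in basis:
            c = dot(f, q)
            f = [a - c * b for a, b in zip(f, q)]
        # The new endmember is the pixel with the most extreme projection.
        proj = [abs(dot(f, p)) for p in pixels]
        idx = proj.index(max(proj))
        found.append(idx)
        # Extend the basis with the new endmember (Gram-Schmidt step).
        e = [float(v) for v in pixels[idx]]
        for q in basis:
            c = dot(e, q)
            e = [a - c * b for a, b in zip(e, q)]
        norm = dot(e, e) ** 0.5
        basis.append([a / norm for a in e])
    return found


# Three pure pixels followed by three mixtures of them.
pixels = [(1, 0, 0), (0, 1, 0), (0, 0, 1),
          (0.5, 0.5, 0.0), (0.2, 0.3, 0.5), (1 / 3, 1 / 3, 1 / 3)]
print(sorted(vca_sketch(pixels, 3)))
```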

The *minimum volume enclosing simplex* (MVES) algorithm is an autonomous technique, supported by a linear programming solver, that does not require the existence of pure pixels in the hyperspectral data (Chan et al, 2009). When pure pixels do exist, the MVES technique leads to a unique identification of the endmembers. In particular, dimension reduction is accomplished by affine set fitting, and Craig's unmixing criterion (Craig, 1994) is applied to formulate hyperspectral unmixing as an MVES optimization problem. The algorithm first determines the affine parameter set, then solves by linear programming an initial feasibility problem with linear convex constraints, and finally iteratively optimizes two linear programming problems with nonconvex objective functions. Notice that the algorithm requires knowing in advance the number of endmembers to be found.

#### **5. Lattice based approach for endmember extraction**

#### **5.1 Lattice algebra operations**

The use of lattice algebra in science and engineering applications, in which the usual matrix operations of addition and multiplication are replaced by corresponding lattice operations, has increased in recent years. These ideas have been applied in diverse areas, such as pattern recognition (Ritter et al, 1998), associative memories in image processing (Ritter et al, 2003; Ritter & Gader, 2006; Urcid & Valdiviezo, 2009), computational intelligence (Graña, 2008), industrial applications modeling and knowledge representation (Kaburlasos & Ritter, 2007), and hyperspectral image segmentation (Graña et al, 2009; Ritter et al, 2009; Ritter & Urcid, 2010; Valdiviezo & Urcid, 2007).

The basic numerical operations of taking the maximum or minimum of two numbers, usually denoted as the functions max(*x*, *y*) and min(*x*, *y*), will be written as binary operators using the "join" and "meet" symbols employed in lattice theory, i.e., $x \vee y = \max(x, y)$ and $x \wedge y = \min(x, y)$. We use lattice matrix operations that are defined componentwise using the underlying structure of $\mathbb{R}_{-\infty}$ or $\mathbb{R}_{\infty}$ as semirings. For example, the maximum of two matrices $X$, $Y$ of the same size $m \times n$ is defined as $(X \vee Y)_{ij} = x_{ij} \vee y_{ij}$ for $i = 1, \dots, m$ and $j = 1, \dots, n$. Inequalities between matrices are also verified componentwise; for example, $X \le Y$ if and only if $x_{ij} \le y_{ij}$. Also, the *conjugate matrix* $X^{*}$ is defined as $-X^{t}$, where $X^{t}$ denotes the usual matrix transposition. Given an $m \times p$ matrix $X$ and a $p \times n$ matrix $Y$ with entries in $\mathbb{R}$, we define a pair of dual matrix operations, named the *max-sum* and the *min-sum* and denoted, respectively, by $X \,\boxed{\vee}\, Y$ and $X \,\boxed{\wedge}\, Y$, whose $i,j$-th entries for $i = 1, \dots, m$ and $j = 1, \dots, n$ are given by $(X \,\boxed{\vee}\, Y)_{ij} = \bigvee_{k=1}^{p} (x_{ik} + y_{kj})$ and $(X \,\boxed{\wedge}\, Y)_{ij} = \bigwedge_{k=1}^{p} (x_{ik} + y_{kj})$. For $p = 1$ these lattice matrix operations reduce to the *outer sum* of two vectors $\mathbf{x} = (x_1, \dots, x_n)^t \in \mathbb{R}^n$ and $\mathbf{y} = (y_1, \dots, y_m)^t \in \mathbb{R}^m$, defined by the $m \times n$ matrix

$$\mathbf{y} \times \mathbf{x}^t = \begin{pmatrix} y_1 + x_1 & \cdots & y_1 + x_n \\ \vdots & \ddots & \vdots \\ y_m + x_1 & \cdots & y_m + x_n \end{pmatrix}. \tag{6}$$
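The lattice matrix operations defined above are straightforward to write down for matrices stored as lists of rows. The sketch below is only illustrative; the function names are hypothetical.

```python
def max_sum(X, Y):
    """Max-sum product: entry (i, j) is the maximum over k of x_ik + y_kj."""
    return [[max(X[i][k] + Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def min_sum(X, Y):
    """Min-sum product: entry (i, j) is the minimum over k of x_ik + y_kj."""
    return [[min(X[i][k] + Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def conjugate(X):
    """Conjugate matrix X* = -X^t."""
    return [[-X[i][j] for i in range(len(X))] for j in range(len(X[0]))]


def outer_sum(y, x):
    """Outer sum of Eq. (6): entry (i, j) is y_i + x_j."""
    return [[yi + xj for xj in x] for yi in y]


X = [[1, 2], [3, 0]]
Y = [[0, 5], [4, 1]]
print(max_sum(X, Y))                  # [[6, 6], [4, 8]]
print(min_sum(X, Y))                  # [[1, 3], [3, 1]]
print(outer_sum([1, 2], [10, 20, 30]))
```

For $p = 1$, calling `max_sum` or `min_sum` on a column matrix and a row matrix gives the same result as `outer_sum`, in agreement with Eq. (6).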

#### **5.2 Lattice associative memories**

Lattice based operations have been applied to pattern recognition problems as the computational model for a novel class of neural networks that are used as associative memories (Ritter et al, 1998). In general, let $(\mathbf{x}^1, \mathbf{y}^1), \dots, (\mathbf{x}^k, \mathbf{y}^k)$ be $k$ vector pairs with $\mathbf{x}^\xi = (x_1^\xi, \dots, x_n^\xi)^t \in \mathbb{R}^n$ and $\mathbf{y}^\xi = (y_1^\xi, \dots, y_m^\xi)^t \in \mathbb{R}^m$ for $\xi \in K$. Given a set of vector associations $\{(\mathbf{x}^\xi, \mathbf{y}^\xi) : \xi \in K\}$, we define a pair of associated matrices $(X, Y)$, where $X = (\mathbf{x}^1, \dots, \mathbf{x}^k)$ and $Y = (\mathbf{y}^1, \dots, \mathbf{y}^k)$, with an association given by $(\mathbf{x}^\xi, \mathbf{y}^\xi)$ for $\xi \in K$. Thus, $X$ is of dimension $n \times k$ with $i,j$-th entry $x_i^j$, and $Y$ is of dimension $m \times k$ with $i,j$-th entry $y_i^j$. Two $m \times n$ *lattice associative memories* able to store $k$ vector pairs such that, for $\xi = 1, \dots, k$, the memory recalls $\mathbf{y}^\xi$ when presented with the vector $\mathbf{x}^\xi$ are defined as follows: the *min-memory* $W_{XY}$ and the *max-memory* $M_{XY}$, both of size $m \times n$, that store a set of associations $(X, Y)$ are given by the expressions

$$W_{XY} = \bigwedge_{\xi=1}^{k} [\mathbf{y}^{\xi} \times (-\mathbf{x}^{\xi})^{t}] \quad ; \quad w_{ij} = \bigwedge_{\xi=1}^{k} (y_i^{\xi} - x_j^{\xi}), \tag{7}$$

$$M_{XY} = \bigvee_{\xi=1}^{k} [\mathbf{y}^{\xi} \times (-\mathbf{x}^{\xi})^{t}] \quad ; \quad m_{ij} = \bigvee_{\xi=1}^{k} (y_i^{\xi} - x_j^{\xi}). \tag{8}$$

The left parts of Eqs. (7) and (8) are in matrix form, while the expressions on the right give the $i,j$-th entries of the *min*-memory $W$ and the *max*-memory $M$, respectively. In this general case the memories are named *lattice hetero-associative memories* (LHAMs); if $X = Y$, we have a *lattice auto-associative memory* (LAAM), which is the case used for endmember determination. For a LAAM, the main diagonals of both memories, i.e., $w_{ii}$ and $m_{ii}$, consist entirely of zeros. Moreover, since $Y = X$, we have $M = X \,\boxed{\vee}\, X^{*} = (X^{*})^{*} \,\boxed{\vee}\, X^{*} = (X \,\boxed{\wedge}\, X^{*})^{*} = W^{*}$, where $\boxed{\vee}$ and $\boxed{\wedge}$ denote the max-sum and min-sum operations. Hence, the *min*- and *max*-memories are dual to each other in the sense of matrix conjugation, and $m_{ij} = -w_{ji}$.
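The entrywise expressions of Eqs. (7) and (8) can be computed directly, which also allows a numerical check of the zero diagonal and of the duality $m_{ij} = -w_{ji}$ in the auto-associative case. The sketch below is illustrative only; the function name and the small data matrix are hypothetical.

```python
def lattice_memories(X, Y):
    """Min-memory W_XY and max-memory M_XY of Eqs. (7) and (8).

    X is n x k and Y is m x k; column xi holds the pair (x^xi, y^xi).
    """
    n, m, k = len(X), len(Y), len(X[0])
    W = [[min(Y[i][s] - X[j][s] for s in range(k)) for j in range(n)] for i in range(m)]
    M = [[max(Y[i][s] - X[j][s] for s in range(k)) for j in range(n)] for i in range(m)]
    return W, M


# Auto-associative case (Y = X) with k = 3 vectors in R^2 stored as columns.
X = [[1, 4, 2],
     [3, 0, 5]]
W, M = lattice_memories(X, X)
print(W)   # [[0, -3], [-4, 0]]
print(M)   # [[0, 4], [3, 0]]
```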

#### **5.3 Endmember determination from LAAMs**


For a given set of vectors *X* = {**x**<sup>1</sup>, ..., **x**<sup>*k*</sup>} ⊂ **R**<sup>*n*</sup> and the corresponding matrix memories *W*<sub>XX</sub> and *M*<sub>XX</sub> computed from *X*, rewritten as *W* = {**w**<sup>1</sup>, ..., **w**<sup>*n*</sup>} and *M* = {**m**<sup>1</sup>, ..., **m**<sup>*n*</sup>} to specify their column vectors, an *n*-dimensional convex hull enclosing most if not all of the vectors in the given space can be derived. The points defining the convex hull correspond to the vertices of an *n*-simplex and can be extracted from the columns of *W* and *M*. However, the column values of LAAMs are not directly related to the original data *X*; for example, *W* usually has negative values by definition. Hence, an *additive scaling* is required to relate the column values to the data set *X*. Two scaled matrices, denoted respectively by *W̄* and *M̄*, are defined for all *i* = 1, . . . , *n* by the following expressions,

$$\overline{\mathbf{w}}^i = u_i + \mathbf{w}^i \quad ; \quad u_i = \bigvee_{\xi=1}^k x_i^{\xi} \quad ; \quad \mathbf{u} = \bigvee_{\xi=1}^k \mathbf{x}^{\xi}, \tag{9}$$

$$\overline{\mathbf{m}}^i = v_i + \mathbf{m}^i \quad ; \quad v_i = \bigwedge_{\xi=1}^k x_i^{\xi} \quad ; \quad \mathbf{v} = \bigwedge_{\xi=1}^k \mathbf{x}^{\xi}, \tag{10}$$

where **u** and **v** denote, respectively, the *maximum* and *minimum vector bounds* of *X*, whose entries are defined for all *i* = 1, . . . , *n*.
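The additive scaling of Eqs. (9) and (10) can be sketched as follows. This is a minimal example under the assumption that the data vectors are stored as the columns of a NumPy array; `scale_memories` and the toy data are hypothetical names and values introduced for illustration.

```python
import numpy as np

def scale_memories(X, W, M):
    """Additive scaling of Eqs. (9)-(10): add u_i (resp. v_i) to column i."""
    u = X.max(axis=1)                # maximum vector bound u
    v = X.min(axis=1)                # minimum vector bound v
    W_bar = W + u[np.newaxis, :]     # broadcasting adds u_i to every entry of column i
    M_bar = M + v[np.newaxis, :]     # broadcasting adds v_i to every entry of column i
    return W_bar, M_bar, u, v

# Hypothetical toy data: k = 3 patterns in R^2, stored as columns.
X = np.array([[1.0, 4.0, 2.0],
              [3.0, 5.0, 1.0]])

# Auto-associative memories, w_ij = min_xi (x_i - x_j) and m_ij = max_xi (x_i - x_j).
diff = X[:, None, :] - X[None, :, :]
W, M = diff.min(axis=2), diff.max(axis=2)

W_bar, M_bar, u, v = scale_memories(X, W, M)
```

Because the unscaled memories have zero diagonals, the diagonal of the scaled min-memory equals **u** and the diagonal of the scaled max-memory equals **v** in this example.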

Once the columns of *W* and *M* have been scaled, a fundamental result of this method is that the set of points *M̄* ∪ *W̄* ∪ {**u**, **v**} forms a *convex polytope* B with 2(*n* + 1) vertices that contains *X*. Points used as endmembers must satisfy the affine independence condition, and any subset that does so can be used. As proven in (Ritter & Urcid, 2010), the following theorems establish sufficient conditions for extracting two subsets, *W̄*′ and *M̄*′, of affinely independent vectors from the columns of the scaled matrices *W̄* and *M̄*. The first theorem provides four equivalent conditions that furnish a computationally simple test for the affine independence of the sets *W̄* and *M̄*; the symbols **w**<sup>*i*</sup> and **m**<sup>*i*</sup> denote the *i*-th row of *W* and *M*, respectively; also, **c** = (*c*, ..., *c*) denotes a constant vector.

Theorem 1. If *i*, *j* ∈ {1, . . . , *n*}, then the following statements are equivalent: (1) **w**<sup>*i*</sup> − **w**<sup>*j*</sup> = **c**, (2) **w̄**<sup>*i*</sup> = **w̄**<sup>*j*</sup>, (3) **m**<sup>*i*</sup> − **m**<sup>*j*</sup> = **c**, and (4) **m̄**<sup>*i*</sup> = **m̄**<sup>*j*</sup>.

An important consequence of Theorem 1 is that, to verify that the scaled matrix *W̄* or *M̄* is affinely independent, one only needs to check that no two vectors of *W̄* or *M̄* are identical. The next theorem provides a simple method for deriving a set of affinely independent vectors from *W̄* and *M̄*. In this notation, *J*′ denotes an arbitrary non-empty subset of *J*.

Theorem 2. *W̄*′ ⊂ *W̄* is *affinely independent* if and only if **w̄**<sup>*i*</sup> ≠ **w̄**<sup>*j*</sup> for all distinct pairs {*i*, *j*} ⊂ *J*′. Similarly, *M̄*′ ⊂ *M̄* is *affinely independent* if and only if **m̄**<sup>*i*</sup> ≠ **m̄**<sup>*j*</sup> for all distinct pairs {*i*, *j*} ⊂ *J*′.
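Since Theorem 2 reduces the affine independence check to a comparison of column vectors, the test can be sketched in a few lines. The helper `distinct_columns`, its `tol` parameter, and the small matrix below are assumptions for illustration; the original text does not prescribe an implementation.

```python
import numpy as np

def distinct_columns(A, tol=0.0):
    """Indices of the columns of A kept after dropping repeated columns.

    A column is dropped when it coincides (within tol) with a column
    that was already kept, so the first occurrence always survives.
    """
    kept = []
    for i in range(A.shape[1]):
        if all(np.max(np.abs(A[:, i] - A[:, j])) > tol for j in kept):
            kept.append(i)
    return kept

# Hypothetical matrix whose third column repeats the first one.
A = np.array([[4.0, 3.0, 4.0],
              [3.0, 5.0, 3.0]])
kept = distinct_columns(A)
```

With `tol=0.0` only exactly identical columns are merged; a small positive tolerance may be a reasonable choice for floating-point reflectance data.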

In the next section we will use this method to derive affinely independent sets from the scaled point set *M̄* ∪ *W̄* ∪ {**u**, **v**} as endmembers of particular data sets *X*.

#### **6. Identification of endmembers: application examples**

The validity of the convex set representation for endmember identification, discussed in the previous sections, can be illustrated through experiments on real hyperspectral data sets. The aim of the application examples is to give enough detail on the use of a novel endmember determination technique. In particular, the lattice auto-associative memories *W*<sub>XX</sub> and *M*<sub>XX</sub> have been shown to provide an efficient procedure for autonomous endmember determination, from which a subset of final endmembers can be selected to accomplish hyperspectral image segmentation. As a complement to the theoretical results given before, the endmember set output by the VCA algorithm is presented and compared with the set obtained with the LAAMs method. At the end of this section, we present composite abundance maps generated by estimating the endmember proportions with constrained linear unmixing on each hyperspectral scene.

The following data sets were taken from SpecTir's extensive hyperspectral baseline environmental dataset (SpecTir, 2009). The available information about the image acquisition indicates that a VNIR-SWIR hyperspectral instrument, covering a wavelength range from 0.395 to 2.45 *μ*m, was used to collect the images. Each scene has a spectral resolution of 5 nm and 360 spectral bands. Thus, a single hyperspectral cube consists of 600 lines × 320 pixels × 360 bands (about 132 Mbytes). Given the high spectral resolution of a hyperspectral image, a common practice to avoid redundant information is to reduce the spectral dimensionality of the data cube with a chosen technique, such as principal component analysis, minimum noise fraction transform, or adjacent band removal of highly correlated bands (Keshava, 2003). These reductions are often necessary to eliminate undesirable effects produced during the acquisition process and to reduce computational requirements. Hence, in the hyperspectral cubes used for this simulation, the number of spectral bands was reduced to 90 by selecting spectral bands at intervals of 20 nm covering the same wavelength range. This spectral reduction speeds up the computation with no significant effect on the endmember identification task.
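The band reduction and the arrangement of the cube into the data matrix *X* can be sketched as below. The sketch assumes that the 20 nm selection corresponds to keeping every fourth band of the 5 nm sampling and that samples are stored as 16-bit integers; both are assumptions, as are the function names and the small stand-in cube used in place of a real scene.

```python
import numpy as np

def reduce_bands(cube, step=4):
    """Keep every `step`-th band along the spectral (last) axis."""
    return cube[:, :, ::step]

def cube_to_matrix(cube):
    """Arrange a lines x pixels x bands cube as an n x k matrix (one column per pixel)."""
    lines, pixels, bands = cube.shape
    return cube.reshape(lines * pixels, bands).T

# Small stand-in cube (6 lines x 4 pixels x 360 bands) instead of a 600 x 320 scene.
cube = np.arange(6 * 4 * 360, dtype=np.int16).reshape(6, 4, 360)
reduced = reduce_bands(cube)     # 360 bands -> 90 bands
X = cube_to_matrix(reduced)      # shape (90, 24)

# Storage of a full 600 x 320 x 360 scene at 2 bytes per sample, in MiB.
full_scene_mib = 600 * 320 * 360 * 2 / 2**20
```

Under the 16-bit assumption the full scene occupies roughly 132 MiB, which agrees with the size quoted in the text.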

#### **Example 1. Gulf of Mexico wetland sample**

This hyperspectral cube was registered over the *Lower Suwannee National Wildlife Refuge*, located on the north coast of the Gulf of Mexico in the USA. The Refuge hosts one of the largest undeveloped river-delta estuarine systems in the nation. Some of the numerous wildlife species that inhabit the zone are swallow-tailed kites, bald eagles, West Indian manatees, Gulf sturgeon, white-tailed deer, and eastern wild turkeys. Natural salt marshes, tidal flats, bottomland hardwood swamps, and pine forests provide habitat for thousands of creatures. This particular hyperspectral data set was acquired at a spatial resolution of 4 m<sup>2</sup>, covering tidal wetlands and multiple national wildlife reserves during the period of May to June 2010. The images captured the state of vegetation at the time of the flights and can be used to locate the presence or absence of hydrocarbons on the surface of vegetation. Also, since the data were acquired prior to the oil disaster (which occurred in 2010), they can be compared to images from later flights to assist in damage assessments. Figure 3 shows two color composite images of the Gulf Coast hyperspectral scene. The left part of the figure was formed by combining bands 54 (red, *λ* = 693 nm), 34 (green, *λ* = 584 nm) and 14 (blue, *λ* = 469.5 nm), giving the appearance of a true color image; the right part was formed with bands 81 (red, *λ* = 851 nm), 217 (green, *λ* = 1631 nm) and 54 (blue, *λ* = 693 nm), whose combination using two infrared bands highlights the vegetation areas in green, orange and brown colors.

Fig. 3. Color images of the Gulf Coast hyperspectral scene used for example 1. Left: image formed by combining bands 54 (red), 34 (green) and 14 (blue). Right: combination of bands 81 (red), 217 (green), and 54 (blue).

For the endmember determination process, we first form the set *X* = {**x**<sup>1</sup>, ..., **x**<sup>*k*</sup>} ⊂ **R**<sup>*n*</sup>, where *k* = 600 × 320 = 192,000 and *n* = 90, comprising all the spectral vectors of the scene. The second step is the computation of the memories *W*<sub>XX</sub> and *M*<sub>XX</sub> from *X* with Eqs. (7) and (8). Using the vectors **u** and **v** calculated from Eqs. (9) and (10), the columns of *W* and *M* are then scaled to obtain the scaled matrices *W̄* and *M̄*. To determine a subset of affinely independent vectors, it is necessary to verify that no two columns of *W̄* or *M̄* are equal. For the application in hand, the resulting *W̄* and *M̄* each consist of 90 affinely independent columns and, therefore, each provides 90 "candidate" endmembers. In addition, because of the additive scaling previously performed, the column vectors of *W̄* present an "upward spike" since *w̄<sub>ii</sub>* = *u<sub>i</sub>*, and the column vectors of *M̄* present a "downward spike" since *m̄<sub>ii</sub>* = *v<sub>i</sub>*. It is then necessary to apply a simple smoothing procedure that uses the nearest one or two spectral samples next to *w̄<sub>ii</sub>* or *m̄<sub>ii</sub>*, given, for any *i* ∈ {1, . . . , *n*}, by

$$z_{ii} = \begin{cases} z_{2,1} & \Leftrightarrow \ i = 1, \\ \frac{1}{2}(z_{i-1,i} + z_{i+1,i}) & \Leftrightarrow \ 1 < i < n, \\ z_{n-1,n} & \Leftrightarrow \ i = n, \end{cases} \tag{11}$$

where *z* stands for the entries of either scaled memory (*w̄* or *m̄*). Notice that the LAAMs method always gives a number of candidate endmembers that is equal to or slightly less than the spectral dimensionality. In practice, contiguous columns are highly correlated, so some technique is needed to discard most of these potential endmembers. For example, minimum mutual information has been used to obtain a final set of endmembers (Graña et al, 2007); a matrix of linear correlation coefficients followed by a threshold process to get a subset of selected endmember pairs with low correlation coefficients is introduced in (Ritter & Urcid, 2010). Here we use a simpler technique based on the fact that the LAAMs-based method forms ⌊√*n* + 1⌋ subsets, each with ⌊√*n* + 1⌋ column vectors taken from *W̄* (respectively *M̄*); then, a representative from each group is selected as an endmember. Although this technique provides a reasonable number of approximate true endmembers, in practical situations where the hyperspectral scene comprises a reduced number of materials, a final selection is needed that keeps those spectra that are spectrally different from the others. Therefore, in this application example, from the 20 endmember candidates derived from *W̄* ∪ *M̄*, a final selection of uncorrelated endmembers provided a reduced set of 5 spectral vectors that form the columns of *S*; thus, *S* = {**w**<sup>2</sup>, **w**<sup>24</sup>, **w**<sup>43</sup>, **w**<sup>54</sup>, **w**<sup>79</sup>}.
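The spike removal of Eq. (11) can be sketched as follows. The sketch assumes that each diagonal entry is replaced using its neighbours within the same column, so that the first diagonal entry takes the value directly below it and the last one takes the value directly above it; `smooth_diagonal` and the test matrix are hypothetical. The last line evaluates the group count for *n* = 90, reading the bracket in the text as a floor.

```python
import math
import numpy as np

def smooth_diagonal(Z):
    """Replace each diagonal entry z_ii by the mean of its neighbours in column i."""
    Z = np.asarray(Z, dtype=float).copy()
    n = Z.shape[0]                       # assumes a square matrix with n >= 2
    for i in range(n):
        if i == 0:
            Z[i, i] = Z[1, 0]            # only one neighbour, the entry below
        elif i == n - 1:
            Z[i, i] = Z[n - 2, n - 1]    # only one neighbour, the entry above
        else:
            Z[i, i] = 0.5 * (Z[i - 1, i] + Z[i + 1, i])
    return Z

# Hypothetical 3 x 3 matrix with artificial spikes (9.0) on the diagonal.
Z = np.array([[9.0, 1.0, 2.0],
              [3.0, 9.0, 4.0],
              [5.0, 6.0, 9.0]])
Z_smooth = smooth_diagonal(Z)

# Number of groups per memory for n = 90 bands.
n_groups = math.floor(math.sqrt(90) + 1)
```

For *n* = 90 the group count is 10 per memory, which is consistent with the 20 candidates drawn from the two memories in the text.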

On the other hand, the VCA algorithm was applied to the same hyperspectral data set. According to the implementation, the algorithm requires the number of endmembers to be determined as an input parameter; the corresponding output includes the endmember spectra as well as the pixel positions in the image from which they were extracted. Repeated runs specifying the same number of endmembers produce almost the same output set, with differences in the order in which the endmembers appear. After testing different input values, such as 5, 7, 9, and 10, we decided to use the number of endmembers determined with the LAAMs method as the input parameter to the VCA algorithm. Hence, the set *S* of endmembers identified with VCA consists of *S* = {**x**<sup>27876</sup>, **x**<sup>90661</sup>, **x**<sup>97850</sup>, **x**<sup>84588</sup>, **x**<sup>191634</sup>}. Figure 4 displays three endmember spectra selected from the columns of the set *W* ∪ *M*, whose spectral curves correspond to natural resources in the hyperspectral scene of the Gulf Coast. Similarly, Figure 5 shows three endmember spectra obtained with the VCA algorithm. In both cases, the reflectance data values are linearly scaled from the range [0, 6000] to the unit interval [0, 1]. Finally, observe that there is a similarity between spectral curves, which is indicated by curves drawn with the same colors.

#### **Example 2. Beltsville area**

The following example was performed using a hyperspectral cube registered over the Beltsville area, located in northern Prince George's County in Maryland, USA. The area includes agriculture and vegetation samples. As with the previous image, the data cube used for this experiment is of size 600 × 320 × 90, covering approximately the same wavelength interval. The left part of Figure 6 displays a color composite image formed by combining bands 54 (red, *λ* = 693 nm), 34 (green, *λ* = 583.9 nm), and 14 (blue, *λ* = 469.5 nm), simulating a true color image; the right part of the same figure shows a combination of bands 81 (red, *λ* = 851 nm), 54 (green, *λ* = 693 nm), and 34 (blue, *λ* = 583.9 nm) that emphasizes the vegetation areas in red tones. The set of all spectral vectors of the image is *X* = {**x**<sup>1</sup>, ..., **x**<sup>*k*</sup>}, where *k* = 600 × 320 = 192,000 and *n* = 90. Following the same procedure described in the previous example for endmember determination, we compute the memories *W*<sub>XX</sub> and *M*<sub>XX</sub> = −*W*<sup>*t*</sup><sub>XX</sub>, as well as the vector bounds **u** and **v** used to obtain the scaled matrices *W̄* and *M̄*. Once the affine independence condition is checked, the resulting scaled memories are of size 90 × 90. According to the previous discussion, the spike effects generated in the diagonal of both memories are removed using Eq. (11), and a selection of 20 endmember candidates is made from the set *W̄* ∪ *M̄*. Finally, spectrally different column vectors are selected to form the final set of endmembers, whose column vectors are defined by *S* = {**w**<sup>24</sup>, **w**<sup>37</sup>, **w**<sup>47</sup>, **w**<sup>64</sup>, **m**<sup>46</sup>, **m**<sup>57</sup>, **m**<sup>62</sup>}. The matrix *S* will then be used in Eq. (3) to estimate the fractional abundance of each endmember.

Fig. 4. Three endmember spectra determined by applying the LAAMs method to the hyperspectral cube of the Gulf of Mexico. The associated column vectors selected from *W* ∪ *M* are: **w**<sup>2</sup>, **w**<sup>24</sup>, **w**<sup>54</sup>.

Fig. 5. Three endmember spectra obtained by applying the VCA algorithm to the hyperspectral cube of the Gulf of Mexico. The column vectors **s**<sup>*j*</sup> for *j* = 1, . . . , 5 indicate the corresponding column of the *S* matrix.

The VCA algorithm was applied to the set *X* containing all the spectral vectors of the image. As in the previous example, the number of column vectors selected with the LAAMs method was used as the input parameter of the algorithm. The resulting endmember spectra determined by VCA were used to form the matrix *S* = {**s**<sup>1</sup>, ..., **s**<sup>7</sup>}, whose spectra are associated with the column vectors {**x**<sup>191845</sup>, **x**<sup>191419</sup>, **x**<sup>9630</sup>, **x**<sup>111446</sup>, **x**<sup>114301</sup>, **x**<sup>191724</sup>, **x**<sup>65969</sup>}. Figure 7 displays four of the final endmembers obtained by selecting vectors from *W* ∪ *M*; Figure 8 shows four of the endmember spectra determined by the VCA algorithm that are similar to those computed with the LAAMs method. The similarity between spectral curves can be identified by curves drawn in the same color. Although the spectral curves in both sets appear alike, a similarity measure must be applied to quantify these similarities. Here, we have computed the correlation coefficients between the sets obtained with the LAAMs method and the VCA algorithm for each of the application examples. Table 1 presents the correlation coefficients computed for the spectral curves displayed in Figures 4 and 5 and in Figures 7 and 8, respectively.
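The correlation-based comparison between two endmember sets can be sketched with NumPy's `corrcoef`. The pairing rule used here (each spectrum of one set is matched to its most correlated spectrum in the other set) is an assumption made for illustration, as are the helper name `best_matches` and the synthetic spectra; the text does not state how the pairs were chosen.

```python
import numpy as np

def best_matches(S_a, S_b):
    """For each column of S_a, return the index of the most correlated
    column of S_b together with the corresponding correlation coefficient."""
    p = S_a.shape[1]
    # corrcoef treats rows as variables; the upper-right block holds the
    # cross-correlations between the columns of S_a and the columns of S_b.
    C = np.corrcoef(S_a.T, S_b.T)[:p, p:]
    idx = C.argmax(axis=1)
    return idx, C[np.arange(p), idx]

# Synthetic stand-in spectra over 50 bands (not real endmembers).
bands = np.linspace(0.0, 1.0, 50)
a, b = np.sin(3.0 * bands), bands ** 2
S_laam = np.column_stack([a, b])
S_vca = np.column_stack([2.0 * b + 0.1, 0.5 * a])   # rescaled copies in swapped order

idx, coef = best_matches(S_vca, S_laam)
```

The correlation coefficient is invariant to positive linear rescaling, so normalizing reflectance values to the unit interval does not change the coefficients.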


| Gulf of Mexico: VCA & LAAMs | Corr. Coef. | Beltsville: VCA & LAAMs | Corr. Coef. |
|---|---|---|---|
| **s**<sup>2</sup> and **w**<sup>24</sup> | 0.980 | **s**<sup>1</sup> and **w**<sup>37</sup> | 0.965 |
| **s**<sup>3</sup> and **w**<sup>54</sup> | 0.974 | **s**<sup>2</sup> and **w**<sup>24</sup> | 0.944 |
| **s**<sup>4</sup> and **w**<sup>2</sup> | 0.641 | **s**<sup>3</sup> and **w**<sup>64</sup> | 0.912 |
| − | − | **s**<sup>4</sup> and **w**<sup>47</sup> | 0.939 |

Table 1. Correlation coefficients for similar spectra obtained with the LAAMs method and the VCA algorithm from the hyperspectral images of the Gulf of Mexico and Beltsville.

Fig. 6. Color images of the Beltsville hyperspectral scene used for example 2. Left: image formed by combining bands 54 (red), 34 (green) and 14 (blue). Right: combination of bands 81 (red), 54 (green), and 34 (blue).

#### **6.1 Constrained linear unmixing**

The spectral unmixing process can be carried out by means of the inversion expressed in Eq. (5), subject to the restrictions of full additivity and non-negativity of the abundance coefficients. Notice that Eq. (3) is an *overdetermined* system of linear equations, since *n* > *p*. For the examples discussed here, both matrices *W* and *M* have full rank, thus their column vectors are linearly independent. In addition, the set of final endmembers, whether determined with the LAAMs method or identified by the VCA algorithm, is a linearly independent set whose pseudoinverse matrix is unique. Although the unconstrained solution corresponding to Eq. (5), where *n* > *p* (*n* = 90 and *p* = 7 or *p* = 5), is unique, for many pixel spectra some coefficients may be negative and the coefficients may not sum to unity. If full additivity is enforced, negative coefficients appear. Therefore, the best approach consists of imposing non-negativity on the abundance proportions and relaxing full additivity by considering the inequality ∑<sub>*i*=1</sub><sup>*p*</sup> *a<sub>i</sub>* < 1. For the examples presented here we use the *non-negative least squares* (NNLS) algorithm, which solves the problem of minimizing the Euclidean norm ‖*S***a** − **x**‖<sub>2</sub> subject to the condition **a** ≥ 0 (Lawson & Hanson, 1974).
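The non-negative unmixing step can be sketched with SciPy's `nnls` routine, applied pixel by pixel. This is not the Matlab implementation used for the experiments; the function `unmix`, the random stand-in endmember matrix, and the synthetic abundances are assumptions made only to show the shape of the computation.

```python
import numpy as np
from scipy.optimize import nnls

def unmix(S, X):
    """Non-negative abundances for each pixel spectrum (each column of X).

    S is n x p (one endmember per column), X is n x k (one pixel per column).
    Returns a p x k matrix A whose columns minimize ||S a - x||_2 with a >= 0.
    """
    p, k = S.shape[1], X.shape[1]
    A = np.empty((p, k))
    for j in range(k):
        A[:, j], _ = nnls(S, X[:, j])   # second return value is the residual norm
    return A

# Synthetic stand-in: n = 90 bands, p = 5 endmembers, k = 2 noise-free pixels.
rng = np.random.default_rng(0)
S = rng.uniform(0.1, 1.0, size=(90, 5))
A_true = np.array([[0.6, 0.0],
                   [0.0, 0.3],
                   [0.4, 0.0],
                   [0.0, 0.7],
                   [0.0, 0.0]])
X = S @ A_true
A = unmix(S, X)
```

For noise-free mixtures with non-negative abundances, as in this synthetic case, the recovered matrix should agree with `A_true` up to numerical precision. For a full scene with 192,000 pixels the loop is simple but slow, and a batched or parallel variant may be worth considering.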

Figure 9 displays the color abundance maps of the endmembers determined with each of the methods discussed here. These maps were generated using the NNLS numerical method implemented in Matlab 7.6; in these images, brighter areas represent the maximum distribution of the corresponding endmember. The left part of Figure 9 shows the distribution of four natural resources determined by applying the LAAMs method to the hyperspectral cube of the Gulf Coast. The right part of the same figure displays the distribution of three natural resources determined using the VCA algorithm. In both cases we present only the abundance maps that provide meaningful information; thus, maps presenting redundant information or predominantly dark areas were not included. Although the region is characterized by the presence of wetlands, it has not been possible to use a set of reference spectra to identify the natural resources.

Furthermore, Figure 10 displays the color abundance maps of the endmembers determined with the LAAMs method (left part) and with the VCA algorithm (right part) from the hyperspectral data cube of Beltsville. Although the set *S* comprises seven endmember spectra, we have included in both cases the abundance maps that best match the distribution of vegetation according to a visual inspection of Figure 6. In this example, it is evident that the identification of vegetation types produced by the two methods gives similar results for the segmentations colored in yellow and magenta. However, the results present important differences, particularly in the green and blue segmentations. These differences are mainly caused by the endmember search procedure used in each technique; in addition, the fact that the spectral curves of vegetation types are alike, varying only in certain absorption bands, contributes to the disagreements in these segmentation results.

Fig. 7. Four endmember spectra determined by applying the LAAMs method to the hyperspectral cube of Beltsville. The associated column vectors selected from *W* ∪ *M* are: **w**<sub>24</sub>, **w**<sub>37</sub>, **w**<sub>47</sub>, **w**<sub>64</sub>.

Fig. 8. Four endmember spectra obtained by applying the VCA algorithm to the hyperspectral cube of Beltsville. The column vectors **s**<sub>*j*</sub> for *j* = 1, . . . , 7 indicate the corresponding column of the *S* matrix.

Fig. 9. Color abundance maps of natural resources determined with the autonomous identification of endmembers in the hyperspectral cube of the Gulf Coast. Left: abundances of four endmembers determined with the LAAMs method, with colors assigned as yellow = **w**<sub>2</sub>, magenta = **w**<sub>24</sub>, green = **w**<sub>43</sub>, blue = **w**<sub>54</sub>. Right: abundances of three endmembers determined with the VCA algorithm, with colors assigned as yellow = **s**<sub>4</sub>, magenta = **s**<sub>2</sub>, green = **s**<sub>3</sub>. Brighter areas mean higher distributions of the corresponding natural resource.

Fig. 10. Color abundance maps of vegetation types obtained with the autonomous identification of endmembers in the Beltsville hyperspectral image. Left: abundances of four endmembers determined with the LAAMs method, with colors assigned as magenta = **w**<sub>24</sub>, yellow = **w**<sub>67</sub>, blue = **m**<sub>46</sub>, green = **m**<sub>62</sub>. Right: abundances of four endmembers determined with the VCA algorithm, with colors assigned as magenta = **s**<sub>2</sub>, yellow = **s**<sub>3</sub>, blue = **s**<sub>5</sub>, green = **s**<sub>7</sub>. Brighter areas correspond to higher distributions of the corresponding natural resource.

#### **7. Conclusion**

The use of high resolution imaging spectrometers for Earth observation purposes has given rise to different applications oriented toward the identification, classification and monitoring of natural resources from remotely sensed data. In this chapter we have described the physical foundations behind the acquisition and calibration of hyperspectral imagery that constitute the basis of modern hyperspectral instruments, such as AVIRIS, HYDICE, and SpecTir's imaging spectrometers. We have also reviewed past and recent methods for autonomous endmember determination based on the geometry of convex sets. The mathematical foundation behind these methods is to model the spectral mixture acquired at each pixel as a linear combination of constituent materials. Hence, the aim of these techniques is to determine the constituent materials, which are identified as the purest pixels in the scene. Among the methods discussed, we have emphasized a lattice algebra based method that uses two canonical associative memories, the min-*W*<sub>XX</sub> and the max-*M*<sub>XX</sub>, to determine a 2(*n* + 1)-simplex enclosing the hyperspectral data set. Thus, any subset of vertices of the simplex can be used as the endmember set to perform the unmixing process. The application of the LAAMs method and the VCA algorithm to the autonomous segmentation of real hyperspectral scenes acquired with SpecTir's imaging spectrometer has shown the effectiveness of convex set approaches. Although there are some differences between the results obtained with the two methods, either of them can be used for unsupervised hyperspectral segmentation, in particular when no reference data of the area are available, as in the example cases treated in this chapter.

#### **8. Acknowledgments**

The authors thank SpecTir for providing the data sets used in these experiments. Juan C. Valdiviezo-N thanks the National Council of Science and Technology (CONACYT) for doctoral scholarship # 175027. Gonzalo Urcid is grateful to the National Research System (SNI-CONACYT) for partial financial support through grant # 22036.

#### **9. References**



Jensen, J.R. (2007). *Remote Sensing of the Environment: an Earth Resource Perspective*, 2nd edition, Pearson Prentice Hall.

Kaburlasos, V.G. & Ritter, G.X. (eds.) (2007). *Computational Intelligence based on Lattice Theory*, Vol. 67, Springer Verlag, Heidelberg, Germany.

Keshava, N. (2003). A survey of spectral unmixing algorithms, *Lincoln Laboratory Journal*, Vol. 14, No. 1, pp. 55–78.

Keshava, N. & Mustard, J.F. (2002). Spectral unmixing, *IEEE Signal Processing Magazine*, Vol. 19, No. 1, January 2002, pp. 44–57.

Lawson, C.L. & Hanson, R.J. (1974). *Solving least squares problems*, chap. 23, Prentice-Hall, Englewood Cliffs NJ.

Lay, S.R. (2007). *Convex Sets and Their Applications*, Dover Publications, New York, USA.

More, K.A. (2005). Spectrometers, In: *Encyclopedia of Modern Optics*, Vol. 1, Robert D. Guenther editor, pp. 324–336, Elsevier, Academic Press.

Nascimento, J.M.P. & Bioucas-Dias, J.M. (2005). Vertex component analysis: a fast algorithm to unmix hyperspectral data, *IEEE Trans. on Geoscience and Remote Sensing*, Vol. 43, No. 4, April 2005, pp. 898–910.

Ritter, G.X., Sussner, P. & Díaz de León, J.L. (1998). Morphological associative memories, *IEEE Trans. Neural Networks*, Vol. 9, No. 2, March 1998, pp. 281–293.

Ritter, G.X., Urcid, G. & Iancu, L. (2003). Reconstruction of noisy patterns using morphological associative memories, *Journal of Mathematical Imaging and Vision*, Vol. 19, No. 5, pp. 95–111.

Ritter, G.X. & Gader, P. (2006). Fixed points of lattice transforms and lattice associative memories. In: *Advances in Imaging and Electron Physics*, Vol. 144, P. Hawkes editor, pp. 165–242, Elsevier, San Diego, CA.

Ritter, G.X., Urcid, G. & Schmalz, M.S. (2009). Autonomous single-pass endmember approximation using lattice auto-associative memories, *Neurocomputing*, Vol. 72, Issues 10-12, June 2009, pp. 2101–2110.

Ritter, G.X. & Urcid, G. Lattice algebra approach to endmember determination in hyperspectral imagery. In: *Advances in Imaging and Electron Physics*, Vol. 160, Peter W. Hawkes editor, pp. 113–169, Elsevier Inc, Academic Press.

SpecTir (2009). SpecTir: end to end hyperspectral solutions, website: www.spectir.com.

Urcid, G. & Valdiviezo-N., J.C. (2007). Generation of lattice independent vector sets for pattern recognition applications, *SPIE Proceedings, Mathematics of Data/Image Pattern Recognition, Compression, Coding, and Encryption X with Applications*, Vol. 6700, pp. 67000C:1–12, San Diego, CA, August 2007, SPIE Press.

Urcid, G. & Valdiviezo, J.C. (2009). Color image segmentation based on lattice auto-associative memories, *Proceeding of IASTED, Artificial intelligence and soft computing*, pp. 166–173, Palma de Mallorca, Spain, September 2009, Acta Press.

Valdiviezo, J.C. & Urcid, G. (2007). Hyperspectral endmember detection based on strong lattice independence, *Proc. SPIE, Applications of Digital Image Processing XXX*, Vol. 6696, pp. 669625:1–12, San Diego, CA, USA, August 2007, SPIE Press.

Winter, M.E. (1999). N-FINDR: an algorithm for fast autonomous spectral endmember determination in hyperspectral data, *Proceedings of SPIE, Imaging Spectrometry V*, Vol. 3753, pp. 266–275, Denver, CO, USA, July 1999, SPIE Press.

Winter, M.E. (2000). Comparison of approaches for determining end-members in hyperspectral data, *Proceedings of IEEE: Aerospace Conference*, Vol. 3, pp. 305–313, Big Sky, MT, USA, March 2000, IEEE Press.

## **Part 3**

**Remote Sensing and GIS in Earth Observation Applications**

### **Forest Fires and Remote Sensing**

Abel Calle and José Luis Casanova *University of Valladolid, Spain* 

#### **1. Introduction**

The use of remote sensing techniques for the study of forest fires began several years ago, and its possibilities have grown as new sensors were incorporated into international Earth observation programmes and as improved techniques allowed new goals to be reached. Three main topics can be distinguished in which remote sensing provides results that can be applied directly to forest fires: the risk of fire spreading; the detection of hot spots and the establishment of fire thermal parameters; and, finally, the cartography of affected areas. In recent years, two other important topics have attracted increasing interest: the first is the estimation of severity, related to the post-fire phase, and the other is the atmospheric impact of fire emissions.

With respect to the risk of fires, remote sensing has provided very valuable results in real time, which was the required aim. However, in order to predict the occurrence of fires, it is necessary to incorporate indicators of very heterogeneous types, some of which fall outside the field of Earth observation studies; indicators related to the economy, social and human activities or historical statistics, among others, should be taken into account. That is why remote sensing must be restricted to a very limited aspect: it is only suitable for estimating the spreading risk related to vegetation dryness and surface temperature. The main magnitude used as an indicator is the vegetation index, above all the NDVI (Normalized Difference Vegetation Index). The first results in the estimation of fire risk, although not in real time, were obtained from the satellites of the NOAA (National Oceanic and Atmospheric Administration) series by means of the AVHRR (Advanced Very High Resolution Radiometer) sensor. Later on, further indicators derived from the same sensors were incorporated to improve the algorithms and to include information on meteorological conditions, such as the surface temperature obtained from satellites. The combination of the NDVI with the surface temperature has given rise to a mixed index in which the slope of the linear regression between both magnitudes, established over cells of terrain, shows a good correlation with vegetation evapotranspiration and water stress (Nemani & Running, 1989). The slope of this relation has been incorporated into different algorithms by different authors in order to establish another risk indicator (Illera *et al*., 1996); thus, Casanova *et al*. (1998) introduced it for real-time operational use in Mediterranean countries.

The possibility of using spectral information in the middle infrared, in the 1.6 µm region, has led to the introduction of other indicators related to fuel moisture, since the reflectivity of vegetation in this wavelength interval is strongly influenced by its water content. Hunt & Rock (1989) suggested a new vegetation index, similar in form to the NDVI but built from the reflectance in the near infrared and the reflectance in the 1.6 µm region, as an indicator of fuel moisture. At first, this index could only be applied to the Landsat-TM (Thematic Mapper) sensor for the creation of fuel maps (Chuvieco *et al.*, 2002). Today, it can be computed in real time from the AVHRR and MODIS (Moderate Resolution Imaging Spectroradiometer) sensors and incorporated into the risk maps as a new indicator.
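The indicators mentioned above can be illustrated with a short sketch. The NDVI is the usual normalized difference of near-infrared and red reflectances; for the moisture index, the text only says that it is "similar in form to the NDVI", so the normalized difference of the near-infrared and 1.6 µm reflectances used below is an assumption, as are the reflectance values and the synthetic temperature relation. The slope of surface temperature against NDVI is obtained with an ordinary least-squares fit over the pixels of one terrain cell.

```python
import numpy as np


def normalized_difference(band_a, band_b):
    """Compute (a - b) / (a + b) per pixel; pixels with a + b == 0 are set to 0."""
    a = np.asarray(band_a, dtype=float)
    b = np.asarray(band_b, dtype=float)
    denom = a + b
    out = np.zeros_like(denom)
    np.divide(a - b, denom, out=out, where=denom != 0)
    return out


def ts_ndvi_slope(ndvi, surface_temp):
    """Least-squares slope of surface temperature against NDVI within one terrain cell."""
    slope, _intercept = np.polyfit(np.ravel(ndvi), np.ravel(surface_temp), 1)
    return slope


# Hypothetical reflectances for three pixels: dense vegetation, dry vegetation, bare soil.
red = np.array([0.05, 0.12, 0.25])
nir = np.array([0.45, 0.30, 0.30])
swir_1_6um = np.array([0.15, 0.28, 0.35])

ndvi = normalized_difference(nir, red)
moisture_index = normalized_difference(nir, swir_1_6um)  # assumed NDVI-like form

# Synthetic surface temperatures (K) that decrease linearly with NDVI.
surface_temp = 320.0 - 30.0 * ndvi
slope = ts_ndvi_slope(ndvi, surface_temp)

print("NDVI:", np.round(ndvi, 3))
print("moisture index:", np.round(moisture_index, 3))
print("Ts/NDVI slope:", round(float(slope), 3))
```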

The detection of hot spots and, together with it, the establishment of fire parameters is the most complex of the tasks presented here, owing to the orbital configuration of current spacecraft. The methodologies are very clear from the point of view of physics, but the restrictions of current sensors make it difficult to obtain quality results. Detection is understood as the task of determining the location of a hot spot independently of its size. Monitoring is understood as the establishment of the most important fire parameters with a view to obtaining relevant information on the phenomenon. Among these parameters are the fire's temperature, the area occupied by the fire, the energy intensity and, when the sensor's capacity allows it, the position of the advancing fire line. To place this subject in its appropriate context, it must be pointed out that fire detection aimed at raising alarms that facilitate rapid extinction is a need that has not yet been fully met. Despite its limitations, the NOAA-AVHRR sensor has been the most important for fire detection and has provided a benchmark for subsequent sensors. An excellent review of the algorithms used on AVHRR can be found in Li *et al*. (2001). The European sensor (A)ATSR (Advanced Along Track Scanning Radiometer) and the World Fire Atlas from 1997, published by ESA with data from the ERS-1 and ERS-2 (European Remote Sensing Satellite) satellites (Arino & Rosaz, 1999), have been used to demonstrate the suitability of this sensor for fire detection and for the assessment of vegetation fire emissions. The appearance of the MODIS sensor heralded a significant step forward in the observation of forest fires (Giglio *et al.*, 2003) and, at present, the MODIS fire product is a consolidated product and a reference for global Earth observation.

Fire products have been identified as an important input for global change analysis; however, although the radiometric availability is satisfactory, the main problem for real-time operation is the time resolution. The detection of high-temperature events from geostationary satellites has therefore been approached from a different perspective. Improvements in the sensors have made it possible to use geostationary satellites beyond their meteorological capabilities, adapting them to Earth observation; this responds to the need for stable series of fire activity observations for the analysis of global change, changes in land use and risk monitoring. GOES (Geostationary Operational Environmental Satellite) has been the worldwide reference for fire monitoring from geostationary platforms. Since 2000, the Geostationary Wildfire Automated Biomass Burning Algorithm (WF\_ABBA) has been generating products for the western hemisphere in real time with a time resolution of 30 minutes, and this detection system has been operational within the NOAA NESDIS programme since 2002. The GOES-East and GOES-West spacecraft are located over the Equator, providing diurnal coverage of North, Central and South America and data on fire and smoke detection. The results provided by the GOES programme have been the starting point of a global geostationary system for fire monitoring, initially comprising four geostationary satellites that were already operational: two GOES platforms from the USA, the European MSG (Meteosat Second Generation) and the Japanese MTSAT (Multifunctional Transport Satellite), covering Southeast Asia and several parts of India as observation regions. The minimum fire detection sizes of GOES, MSG and MTSAT, together with a time resolution of less than 30 minutes, have allowed the international community to envisage a global observation network in real time. The implementation of this network is the aim of the Global Observations of Forest Cover and Land Cover Dynamics (GOFC/GOLD) FIRE Mapping and Monitoring program, which focuses internationally on decision-making concerning research into global change. The GOFC/GOLD FIRE program and the Committee on Earth Observation Satellites (CEOS) Land Product Validation group held a workshop dedicated to the applications of geostationary satellites for forest fire monitoring (Prins *et al.*, 2004).

The cartography of areas affected by fires is a subject that has been dealt with in depth. Remote sensing has proved very useful for the cartography and severity assessment of forest fires, since the time resolution does not hinder the subsequent evaluation of the consequences. Different radiometric procedures have been used, based on the application of fixed thresholds to the NDVI in the case of low spatial resolution; the results are satisfactory, but the difficulty of these procedures lies in finding a fixed threshold value, because the threshold most probably depends on the area analysed and on the time of year in which the study is carried out. Methodologies based on neural networks (Al-Rawi *et al*., 2001) have also been applied, although they involve the difficulty of training the network, so that the final results depend on the variability of the statistical sample used for training. The multi-temporal use of the NDVI for radiometric analysis, with thresholds on the NDVI established through a spatial contextual analysis, is a procedure that has been used for low spatial resolution. In the case of high spatial resolution, several procedures based on many different methodologies have been suggested for the TM sensor on board the Landsat satellites, spectral classification being one of the most frequent. However, an important problem is that it requires the probability distribution of the data; another disadvantage is that, in order to obtain higher quality results, a supervised classification of the zones must be carried out, which requires user interaction. Among the automatic methodologies, linear transformations have shown a great capacity for obtaining results. Thus, the application of Principal Components to the reflectance bands yields almost immediate cartographic results, since they can be analysed visually through an RGB (Red Green Blue) composite of the output components.

An issue linked to fire cartography is the estimation of severity. Each summer, large fires affect Mediterranean Europe owing to changes in traditional land use patterns, which have led to an unusual accumulation of forest fuels, notably increasing fire risk and fire severity. According to Roldán-Zamarrón *et al.* (2006), there is interest in finding a quick and affordable methodology for obtaining fire severity maps that can be made available only a few days after the fire, as this information could prove very valuable in the early stages of rehabilitation planning for large fires. These maps should be based on independent data sources, such as remote sensing, employ automatic or semiautomatic methods, and produce results of acceptable reliability. Remote sensing techniques are a useful tool for effectively generating maps that show the different degrees of damage to vegetation after a large wildfire. The objective of these severity maps is to locate priority intervention areas and to plan forest restoration works.
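The Principal Components step mentioned above for burnt-area cartography can be sketched as follows. This is a minimal illustration, not the procedure of any cited author: the principal components are computed with a singular value decomposition of the mean-centred pixel spectra, the first three are stretched to [0, 1] to form an RGB composite, and the six-band cube with a darker "burnt" patch is synthetic data invented for the example.

```python
import numpy as np


def principal_components(cube, n_components=3):
    """Project a (rows, cols, bands) reflectance cube onto its first principal components.

    Returns the component images, shape (rows, cols, n_components), and the
    fraction of total variance explained by each returned component.
    """
    rows, cols, bands = cube.shape
    pixels = cube.reshape(-1, bands).astype(float)  # astype returns a copy
    pixels -= pixels.mean(axis=0)                   # centre each band
    # Rows of vt are the principal axes, ordered by decreasing singular value.
    _, singular_values, vt = np.linalg.svd(pixels, full_matrices=False)
    scores = pixels @ vt[:n_components].T
    explained = singular_values**2 / np.sum(singular_values**2)
    return scores.reshape(rows, cols, n_components), explained[:n_components]


def to_rgb(components):
    """Linearly stretch each component image to [0, 1] for display."""
    lo = components.min(axis=(0, 1), keepdims=True)
    hi = components.max(axis=(0, 1), keepdims=True)
    span = np.where(hi > lo, hi - lo, 1.0)
    return (components - lo) / span


# Synthetic 20 x 20 scene with 6 bands and a darker patch standing in for a burnt area.
rng = np.random.default_rng(1)
cube = rng.uniform(0.2, 0.5, size=(20, 20, 6))
cube[5:12, 5:12, :] *= 0.4

pcs, explained = principal_components(cube)
rgb = to_rgb(pcs)
print("RGB composite shape:", rgb.shape)
print("explained variance ratios:", np.round(explained, 3))
```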

#### **2. Fires and climate**

According to the GCOS (Global Climate Observing System) document "Systematic Observation Requirements for Satellite-based Products for Climate" and the ESA Climate Initiative, the emissions of greenhouse gases (GHGs) and aerosols from fires are important climate forcing factors, contributing on average between 25% and 35% of total CO<sub>2</sub> emissions to the atmosphere, as well as CO, methane and aerosols. Hence, estimates of GHG emissions due to fire are essential for realistic modelling of climate and of its critical component, the global carbon cycle. Fires caused deliberately for land clearance (agriculture and ranching) or accidentally (lightning strikes, human error) are a major factor in land-cover changes, and hence affect fluxes of energy and water to the atmosphere. Burnt area, as derived from satellites, is considered the primary variable that requires climate-standard continuity. Combined with other information (burn efficiency and available fuel load), it provides estimates of emissions of trace gases and aerosols. Measurements of burnt area can be used as a direct input to climate and carbon-cycle models or, when long time series of data are available, to parameterise climate-driven models for burnt area (fire is dealt with in many climate and biosphere models using the latter approach). Fire-induced emissions are a significant terrestrial source of GHGs, with large spatial and interannual variability. Detection of active fires serves as part of the validation process for burnt area (i.e., is the burnt area associated with previous observations of active fire?). It also provides an indicator of seasonal, regional and interannual variability in fire frequency and of shifts in the geographic location and timing of fire events. Strong empirical relations exist between the FRP (Fire Radiative Power) and rates of combustion, so integrating multiple FRP observations over the lifetime of the fire provides an estimate of the total CO<sub>2</sub> emitted. FRP thus provides a means to derive a CO<sub>2</sub> emissions estimate from remotely sensed observations without relying on difficult-to-acquire ancillary data on fuel load and combustion completeness factors.
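The two routes to an emissions estimate described above can be compared in a small sketch. It assumes, for illustration only, that the burnt-area route multiplies burnt area, fuel load and burn efficiency to obtain the biomass consumed and then applies an emission factor, and that the FRP route integrates FRP over time (trapezoidal rule) and converts the resulting energy to biomass with an empirical coefficient. The emission factor, the conversion coefficient and all input values are hypothetical placeholders, not figures from the text.

```python
def emissions_from_burnt_area(area_m2, fuel_load_kg_m2, burn_efficiency, emission_factor_g_per_kg):
    """Emitted mass (kg) of one species from burnt area, fuel load and burn efficiency."""
    biomass_burned_kg = area_m2 * fuel_load_kg_m2 * burn_efficiency
    return biomass_burned_kg * emission_factor_g_per_kg / 1000.0


def fire_radiative_energy(times_s, frp_mw):
    """Integrate FRP (MW) over observation times (s) with the trapezoidal rule; returns MJ."""
    if len(times_s) != len(frp_mw):
        raise ValueError("times_s and frp_mw must have the same length")
    energy_mj = 0.0
    for i in range(1, len(times_s)):
        dt = times_s[i] - times_s[i - 1]
        energy_mj += 0.5 * (frp_mw[i] + frp_mw[i - 1]) * dt
    return energy_mj


def biomass_from_fre(fre_mj, kg_per_mj):
    """Biomass consumed (kg) from fire radiative energy and an empirical coefficient."""
    return fre_mj * kg_per_mj


# Hypothetical inputs: 100 ha burnt, 2 kg/m2 of fuel, half of it consumed,
# and an assumed CO2 emission factor of 1600 g per kg of dry matter.
co2_kg = emissions_from_burnt_area(1.0e6, 2.0, 0.5, 1600.0)

# Hypothetical FRP observations every 15 minutes (geostationary-like sampling).
times_s = [0.0, 900.0, 1800.0, 2700.0]
frp_mw = [0.0, 100.0, 100.0, 0.0]
fre_mj = fire_radiative_energy(times_s, frp_mw)
biomass_kg = biomass_from_fre(fre_mj, kg_per_mj=0.4)  # coefficient is an assumption

print("CO2 from burnt area (kg):", co2_kg)
print("fire radiative energy (MJ):", fre_mj)
print("biomass from FRE (kg):", biomass_kg)
```

The FRP route needs only the radiative measurements and one conversion coefficient, which is why the text describes it as independent of fuel load and combustion completeness data.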

#### **3. Fire detection**

#### **3.1 Physical principle of fire detection**

As is to be expected, the detection of hot spots is based on the use of bands in the middle and thermal infrared spectrum. Three laws of physics govern the detection process: Planck's law, Wien's displacement law and the Stefan-Boltzmann law. The radiance emitted by a body at a temperature of 300 K, roughly the Earth's mean temperature, has its maximum close to 10 µm, so a spectral band situated there would receive a very strong signal. For a temperature of 800 K, the maximum is displaced to wavelengths close to 3.6 µm; the signal there would be very intense, but not nearly as intense at longer wavelengths. Fire detection is based precisely on this inversion, which can be detected with thermal bands situated in the spectral regions of 11-12 µm and 3-4 µm. Figure 1 shows this physical principle graphically, together with the location of two generic bands in the middle and thermal infrared. It is clear that at a temperature of 300 K the radiance received in the MIR (Middle InfraRed) is lower than that received in the TIR (Thermal InfraRed). However, at 500 K this behaviour is inverted and the radiance is higher in the MIR.
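The inversion described above can be checked numerically with the three laws. The sketch below evaluates Planck's law at two band centres, 3.9 µm for the MIR and 11.0 µm for the TIR; these centres are assumptions chosen inside the 3-4 µm and 11-12 µm regions named in the text, not the bands of a particular sensor. Wien's displacement law gives the wavelength of peak emission and the Stefan-Boltzmann law the total exitance.

```python
import math

H = 6.62607015e-34        # Planck constant, J s
C = 2.99792458e8          # speed of light, m/s
K_B = 1.380649e-23        # Boltzmann constant, J/K
SIGMA = 5.670374419e-8    # Stefan-Boltzmann constant, W m^-2 K^-4
WIEN_B_UM_K = 2897.771955 # Wien displacement constant, um K

MIR_UM = 3.9   # assumed MIR band centre
TIR_UM = 11.0  # assumed TIR band centre


def planck_radiance(wavelength_um, temperature_k):
    """Blackbody spectral radiance in W m^-2 sr^-1 um^-1 (Planck's law)."""
    lam = wavelength_um * 1e-6  # wavelength in metres
    x = H * C / (lam * K_B * temperature_k)
    radiance_per_metre = 2.0 * H * C**2 / (lam**5 * math.expm1(x))
    return radiance_per_metre * 1e-6  # per metre -> per micrometre


def wien_peak_um(temperature_k):
    """Wavelength of maximum emission in um (Wien's displacement law)."""
    return WIEN_B_UM_K / temperature_k


def stefan_boltzmann_exitance(temperature_k):
    """Total radiant exitance of a blackbody in W m^-2 (Stefan-Boltzmann law)."""
    return SIGMA * temperature_k**4


for t in (300.0, 500.0, 800.0):
    mir = planck_radiance(MIR_UM, t)
    tir = planck_radiance(TIR_UM, t)
    print(f"T = {t:.0f} K: peak at {wien_peak_um(t):.2f} um, "
          f"MIR = {mir:.3g}, TIR = {tir:.3g}, MIR > TIR: {mir > tir}")
```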

The basic principle followed for the location of the spectral bands in a sensor is described through two questions; first: what do I want to see? And second: where the atmosphere allows

Initiative, the emissions of greenhouse gases (GHGs) and aerosols from fires are important climate forcing factors, contributing on average between 25% and 35% of total CO2 emissions to the atmosphere, as well as CO, methane and aerosols. Hence, estimates of GHG emissions due to fire are essential for realistic modelling of climate and its critical component, the global carbon cycle. Fires caused deliberately for land clearance (agriculture and ranching) or accidentally (lightning strikes, human error) are a major factor in land-cover changes, and hence affect fluxes of energy and water to the atmosphere. Burnt area, as derived from satellites, is considered the primary variable that requires climate-standard continuity. Combined with information on burn efficiency and available fuel load, it provides estimates of emissions of trace gases and aerosols. Measurements of burnt area can be used as a direct input to climate and carbon-cycle models or, when long time series of data are available, to parameterise climate-driven models for burnt area (fire is dealt with in many climate and biosphere models using the latter approach). Fire-induced emissions are a significant terrestrial source of GHGs, with large spatial and interannual variability. Detection of active fires serves as part of the validation process for burnt area (i.e., whether the burnt area is associated with previous observations of active fire) and provides an indicator of seasonal, regional and interannual variability in fire frequency and of shifts in the geographic location and timing of fire events.

Strong empirical relations exist between the Fire Radiative Power (FRP) and rates of combustion, so integrating multiple FRP observations over the lifetime of a fire provides an estimate of the total CO2 emitted. FRP thus provides a means to derive a CO2 emissions estimate from remotely sensed observations without relying on difficult-to-acquire ancillary data on fuel load and combustion completeness factors.
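The integration of FRP over the lifetime of a fire can be sketched as follows. This is only a minimal illustration, assuming trapezoidal integration between observations. The function names and the two conversion factors (fuel burnt per unit of radiated energy, CO2 released per unit of fuel) are hypothetical parameters that the caller must supply from the literature; they are not values given in this chapter.

```python
def fire_radiative_energy_mj(times_s, frp_mw):
    """Integrate FRP (MW) over time (s) with the trapezoidal rule.

    Returns the Fire Radiative Energy in MJ (1 MW * 1 s = 1 MJ).
    """
    if len(times_s) != len(frp_mw) or len(times_s) < 2:
        raise ValueError("need at least two matching time/FRP samples")
    energy = 0.0
    for i in range(1, len(times_s)):
        dt = times_s[i] - times_s[i - 1]
        energy += 0.5 * (frp_mw[i] + frp_mw[i - 1]) * dt
    return energy


def co2_emitted_kg(fre_mj, fuel_kg_per_mj, co2_kg_per_kg_fuel):
    """Convert radiated energy to CO2 using caller-supplied (hypothetical) factors."""
    return fre_mj * fuel_kg_per_mj * co2_kg_per_kg_fuel


# Four observations, 15 minutes apart, of a fire that grows and then dies out.
times = [0.0, 900.0, 1800.0, 2700.0]
frp = [0.0, 100.0, 100.0, 0.0]
fre = fire_radiative_energy_mj(times, frp)
```

The accuracy of such an estimate depends directly on how often the fire is observed, which is one reason why revisit time matters in the sections that follow.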

**3. Fire detection**

#### **3.1 Physical principle of fire detection**

As is to be expected, the detection of hot spots relies on bands in the middle and thermal infrared spectrum. Three laws of physics govern the detection process: Planck's law, Wien's displacement law and the Stefan-Boltzmann law. The emission of a body at 300 K, which is roughly the Earth's mean temperature, peaks close to 10 µm, so a spectral band located there receives a very strong signal. At 800 K the peak has shifted to wavelengths close to 3.6 µm; the signal there is very intense, but it is not nearly as intense at longer wavelengths. Fire detection is based precisely on this inversion, which can be detected with thermal bands located in the 11-12 µm and 3-4 µm spectral regions. Figure 1 shows this physical principle graphically, together with the location of two generic bands in the middle and thermal infrared. At a temperature of 300 K, the radiance received in the MIR (Middle InfraRed) is clearly lower than that received in the TIR (Thermal InfraRed). At 500 K, however, this behaviour has inverted and the radiance is higher in the MIR.

Fig. 1. Planck's law, showing blackbody emission for different source temperatures, and location of the MIR and TIR spectral bands (adapted from Li *et al*., 2001).

The basic principle followed to locate the spectral bands of a sensor can be described through two questions: first, what do I want to see? And second, where does the atmosphere allow me to do so? Fortunately, atmospheric absorption is selective in several spectral bands. Water vapour absorption is very strong below 3.4 µm, but there is an atmospheric window in the interval [3.5-4.2 µm]; the MIR bands must therefore be placed so as to avoid the water vapour absorption in the 3-4 µm region. In the TIR region there is a strong absorption band centred at 9.6 µm, due to ozone, but there is an atmospheric window in the interval [10-12 µm] with only a weak water vapour effect; this effect is easy to remove by means of two spectral bands located in this window (split-window technique, Price 1984).
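The MIR/TIR inversion described in this section can be checked numerically with Planck's law and Wien's displacement law. The sketch below is only illustrative: the band centres of 3.9 µm and 11 µm are assumed generic values inside the two atmospheric windows, and the surface is treated as a blackbody with no atmosphere.

```python
import math

C1 = 1.191042e8    # 2*h*c^2, in W um^4 m^-2 sr^-1
C2 = 1.4387769e4   # h*c/k, in um K
WIEN_B = 2897.77   # Wien displacement constant, in um K

MIR_UM = 3.9       # assumed generic MIR band centre
TIR_UM = 11.0      # assumed generic TIR band centre


def planck_radiance(wavelength_um, temp_k):
    """Blackbody spectral radiance in W m^-2 sr^-1 um^-1."""
    return C1 / (wavelength_um ** 5 * (math.exp(C2 / (wavelength_um * temp_k)) - 1.0))


def wien_peak_um(temp_k):
    """Wavelength of maximum blackbody emission, in um."""
    return WIEN_B / temp_k


for temp in (300.0, 500.0, 800.0):
    mir = planck_radiance(MIR_UM, temp)
    tir = planck_radiance(TIR_UM, temp)
    print(f"T={temp:.0f} K  peak={wien_peak_um(temp):.2f} um  "
          f"MIR={mir:.3g}  TIR={tir:.3g}  MIR>TIR: {mir > tir}")
```

The comparison `MIR > TIR` changes from false at 300 K to true at 500 K, which is the inversion that the detection tests exploit.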

A technical problem to take into account is that the radiance measured by the sensor for a given pixel containing a fire depends not only on the fire's temperature and, logically, on the temperature of the surrounding surface, but also on the location of the fire inside the pixel, since the sensor's PSF (Point Spread Function) ultimately determines the filtering applied to the original scene. In any case, obtaining a positive detection from an active fire is easy. The main problem in fire detection is a positive detection when no fire exists, that is, a false alarm. False detections in the 4 µm region arise because the radiance does not come from emission alone; there is also a component due to reflection. We can therefore find situations in which a high radiance signal does not necessarily correspond to a high-temperature pixel, except in night observations, when the reflection component evidently does not exist. In the part of the spectrum where reflection and emission overlap, as in the 4 µm case, the radiance reaching the sensor is the sum of both components. The reflected component is

$$L_{\rm reflection} = \frac{\rho}{\pi} \cdot E_0 \cdot \cos(\theta_{\rm sun})$$

with $\theta_{\rm sun}$ being the sun's zenith angle, $E_0$ the extraterrestrial solar irradiance in that spectral band and $\rho$ the spectral reflectance (atmospheric effects are not included). The emission component is

$$L_{\rm emission} = (1-\rho) \cdot B(\lambda, T_{\rm surface})$$

in which $B(\lambda,T)$ is the Planck function, $\lambda$ the wavelength of the spectral band and $T_{\rm surface}$ the surface temperature observed by the sensor. Note that $\varepsilon = (1-\rho)$ and $\rho = (1-\varepsilon)$, with $\varepsilon$ the emissivity.

If a surface has a high reflectance in the MIR region, the reflected solar radiance is high and the brightness temperature exceeds the surface temperature by several kelvin. As an example, for a reflectance of 20% and a sun zenith angle of 30º, the brightness temperature can increase by more than 20 K due to the reflectance effect. When the contribution of the reflection component is very marked and the radiance in the middle infrared band increases as a consequence, a pixel appears with an apparently high temperature that can be mistaken for a hot spot, producing a false alarm. These situations are more frequent in the sensors with the highest spatial resolution and when high-reflectance surfaces coincide with sun-satellite geometries close to specular reflection conditions. This is the case of small water surfaces, for example, and it is called *sun glint.* It must be pointed out that the problem of false alarms is more difficult to solve than detection itself, because of the difficulty in separating both effects. Finally, clouds are also an important source of false alarms due to the sun's reflection. Their reflectance is high and they cause a strong signal in the MIR spectral band in situations of very high sun zenith angles.
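The effect of the reflected component on the MIR brightness temperature can be explored with a short sketch. It assumes a Lambertian surface (hence the 1/π factor), no atmosphere, a band centre of 3.9 µm, and a band irradiance `e0` supplied by the caller; the value of 10 W m⁻² µm⁻¹ used in the example is a rough illustrative assumption, not a figure from this chapter. The size of the resulting increase depends strongly on these choices, so the sketch is meant to show the mechanism rather than to reproduce a specific number.

```python
import math

C1 = 1.191042e8    # 2*h*c^2, in W um^4 m^-2 sr^-1
C2 = 1.4387769e4   # h*c/k, in um K


def planck_radiance(wavelength_um, temp_k):
    """Blackbody spectral radiance in W m^-2 sr^-1 um^-1."""
    return C1 / (wavelength_um ** 5 * (math.exp(C2 / (wavelength_um * temp_k)) - 1.0))


def brightness_temperature(wavelength_um, radiance):
    """Invert Planck's law: temperature of a blackbody emitting this radiance."""
    return C2 / (wavelength_um * math.log(1.0 + C1 / (wavelength_um ** 5 * radiance)))


def apparent_mir_temperature(rho, sun_zenith_deg, t_surface_k, e0, wavelength_um=3.9):
    """Brightness temperature seen when reflection and emission are added."""
    # At night (zenith >= 90 degrees) the reflected term is set to zero.
    cos_sun = max(0.0, math.cos(math.radians(sun_zenith_deg)))
    l_reflection = rho / math.pi * e0 * cos_sun
    l_emission = (1.0 - rho) * planck_radiance(wavelength_um, t_surface_k)
    return brightness_temperature(wavelength_um, l_reflection + l_emission)


# Hypothetical daytime case: 20% reflectance, 30 degree sun zenith, 300 K surface.
day = apparent_mir_temperature(0.2, 30.0, 300.0, e0=10.0)
print(f"apparent MIR temperature: {day:.1f} K")
```

With zero reflectance the function returns the surface temperature itself, which is a convenient sanity check on the inversion.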

#### **3.2 Fire detection using heliosynchronous platforms**

Not only was the NOAA-AVHRR sensor the first to provide results, it has also served as a research platform for the development of hot-spot detection algorithms. This has been possible thanks to its high time resolution (among the polar heliosynchronous sensors) and to the fact that the physical principles of detection described above can be applied to it. The AVHRR sensor nevertheless has important limitations. The most important is the low saturation level, 320-331 K (Robinson, 1991), of the main band involved in detection, the 3.7 µm band. This limit is so low that a fire at 1000 K on a non-reflective surface at 300 K needs only a 13 x 13 m² area to saturate the pixel. This drawback leaves the sensor suitable for hot-spot detection but, in most cases, unsuitable for analysis at sub-pixel level. In spite of its limitations, this sensor remains an unavoidable comparative reference for subsequent, more operational sensors such as MODIS (Ichoku *et al*., 2003).

Detection has been developed through different algorithms that can be broadly classified into fixed-threshold algorithms and contextual algorithms, whose parameters have been adapted to the different study zones. Both types have advantages and disadvantages, and their application depends on the type of sensor to which they are applied. Detection algorithms based on fixed thresholds, also called multi-channel algorithms, establish minimum temperature values in different spectral bands above which a detection is declared. The most common scheme considers a pixel to be affected by a fire when the following conditions are fulfilled simultaneously:

$$T_{\rm MIR} > V_{\rm MIR} \; ; \; T_{\rm MIR} - T_{\rm TIR} > V_{\rm DIF} \; ; \; T_{\rm TIR} > V_{\rm TIR} \; ; \; R_{\rm NIR} < V_{\rm NIR} \tag{1}$$

where $T_{\rm MIR}$ and $T_{\rm TIR}$ are the brightness temperatures in the spectral bands of the 3.7 µm and 11 µm regions respectively, $R_{\rm NIR}$ is the near-infrared reflectance, and V is the adopted threshold. In this test, the first two conditions are the ones that strictly perform the hot-spot detection, following the physical principles stated previously. The $T_{\rm MIR}$ test detects the fire, and the $T_{\rm MIR}-T_{\rm TIR}$ test differentiates between fire, which has high values in the MIR, and hot surfaces, which have high values in both the MIR and the TIR. The $T_{\rm TIR}$ test is a cloud filter, so that the test can be applied to images from which cloud cover has not been removed by other procedures. The $R_{\rm NIR}$ test filters the reflectance in sun-glint situations, which are responsible for false alarms. The threshold values in use vary. They depend on the algorithm and, above all, on the geographic area, because of the influence of the background temperature. Thus, areas with low surface temperatures can normally use lower MIR thresholds without producing false alarms. Two examples of this type of algorithm operating on NOAA-AVHRR are those used by the CCRS (Canada Centre for Remote Sensing) (Li *et al*., 2000) and the ESA (European Space Agency) (Arino and Mellinote, 1998). The disadvantage of fixed-threshold algorithms is that the values established depend on the study zone and its ambient temperatures. Contextual algorithms can be used to avoid this dependence. They obtain the threshold values from a statistical analysis of the pixel's neighbourhood. The basic scheme is summarised in the following test:

$$T_{\rm MIR} > \mu_{\rm MIR} + f \cdot \sigma_{\rm MIR} \; ; \; T_{\rm MIR} - T_{\rm TIR} > \mu_{\rm DIF} + f \cdot \sigma_{\rm DIF} \; ; \; R_{\rm NIR} < \mu_{\rm NIR} - f \cdot \sigma_{\rm NIR} \tag{2}$$

where $\mu$ and $\sigma$ are the mean value and the standard deviation in the neighbourhood of the pixel analysed and f is a factor that has to be established. The neighbourhood is analysed in a window of N x N pixels, N being an odd value that depends on the sensor to which it is applied. Two examples are the IGBP (International Geosphere-Biosphere Programme) algorithm (Justice & Malingreau, 1993) and an adaptation of the current MODIS algorithm (Kaufman *et al.,* 1998). Contextual algorithms have the advantage of making the detection process independent of the season and the zone analysed, since the thresholds are obtained from a statistical analysis of the neighbourhood. However, they have a serious drawback when applied to images in which clouds have not been filtered, since cloud edges cause false alarms. A variant of the basic contextual algorithm is the one suggested by Lasaponara *et al*. (1998), in which the mean and the standard deviation are determined using not just the spatial neighbourhood but also the temporal one: the analysis window is extended to the images of previous days, looking for changes in brightness temperature over a time interval as well as on a spatial scale.
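The two families of tests can be sketched side by side. This is a minimal illustration under stated assumptions: the function names are hypothetical, the threshold values in the example are placeholders rather than values from any operational algorithm, and the contextual version excludes the candidate pixel from the neighbourhood statistics and uses the population standard deviation, neither of which is specified in the text.

```python
from statistics import mean, pstdev


def fixed_threshold_fire(t_mir, t_tir, r_nir, v_mir, v_dif, v_tir, v_nir):
    """Fixed-threshold (multi-channel) test: all four conditions must hold."""
    return (t_mir > v_mir
            and (t_mir - t_tir) > v_dif
            and t_tir > v_tir
            and r_nir < v_nir)


def contextual_fire(t_mir, t_tir, r_nir, row, col, half_window, f):
    """Contextual test on 2-D grids (lists of lists) around pixel (row, col).

    The window has size N x N with N = 2 * half_window + 1, clipped at the
    image border. The candidate pixel itself is left out of the statistics.
    """
    n_rows, n_cols = len(t_mir), len(t_mir[0])
    bg_mir, bg_dif, bg_nir = [], [], []
    for r in range(max(0, row - half_window), min(n_rows, row + half_window + 1)):
        for c in range(max(0, col - half_window), min(n_cols, col + half_window + 1)):
            if (r, c) == (row, col):
                continue
            bg_mir.append(t_mir[r][c])
            bg_dif.append(t_mir[r][c] - t_tir[r][c])
            bg_nir.append(r_nir[r][c])
    if not bg_mir:
        raise ValueError("window contains no background pixels")
    dif = t_mir[row][col] - t_tir[row][col]
    return (t_mir[row][col] > mean(bg_mir) + f * pstdev(bg_mir)
            and dif > mean(bg_dif) + f * pstdev(bg_dif)
            and r_nir[row][col] < mean(bg_nir) - f * pstdev(bg_nir))


# Hypothetical 3 x 3 scene with a hot pixel in the centre.
T_MIR = [[300.0, 301.0, 299.0], [300.0, 340.0, 301.0], [299.0, 300.0, 300.0]]
T_TIR = [[295.0, 295.0, 296.0], [294.0, 300.0, 295.0], [295.0, 296.0, 294.0]]
R_NIR = [[0.20, 0.21, 0.19], [0.20, 0.10, 0.21], [0.19, 0.20, 0.20]]

centre_is_fire = contextual_fire(T_MIR, T_TIR, R_NIR, 1, 1, half_window=1, f=3.0)
corner_is_fire = contextual_fire(T_MIR, T_TIR, R_NIR, 0, 0, half_window=1, f=3.0)
```

The contextual version needs no scene-specific thresholds, only the factor f and the window size, which is the independence from season and zone noted above.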

The launch of the MODIS sensor in 1999 on the Terra platform and in 2002 on Aqua, with 36 spectral bands at different spatial resolutions, has provided much more reliable detection results. This sensor includes two spectral bands in the 4 µm region, with saturation values that are high compared with the AVHRR MIR band, and the applied algorithm uses a large number of bands to consolidate the results. Another important characteristic of MODIS is its excellent radiometric resolution of 12 bits (versus 10 bits for AVHRR), which is very valuable for fire monitoring. The original algorithm has been improved (Giglio *et al*., 2003) and carries out three test phases: cloud cover filtering, detection and consolidation.

The cloud-and-water-filtering phase uses three quantities: the reflectances in bands 1 and 2, with a spatial resolution of 250 metres and centred at 0.65 µm and 0.86 µm respectively, and the temperature in the 12 µm band, T12. Pixels fulfilling any of the following three conditions are rejected: a T12 value lower than 265 K; a sum of reflectances higher than 0.9; or, simultaneously, T12 lower than 285 K and a sum of reflectances higher than 0.7. The detection phase first establishes the potential pixels to be analysed, following the fixed-threshold criteria used with AVHRR: the temperature in band 21, around 4 µm, must exceed a threshold of 310 K, and its difference with band 31, around 11 µm, must exceed 10 K. The identification of fire pixels among the potential ones is then carried out through two procedures. The first is an absolute test for pixels with a 4 µm brightness temperature higher than 360 K during the day and 320 K at night. The second is an alternative test that characterises the background temperature through a contextual analysis of the pixels that were not considered potential, with a window that grows until a significant number of points is obtained.

This contextual analysis is similar to the one used by the AVHRR algorithms, but it follows additional steps to eliminate false alarms and it differentiates between day and night pixels. The methodology considers three different sources of false alarms: the first is the possibility of sun glint, which is handled through a geometrical analysis of the sun-pixel-satellite directions in order to reject situations of specular reflection; the second is hot desert pixels and the third is coastlines. The latter two are handled by applying temperature and reflectance thresholds simultaneously. Finally, the consolidation phase applies a statistical analysis to obtain well-confirmed fire pixels. This is needed because the spatial resolution of MODIS in the thermal bands is 2 km, with a sampling step of 1 km, so the same fire, located in a zone where two pixels overlap, can be revealed by both of them. The consolidation test is carried out on the pixels adjacent to the one being analysed. More details about the algorithm can be found in Giglio *et al*. (2003). It must be pointed out that MODIS has two bands in the MIR region used in detection, bands 21 and 22, both centred at 3.9 µm. The difference is that the saturation level of the first is 500 K whereas that of the second is 331 K. However, band 22 has less noise and a smaller calibration error. For this reason, the algorithm uses band 22 if the pixel is not saturated; otherwise, band 21 is used.
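The threshold-based parts of the MODIS scheme described above can be written down directly. The sketch below covers only the cloud filter, the band selection, the potential-fire test and the absolute test, using the numbers quoted in the text; the contextual and consolidation phases are omitted. The function names are hypothetical, and treating band 22 as saturated when its temperature reaches 331 K is an assumption about how saturation is flagged.

```python
BAND22_SATURATION_K = 331.0   # saturation level of band 22 quoted in the text


def is_rejected_as_cloud(r_band1, r_band2, t12_k):
    """Cloud filter: any one of the three conditions rejects the pixel."""
    reflectance_sum = r_band1 + r_band2
    return (t12_k < 265.0
            or reflectance_sum > 0.9
            or (t12_k < 285.0 and reflectance_sum > 0.7))


def select_t4(t_band21_k, t_band22_k):
    """Use low-noise band 22 unless it is saturated (assumed: >= 331 K)."""
    if t_band22_k < BAND22_SATURATION_K:
        return t_band22_k
    return t_band21_k


def is_potential_fire(t4_k, t11_k):
    """Preliminary fixed-threshold selection of candidate pixels."""
    return t4_k > 310.0 and (t4_k - t11_k) > 10.0


def passes_absolute_test(t4_k, is_day):
    """Absolute test: 360 K by day, 320 K at night."""
    return t4_k > (360.0 if is_day else 320.0)


# Hypothetical night-time pixel.
t4 = select_t4(t_band21_k=325.4, t_band22_k=325.0)
if not is_rejected_as_cloud(0.05, 0.08, 288.0) and is_potential_fire(t4, 295.0):
    print("fire" if passes_absolute_test(t4, is_day=False) else "needs contextual test")
```

Candidate pixels that fail the absolute test are not discarded; they go on to the contextual analysis, which this sketch leaves out.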


In spite of the tools we have shown for fire detection, their results have not been brought into operational use because of the lack of continuity in monitoring from heliosynchronous satellites. Several monitoring programmes have been designed around the co-ordination of several satellites in different orbital planes, in order to increase the number of daily observations of a given place. Projects such as FUEGO originally, and FUEGOSAT nowadays, are funded by ESA in order to obtain a product that can be put into operation for the monitoring of forest fires. On the other hand, the detection effectiveness of the sensors mentioned could be improved by designing sensors specifically dedicated to fire detection. One prototype satellite fulfilling these characteristics has provided results for analysis: BIRD (Bi-spectral Infrared Detection), designed by the German DLR laboratory as a sensor prototype, whose detection capabilities have been very satisfactory thanks to its design (Briess *et al.,* 2003). It carries the HSRS (Hot Spot Recognition Sensor System), with a field of view of 19º (190 km), a spatial resolution of 370 m and a radiometric resolution of 14 bits. Apart from the new spatial resolution in the thermal and its excellent radiometric resolution, this sensor can extend its dynamic range of calibration by means of two successive exposures of the scene with a short integration time; this makes it possible to reach a saturation limit close to 1000 K, with a temperature resolution in the interval [0.1-0.2 K].

The algorithm includes 5 consecutive tests through which different threshold values are established: an adaptive test in the MIR to detect potential hot spots; a threshold in the NIR to reject sun reflection, which is a source of false alarms during daytime observations; an adaptive threshold on the MIR/NIR radiance ratio to reject clouds and other highly reflective objects; an adaptive threshold on the MIR/TIR radiance ratio to reject hot surfaces; and finally, the grouping of pixels adjacent to the fire to obtain the fire's temperature and area parameters. All the adaptive thresholds mentioned are obtained through contextual spatial analysis. The BIRD satellite should be regarded as a very low-cost prototype, intended to operate as several units in orbital co-ordination. The fire parameters provided by BIRD have made it possible to locate the flaming front very accurately (Wooster *et al.,* 2003).

#### **3.3 Fire detection using geostationary platforms**

As mentioned in the Introduction, geostationary sensors can improve fire detection results thanks to their very short revisit time, even though their spatial resolution is very limited by the distance of geostationary platforms from the Earth. Currently, the international user community feels that a real-time global observation network may become a reality by means of geostationary sensors such as GOES, MSG and MTSAT. This is one of the objectives of the Global Observations of Forest Cover and Land Cover Dynamics (GOFC/GOLD) FIRE Mapping and Monitoring program, which focuses internationally on decision-making concerning research into Global Change and its ecological and environmental implications. Major efforts are also being made by ESA-EUMETSAT to increase the use of MSG in environmental observation tasks. SEVIRI (Spinning Enhanced Visible and Infrared Imager), on board the MSG platforms, is a very interesting example of a sensor suitable for real-time forest fire monitoring (Calle *et al.*, 2006). Some analyses are shown for the particular case of the latitudes of Mediterranean Europe, where detection campaigns and real-time dissemination of results have been carried out in recent years. The theoretical analysis of the minimum detectable size, including atmospheric effects and saturation conditions, is especially important to delimit the operational range of this sensor in Mediterranean latitudes, where the effects of forest fires are increasingly devastating each year, in both financial and human terms. MSG-SEVIRI is a geostationary sensor with a time resolution of 15 minutes, so the comparison between successive scenes provides reliable results once the temperature-difference threshold for such an interval is established. Thus, if a Time Thermal Gradient (TTG) higher than the one considered normal is detected, we have a high-temperature event.
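A temporal test of this kind can be sketched on a single-pixel time series sampled every 15 minutes. The example is illustrative only: the 4 K default is the threshold that the text goes on to justify, while the sinusoidal background with a semi-amplitude of 15 K, the mean of 300 K and the 6 K jump that stands in for a fire are all hypothetical test values.

```python
import math

SCENES_PER_DAY = 96   # one SEVIRI scene every 15 minutes


def detect_ttg_events(mir_temps_k, threshold_k=4.0):
    """Return indices of scenes whose MIR temperature rose by more than
    threshold_k with respect to the previous scene."""
    return [i for i in range(1, len(mir_temps_k))
            if mir_temps_k[i] - mir_temps_k[i - 1] > threshold_k]


def synthetic_day(semi_amplitude_k, mean_k):
    """Sinusoidal diurnal cycle sampled once per scene (hypothetical background)."""
    omega = 2.0 * math.pi / SCENES_PER_DAY
    return [mean_k + semi_amplitude_k * math.sin(omega * t)
            for t in range(SCENES_PER_DAY)]


background = synthetic_day(15.0, 300.0)

# Inject a hypothetical fire: the pixel is 6 K warmer from scene 40 onwards.
with_fire = list(background)
for i in range(40, SCENES_PER_DAY):
    with_fire[i] += 6.0

events = detect_ttg_events(with_fire)
print("TTG events at scenes:", events)
```

Because the test compares each scene with the one before it, only the onset of the fire is flagged; the following scenes, although still warm, show no further jump.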
In order to estimate this gradient, let us consider a day's thermal evolution as a sinusoidal curve of the form

$$Temp_{\rm MIR} = A \cdot \sin(\omega t) + B \; ; \; \omega = \frac{2\pi}{T}$$

where T is the day's period in units of 15 minutes (T=96), A is the semi-amplitude of the daily thermal oscillation and B is a coefficient that is not relevant here. According to this model, the maximum difference in the MIR standard temperature between two consecutive SEVIRI scenes is 1.5 K for a diurnal-cycle thermal oscillation of around 30 K, which is typical of summer days in middle latitudes. This estimate agrees with the experimental values found in the analysis of series of MIR temperature evolution curves selected for different test sites in Mediterranean Europe during summer: the maximum temperature difference found, in absolute value, was lower than 2 K in 98.2% of cases. The average of the differences found, considering only the intervals with thermal variability, [05:00-11:00] GMT and [14:00-20:00] GMT, was 1.2 K, with a standard deviation of 0.5 K. We have therefore considered it appropriate to establish a threshold of 4 K as the temperature increase needed to detect the beginning of a fire without producing false alarms. In any case, there are two well-defined daily periods: from sunrise to midday, in which the temperature is increasing and where the estimate of 4 K is appropriate, and from midday to sunset, for which a value of 2-4 K would be enough, since the natural gradient is negative. During night periods detection is easier.

In order to estimate the minimum fire size detectable by the SEVIRI sensor, simulations have been carried out with the MODTRAN radiative transfer code (Berk *et al.,* 1996), introducing different surface and fire temperatures according to different time thermal gradient values. The radiance observed by the sensor was simulated as

$$L_{\rm sensor} = p \cdot L_{\rm fire} + (1-p) \cdot L_{\rm surface}$$

where p is the surface fraction affected by fire and two homogeneous phases have been considered, fire and surface; $L_{\rm fire}$ and $L_{\rm surface}$ are the radiances coming from the fire and the surface. Spectral radiance was integrated at 20 cm⁻¹ resolution using the spectral response function and considering different atmospheric attenuation conditions. Results are shown in figure 2 for a standard mid-latitude summer atmosphere and an aerosol depth corresponding to a visibility of 23 km. The abscissa shows the potential fire temperature and the ordinate shows the minimum detectable area, expressed in ha. Different influencing quantities must be analysed separately, the most important being the TTG threshold considered, $\Delta T/\Delta t$, but also the geographic latitude of observation. The figure contains the results for three different values of the gradient, 4, 6 and 2 K/15 min, and for two types of location, at 20º and 50º latitude. With respect to latitude, it must be taken into account that although the pixel's area at the nadir point is 9 km², at a latitude of 20º it is 10 km² and at 50º it has increased to 18 km². Thus, for a required gradient of 4 K/15 min and a fire of 600 K, the detectable area at 20º latitude is 0.5 ha, whereas at 50º latitude it would be 1 ha. The geographic longitude has not been analysed since it causes very little distortion in the pixel's area.

With respect to the thermal gradient, 4 K/15 min is the reference for the analysis carried out in the previous paragraphs. The figure also shows results for a value of 2 K/15 min, which can be applied in the descending period of the daily thermal evolution, [14:00-20:00], because during this period $\Delta T/\Delta t < 0$ is expected and the value of 2 K/15 min could be enough. This means that during the evening fires are more easily detected through this methodology, and the start of a fire can be established at 600 K with 0.24 ha at 20º latitude and 0.48 ha at 50º latitude. As can be seen, the detectable sizes during the day at 20º latitude are similar to those in the evening at 50º latitude. Latitude influences the variability of the pixel's area and, through the zenith angle, the atmospheric transmittance, which has also been taken into account to obtain the results. Results obtained for different atmospheric profiles do not differ much. Another very important magnitude to be considered is the surface temperature since the considered methodology is presented with continuity throughout the day and night, a


considered, *T*

this period *<sup>T</sup>* <sup>0</sup> *<sup>t</sup>*

*t* 

period for which different values are presented. It must be pointed out that lower surface temperatures make the detection considerably easier. Thus, if we go down from 300K to 290K, there is a decrease in the minimum detectable area of around 20-23%. This value is constant for different fire temperatures and also independent from the latitude considered.
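The dependence of the minimum detectable area on the gradient threshold, the pixel area and the surface temperature can be illustrated with a minimal sketch. It assumes a monochromatic 3.9 µm channel, a two-component pixel (fire plus surface), no atmospheric attenuation and no spectral response function, so it is not the MODTRAN simulation behind Figure 2 and will not reproduce its values. The function names and the pixel areas passed in the usage lines are illustrative assumptions.

```python
import math

C1 = 1.191042e8    # 2*h*c^2 in W m^-2 sr^-1 um^4
C2 = 1.4387769e4   # h*c/k in um K


def planck(wavelength_um, temp_k):
    """Blackbody spectral radiance in W m^-2 sr^-1 um^-1."""
    return C1 / (wavelength_um ** 5 * (math.exp(C2 / (wavelength_um * temp_k)) - 1.0))


def min_fire_fraction(t_fire, t_surf, ttg_threshold, wavelength_um=3.9):
    """Smallest pixel fraction that raises the brightness temperature by the threshold.

    Assumes the previous scene showed the undisturbed surface at t_surf.
    """
    l_surf = planck(wavelength_um, t_surf)
    l_target = planck(wavelength_um, t_surf + ttg_threshold)
    l_fire = planck(wavelength_um, t_fire)
    return (l_target - l_surf) / (l_fire - l_surf)


def min_fire_area_ha(t_fire, t_surf, ttg_threshold, pixel_area_km2):
    """Minimum detectable burning area in hectares (1 km^2 = 100 ha)."""
    return min_fire_fraction(t_fire, t_surf, ttg_threshold) * pixel_area_km2 * 100.0


if __name__ == "__main__":
    # Hypothetical inputs: 600 K fire, 300 K surface, two pixel areas.
    for area_km2 in (10.0, 18.0):
        for threshold in (2.0, 4.0, 6.0):
            size = min_fire_area_ha(600.0, 300.0, threshold, area_km2)
            print(f"pixel {area_km2:4.0f} km2, TTG {threshold:.0f} K: {size:.2f} ha")
```

Under these assumptions the detectable area scales linearly with the pixel area and shrinks when either the threshold or the surface temperature is lowered, which is the qualitative behaviour described above.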

Fig. 2. Minimum size of fire (ha) to be detected by SEVIRI, for different fire temperature and latitude, applying TTG values of 2, 6 and 4K/15\_minutes, taking into account atmospheric attenuation (taken from Calle *et al*., 2006).

Establishing the outbreak of a fire as accurately as possible is crucial for alerting firefighting teams quickly. If the detection process relies on comparison with the previous image, the delay can be up to 30 minutes in the worst cases. To show some representative results we have analysed the day on which Spain's worst fire of recent decades, in terms of human losses, occurred. This fire, which started between 12:30 and 12:45 on 16 July 2005, spread for over five consecutive days and devastated around 13,000 ha. Figure 3 shows the image of the 3.9 µm spectral band a few hours after the fire started. Visual analysis of the image reveals many fires in Spain and Portugal. Given their importance, two have been highlighted. Number 1 is the fire in Guadalajara (Spain) and number 2 is one of the fires that affected the natural park of Lago de Sanabria (Zamora, Spain) during the summer of 2005, whose initial characteristics, as will be seen, differ from those of the first. In the figure, the wind direction in fire #1 has been indicated from the smoke plume, which is perfectly visible and which will be useful later to analyse the spread of the fire.

Below the image, in the same figure, are the two thermal evolution diagrams corresponding to these fires. Each diagram shows the temperature evolution of the 3.9 µm band, in ºC, on the primary ordinate axis as a function of the time of day, between 06:00 and 16:00 GMT. The secondary ordinate axis shows the evolution of the time thermal gradient of the same band, in ºC/15\_minutes. Comparing both temperature evolution curves, we can see that they are practically identical on the primary axis up to the moment at which the fire starts, at 12:30 in #1 and at 13:45 in #2, despite corresponding to different vegetation covers with different fuel moisture content, since they occur in different climate zones.

The analysis of the time thermal gradient curve is much more conclusive. Prior to the outbreak of the fire, the temperature change is 1.5ºC/15\_minutes in both curves, reaching maxima of 2.3 in #1 and 1.8 in #2, which are exceptional compared with the rest of the values. Case #2 was a fire that started with a time thermal gradient of 4.2ºC/15\_minutes in the first scene at 14:00 GMT, immediately jumping to 15ºC/15\_minutes in the following scene at 14:15 GMT. It is clear that it began between 13:45 and 14:00, as the figure shows. Fire #1 had a much more abrupt beginning, with a time thermal gradient of 8ºC/15\_minutes in the first scene at 12:45 GMT; in this case, the fire broke out between 12:30 and 12:45 GMT. Apart from its initial causes, the characteristics of a fire at its onset depend on the combustible material and its moisture. In this comparison, it is not surprising that the outbreak was slower in case #2, whose gradient was below that of #1, as this was a climate zone with higher moisture content.
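The onset test described above can be sketched as a simple difference between consecutive 15-minute scenes compared against a threshold. The function names and the temperature series are hypothetical; the series only loosely imitates an abrupt start such as that of fire #1 and is not data read from Figure 3. The 4 ºC default is the daytime threshold discussed in the text.

```python
def time_thermal_gradient(temps):
    """Temperature change between consecutive scenes (per 15-minute step)."""
    return [later - earlier for earlier, later in zip(temps, temps[1:])]


def first_fire_scene(temps, threshold=4.0):
    """Index of the first scene whose increase over the previous one exceeds
    the threshold, or None if no scene does."""
    for index, gradient in enumerate(time_thermal_gradient(temps), start=1):
        if gradient > threshold:
            return index
    return None


if __name__ == "__main__":
    # Hypothetical 3.9 um brightness temperatures (degrees C), one per scene.
    scene_times = ["12:00", "12:15", "12:30", "12:45", "13:00"]
    temps_c = [38.0, 39.2, 40.1, 48.1, 63.0]
    hit = first_fire_scene(temps_c)
    if hit is not None:
        print(f"fire started between {scene_times[hit - 1]} and {scene_times[hit]}")
```

Because each scene is compared only with the previous one, the start time is bracketed by two consecutive acquisitions, which is why the outbreak is reported as an interval rather than an instant.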

Fig. 3. Methodology to detect the start of a fire for two different cases. The upper part of the figure shows the 3.9 µm band, highlighting several fires (validated by MODIS) as well as the wind direction. The second part shows the thermal evolution, on the left scale, and the time thermal gradient, on the right scale, in ºC/15\_minutes, for the two selected cases (Calle *et al*., 2006).

The methodology proposed to detect the beginning of a fire is no longer valid once the fire keeps developing, since the temperature differences between successive scenes undergo strong variations. In addition, the frequent appearance of saturated pixels causes sharp changes that cannot be analysed. Therefore, the subsequent monitoring of the fire requires a methodology for detecting hot spots after the start. Detection methods used on other sensors as a reference are sometimes based on physical models. However, experimental statistical models, such as the contextual models operating on AVHRR and MODIS discussed in the previous section, have shown better results and are easier to apply.

#### **3.4 Spatial characterization of fire detection**

The pixel dimension is the main parameter that characterises a sensor's spatial resolution. However, radiance quantification and image interpretation require an appropriate analysis in order to obtain physical parameters. The review by Cracknell (1998) describes the spatial and radiometric considerations concerning the pixel in detail. To answer the question "what's in a pixel?", the title of that paper, it is first necessary to analyse accurately the target area that emits the radiance reaching the sensor, which in fact never coincides exactly with the spatial resolution assigned to it, nor with the square shape imagined for the matrix elements of an image. The simplified concept of an image as a mosaic of elements is quite far from reality, something that becomes evident when trying to observe image detail or to compare images from different sensors with similar spatial resolution. Moreover, spatial resolution is often identified with the Ground Sampling Distance (GSD), defined as the distance between the centres of neighbouring pixels, or with the Instantaneous Geometric Field of View (IGFOV), the geometric size of the image projected by the detector on the ground through the optical system, which introduces confusion in the sensor's spatial characterization. Since there are sensors with similar IGFOV but different Modulation Transfer Function (MTF), it is more realistic to define a quantity in terms of the MTF. The Effective Instantaneous Field Of View (EIFOV), introduced by NASA (1973), is defined as the resolution corresponding to the spatial frequency at which the system MTF is 0.5.

The shape of the MTF in the frequency domain and, consequently, of the Point Spread Function (PSF) in the spatial domain has no special relevance when the observed surface shows a homogeneous distribution of radiance. Nevertheless, when the distribution of radiance inside the pixel is heterogeneous, as is frequently the case with forest fires, the PSF and deconvolution processes must be considered. In this section, results obtained using the real MTF of the SEVIRI sensor are shown.
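As an illustration of the EIFOV definition, the sketch below locates the spatial frequency at which a sampled MTF curve falls to 0.5 and converts it to a ground distance. Two points are assumptions rather than statements from the text: the conversion uses the half-period convention (EIFOV = 1/(2f) at MTF = 0.5), and the two Gaussian MTF curves are invented to represent sensors with different MTFs. Function names are hypothetical.

```python
import math


def frequency_at_mtf(freqs, mtf, level=0.5):
    """Linearly interpolate the frequency where a decreasing MTF crosses `level`."""
    samples = list(zip(freqs, mtf))
    for (f0, m0), (f1, m1) in zip(samples, samples[1:]):
        if m0 >= level >= m1 and m0 != m1:
            return f0 + (m0 - level) * (f1 - f0) / (m0 - m1)
    raise ValueError("MTF never crosses the requested level")


def eifov(freqs, mtf):
    """EIFOV as half the spatial period at MTF = 0.5 (assumed convention).

    With frequencies in cycles/km the result is in km.
    """
    return 1.0 / (2.0 * frequency_at_mtf(freqs, mtf))


if __name__ == "__main__":
    freqs = [i * 0.01 for i in range(0, 61)]            # cycles/km
    # Two made-up Gaussian MTF curves for sensors with the same nominal IGFOV.
    sharp = [math.exp(-(f / 0.25) ** 2) for f in freqs]
    blurred = [math.exp(-(f / 0.15) ** 2) for f in freqs]
    print(f"sharp sensor EIFOV:   {eifov(freqs, sharp):.2f} km")
    print(f"blurred sensor EIFOV: {eifov(freqs, blurred):.2f} km")
```

The sensor whose MTF decays faster gets the larger EIFOV, which is the reason an MTF-based quantity separates sensors that a purely geometric IGFOV would treat as equivalent.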

On the other hand, many thermal parameters in remote sensing are estimated through multi-spectral procedures, such as the estimation of temperature using split-window techniques or the estimation of thermal parameters in hot spots through Dozier's method (Dozier, 1981; Matson and Dozier, 1981). These estimations assume that the pixels of the bands involved correspond to the same spatial target and contribute with the same sensitivity to the radiance measurement. However, even with perfect co-registration between bands, this assumption would not hold, since each band has a different PSF. This is one of the problems cited by Wooster *et al*. (2005) when proposing a single-channel method to estimate the fire temperature instead of a bi-spectral method. In addition, the PSF has been identified as responsible for the differences in Fire Radiative Power (FRP) found when different sensors are compared. Concerning geostationary satellites, MSG is providing operational results in fire detection and biomass burning in Africa (Wooster *et al*., 2005) and Mediterranean countries (Calle *et al.,* 2006), and the Geostationary Operational Environmental Satellites (GOES) are used operationally in South, Central and North America (Prins and Menzel, 1994). Fire detection must therefore be understood in the framework of global geostationary fire monitoring applications, which requires evaluating the impact of the MTF shape on the estimation of thermal parameters.

In order to estimate the impact of the PSF shape on detection suitability, it is interesting to analyse a sensor with low spatial resolution, such as the SEVIRI sensor on board the MSG satellite (Calle *et al*., 2009). Pixels affected by fire show a typical cross shape when the fire is detected, owing to PSF effects and the overlap between pixels. Figure 4 shows three-dimensional graphs in which the brightness temperature in the 3.9 µm band (vertical axis) is plotted against the fire temperature (left part of the figure) or the background temperature (right part of the figure) and the distance from the pixel centre (PSF impact). The background temperature is 300 K in the left graph, the fire temperature is 500 K in the right graph, and the one-dimensional burning size is 50 m in both cases. The saturation plane is shown in the figures. Note that for low fire temperatures (below 450 K, taking into account that we are dealing with mixed flaming and smouldering phases), the PSF impact is not noticeable. However, large differences in brightness temperature are found for hotter fires. To explain the importance of a 10 K difference in the 3.9 µm band, note that if a contextual detection algorithm is applied, the detection will be lost when the standard deviation of the brightness temperature around the pixel is higher than 3 K.

Fig. 4. Left part: brightness temperature in the 3.9 µm band (vertical axis) *versus* fire temperature and distance from pixel centre (PSF impact); a background temperature of 300 K is considered. Right part: brightness temperature in the 3.9 µm spectral band (vertical axis) *versus* background temperature and distance from pixel centre (PSF impact); a fire temperature of 500 K is considered. One-dimensional burning size of 50 m in both cases (Calle *et al*., 2009).
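The qualitative behaviour shown in Figure 4 can be approximated with a one-dimensional sketch in which the pixel radiance is the PSF-weighted mixture of a narrow burning strip and the background. The Gaussian PSF and its 4.8 km full width at half maximum are assumptions made only for illustration; they stand in for the real SEVIRI MTF used by the authors, and the channel is treated as monochromatic at 3.9 µm with no atmosphere. All function names are hypothetical.

```python
import math

C1 = 1.191042e8    # 2*h*c^2 in W m^-2 sr^-1 um^4
C2 = 1.4387769e4   # h*c/k in um K
WAVELENGTH_UM = 3.9


def planck(temp_k, wavelength_um=WAVELENGTH_UM):
    """Blackbody spectral radiance in W m^-2 sr^-1 um^-1."""
    return C1 / (wavelength_um ** 5 * (math.exp(C2 / (wavelength_um * temp_k)) - 1.0))


def brightness_temperature(radiance, wavelength_um=WAVELENGTH_UM):
    """Inverse of the Planck function at a single wavelength."""
    return C2 / (wavelength_um * math.log(1.0 + C1 / (wavelength_um ** 5 * radiance)))


def psf_weight(distance_km, width_km, fwhm_km):
    """Integral of a unit-area Gaussian PSF over a strip centred at distance_km."""
    sigma = fwhm_km / (2.0 * math.sqrt(2.0 * math.log(2.0)))

    def cdf(x):
        return 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))

    return cdf(distance_km + width_km / 2.0) - cdf(distance_km - width_km / 2.0)


def pixel_brightness_temperature(distance_km, t_fire, t_background,
                                 width_km=0.05, fwhm_km=4.8):
    """Brightness temperature of a pixel whose PSF sees a burning strip
    of width_km located distance_km away from the pixel centre."""
    weight = psf_weight(distance_km, width_km, fwhm_km)
    radiance = weight * planck(t_fire) + (1.0 - weight) * planck(t_background)
    return brightness_temperature(radiance)


if __name__ == "__main__":
    for distance in (0.0, 1.0, 2.0, 3.0):
        bt = pixel_brightness_temperature(distance, 500.0, 300.0)
        print(f"fire {distance:.0f} km from pixel centre: {bt:.1f} K")
```

In this simplified model the same 50 m fire produces a lower brightness temperature the farther it lies from the pixel centre, so neighbouring pixels that share the fire through their overlapping PSFs raise the local standard deviation used by contextual algorithms.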

#### **4. Fire monitoring**

The concept of detection is very clear, but the concept of monitoring is less so. It could be said that monitoring comprises all the aspects related to the knowledge of a fire while it is taking place. Thus, we can talk of the fire temperature, the active area, the fire's energy intensity and the fire front. However, all these parameters are subject to the technical possibilities of the space-borne sensor used, especially its spatial resolution in the thermal spectrum. The main problem with monitoring tasks lies in the need for the time resolution typical of geostationary satellites in order to know not just the instantaneous value of the above-mentioned parameters, but also their evolution throughout the fire's development. However, these sensors are currently very far from providing detailed results. Next, we will see the type of information we can obtain according to the capabilities of the different sensors.

To determine the fire parameters, we first need an analysis at sub-pixel level through the application of Dozier's methodology (Dozier, 1981). This methodology allows us to establish simultaneously both the fire temperature and the fraction of the pixel area that is burning. The procedure can be applied to any sensor and is based on the solution of the following system of equations: given a pixel affected by a fire at temperature $T_f$ that occupies a fraction $p$ of the pixel and is surrounded by a surface at temperature $T_{surf}$, the radiances detected in the MIR and TIR bands are given by the expressions:

$$\begin{cases} L_{\text{MIR}} = p \, B\left(\lambda_{\text{MIR}}, T_f\right) + (1 - p) \, B\left(\lambda_{\text{MIR}}, T_{\text{surf}}\right) \\ L_{\text{TIR}} = p \, B\left(\lambda_{\text{TIR}}, T_f\right) + (1 - p) \, B\left(\lambda_{\text{TIR}}, T_{\text{surf}}\right) \end{cases} \tag{3}$$

where $L_{\text{MIR}}$ and $L_{\text{TIR}}$ are the radiances observed by the sensor in the spectral regions of 3.7 µm and 11 µm respectively, and $B(\lambda, T)$ is the Planck function. Solving this system of equations provides the fire temperature and the fraction of the pixel that is burning.
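A minimal numerical sketch of how system (3) can be solved is given below. It eliminates $p$ between the two equations and finds the fire temperature by bisection, which is one possible numerical approach and not necessarily the one used by the authors. The sketch assumes monochromatic channels at 3.7 µm and 11 µm, blackbody fire and surface, no atmosphere and no solar term; the function names, the bisection bounds and the synthetic pixel in the usage lines are hypothetical.

```python
import math

C1 = 1.191042e8    # 2*h*c^2 in W m^-2 sr^-1 um^4
C2 = 1.4387769e4   # h*c/k in um K


def planck(wavelength_um, temp_k):
    """Blackbody spectral radiance in W m^-2 sr^-1 um^-1."""
    return C1 / (wavelength_um ** 5 * (math.exp(C2 / (wavelength_um * temp_k)) - 1.0))


def simulate_pixel(t_fire, p, t_surf, lam_mir=3.7, lam_tir=11.0):
    """Forward model of system (3): mixed radiances in the MIR and TIR bands."""
    l_mir = p * planck(lam_mir, t_fire) + (1.0 - p) * planck(lam_mir, t_surf)
    l_tir = p * planck(lam_tir, t_fire) + (1.0 - p) * planck(lam_tir, t_surf)
    return l_mir, l_tir


def dozier_solve(l_mir, l_tir, t_surf, lam_mir=3.7, lam_tir=11.0,
                 t_max=2000.0, tol=1e-6):
    """Return (fire temperature, burning fraction) for a two-component pixel."""
    b_mir_s = planck(lam_mir, t_surf)
    b_tir_s = planck(lam_tir, t_surf)
    excess_mir = l_mir - b_mir_s
    excess_tir = l_tir - b_tir_s

    def residual(t_fire):
        # Zero when both bands imply the same burning fraction p.
        return (excess_mir * (planck(lam_tir, t_fire) - b_tir_s)
                - excess_tir * (planck(lam_mir, t_fire) - b_mir_s))

    low, high = t_surf + 1.0, t_max
    if residual(low) * residual(high) > 0.0:
        raise ValueError("no fire temperature found between the search bounds")
    while high - low > tol:
        mid = 0.5 * (low + high)
        if residual(low) * residual(mid) <= 0.0:
            high = mid
        else:
            low = mid
    t_fire = 0.5 * (low + high)
    p = excess_mir / (planck(lam_mir, t_fire) - b_mir_s)
    return t_fire, p


if __name__ == "__main__":
    # Synthetic pixel: 800 K fire covering 0.1% of a 300 K surface.
    observed = simulate_pixel(800.0, 1e-3, 300.0)
    print(dozier_solve(observed[0], observed[1], 300.0))
```

The retrieval needs the surface temperature as an input, so any error in that value propagates directly into the recovered fire temperature and fraction.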

Before analysing some of the approximations made in this methodology, we must point out two very important restrictions on its operational use. First, the equations assume that the observed radiance is purely thermal emission. This holds for the 11 µm region, but the radiance observed in the MIR region also has a reflected component, which has been analysed in the false alarms section. For this reason, applying these equations to daytime images requires an additional solar term; otherwise, they are only valid for night-time images. Second, in order to obtain reliable results, saturation must be avoided as much as possible.

Dozier's system of equations is very simple to understand, although many of its approximations are not realistic and should be analysed. First, the observed pixel is divided into two parts, fire and surface, which are considered homogeneous; this is not the case, especially because of the heterogeneity of the surface. Second, atmospheric effects are neglected in this scheme. The most serious approximation in terms of error magnitude is probably the choice of the surface temperature value. Dozier suggested using the mean value of the pixels surrounding the fire but not affected by it. It must be highlighted that the results depend to a great extent on this parameter: simulations carried out on a real fire while varying the surface temperature value (Calle *et al*., 2005) show that an error in the surface temperature propagates to the fire temperature multiplied by a factor of 10. Finally, the equations do not include emissivity. Although it is true that a fire behaves very much like a blackbody, the same does not apply to the unaffected surface, whose emissivity is variable. Neglecting emissivity is justified by the fact that the zones observed for fire purposes are always forest zones, where emissivity values lie in the interval [0.983-0.995] for the TIR band. A more realistic scheme derived from Dozier's methodology is the one suggested by Giglio & Kendall (2001), which modifies the former by including emissivity, atmospheric effects and solar reflection in the radiance equation of the MIR band. The modified Dozier equations are as follows:

$$\begin{cases} L_{\text{MIR}} = \tau_{\text{MIR}} \, p \, B\left(\lambda_{\text{MIR}}, T_f\right) + (1 - p) L_{\text{surf,MIR}} + p L_{\text{atm,MIR}} \\ L_{\text{TIR}} = \tau_{\text{TIR}} \, p \, B\left(\lambda_{\text{TIR}}, T_f\right) + (1 - p) L_{\text{surf,TIR}} + p L_{\text{atm,TIR}} & 0 < p < 1 \end{cases} \tag{4}$$

where $L_{\text{atm,MIR}}$ and $L_{\text{atm,TIR}}$ are the radiances emitted by the atmosphere towards the sensor in the MIR and TIR bands respectively. These terms are negligible compared with the radiances emitted by the surface, $L_{\text{surf,MIR}}$ and $L_{\text{surf,TIR}}$, and can be disregarded; $\tau$ is the spectral transmittance of the atmosphere. The difference with respect to the original equations lies in the use of the radiances of the surrounding pixels instead of the temperature; although the surface emissivity and temperature are thereby taken into account, they are not usually known explicitly. The techniques mentioned for obtaining fire parameters involve some difficulties related to the errors incurred. First, the equations are not analytic, so their solution must be found by numerical techniques; it must be said, however, that the solution ultimately comes from a convergent system. Other important sources of error originate in different magnitudes that have been analysed by Giglio & Kendall (2001) and are summarised next.

First, one source of error in the results is the error in the surface radiance introduced in the equations. The values for the fire temperature and size are more sensitive to errors in the 11.0 µm radiance than in the 3.7 µm radiance. At low fire temperatures this error is not large but, at high fire temperatures, it increases noticeably both in the fraction of the pixel affected and in the fire temperature itself. Another source of error is the atmospheric transmittance. In this case, however, the errors in the temperature and in the fraction of the pixel affected compensate between the MIR and TIR bands as long as the transmittance is either underestimated or overestimated in both bands; otherwise, the errors add up. Thus, an overestimation of the MIR transmittance overestimates the calculated temperature, whereas an overestimation of the TIR transmittance produces the opposite effect. A third source of error is the instrument's noise, although in this case the error introduced is accidental (random). Finally, the omission of the atmospheric radiance that reaches the sensor is less important than the previous causes: in no case does the resulting error exceed 1.5 K in temperature or 2% in area.

A very interesting aspect of the theory is the fire's emissivity. A fire has always been considered a blackbody. Strictly speaking, this is only true when the length of the flame seen from the sensor is larger than 6 metres (Langaas, 1995). For smaller fires this assumption should be reconsidered and the fire treated as a grey body. In such cases, assuming an emissivity of one results in an underestimation of both the fire temperature and the fire area, and these errors are independent of the fraction of the pixel affected and of the fire temperature.

In spite of all the methodology developed, it is important to point out that a forest fire is, in reality, a very complex phenomenon in which different processes overlap. In this situation, we could ask ourselves what exactly the parameter we call "fire temperature" is, and what the "burning area" is. The model presented is a simplification of the real phenomenon since, up to now, only two phases have been differentiated: the flame and the surface. In reality, at least an intermediate phase corresponding to smouldering should be considered. However, introducing further terms into the model would require more spectral bands, in order to obtain more equations and solve for all the unknown quantities. We are going to consider this aspect so as to reach some conclusions regarding the appropriate spectral information. Kaufman *et al.* (1998) introduced a modification of Dozier's methodology in order to include the flame phase, which is hotter, and the smouldering phase, which is intermediate between the surface and the flame. Thus, if we call $p_f$ and $p_s$ the fractions of the pixel corresponding to flame and smouldering respectively, the bi-spectral equations are as follows:

$$\begin{cases} L_{\text{MIR}} = \tau_{\text{MIR}} \left[ p_f \, B\left(\lambda_{\text{MIR}}, T_f\right) + p_s \, B\left(\lambda_{\text{MIR}}, T_s\right) \right] + (1 - p) L_{\text{surf,MIR}} + p L_{\text{atm,MIR}} \\ L_{\text{TIR}} = \tau_{\text{TIR}} \left[ p_f \, B\left(\lambda_{\text{TIR}}, T_f\right) + p_s \, B\left(\lambda_{\text{TIR}}, T_s\right) \right] + (1 - p) L_{\text{surf,TIR}} + p L_{\text{atm,TIR}} & 0 < p < 1 \end{cases} \tag{5}$$

so that $p_f + p_s = p$ is fulfilled. The analysis and discussion will be carried out through the flaming ratio, $f$, defined as $f = p_f / (p_f + p_s)$. This ratio expresses the importance of the flame phase in the observed fire. Since more observation wavelengths are needed to obtain more detailed information, Giglio & Justice (2003) established the errors found according to the pair of wavelengths used to solve the bi-spectral equations, in order to determine the most appropriate pair for this purpose, always considering the atmospheric windows available for observation. These authors carried out simulations with combinations of wavelengths in the interval [1.6, 3.8 µm] for the MIR region and in the interval [2.4, 11 µm] for the TIR region, so that λTIR was always higher than λMIR. The most relevant conclusions of this analysis were that the shortest pairs of wavelengths provided higher fire temperatures and smaller areas, since a decrease in λ gives more weight to the flame phase whereas an increase in λ gives more weight to the smouldering phase. It is also interesting that the results of the pair [3.8, 11.0 µm] and of the pair [3.8, 8.5 µm] are practically identical, with differences below 5 K for the temperature and 5% for the area. This means that the spectral difference between the AVHRR and MODIS sensors, which correspond to the first pair, and BIRD, which corresponds to the second, is not significant.
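The wavelength dependence described above can be illustrated by computing which share of the radiance emitted by the fire comes from the flame phase at each wavelength. This is a minimal sketch assuming blackbody flame and smouldering components and ignoring the atmosphere and the background; it is not the simulation of Giglio & Justice (2003). The 1000 K and 600 K temperatures, the pixel fractions and the function names are hypothetical.

```python
import math

C1 = 1.191042e8    # 2*h*c^2 in W m^-2 sr^-1 um^4
C2 = 1.4387769e4   # h*c/k in um K


def planck(wavelength_um, temp_k):
    """Blackbody spectral radiance in W m^-2 sr^-1 um^-1."""
    return C1 / (wavelength_um ** 5 * (math.exp(C2 / (wavelength_um * temp_k)) - 1.0))


def flaming_ratio(p_flame, p_smoulder):
    """Flaming ratio f = p_f / (p_f + p_s)."""
    return p_flame / (p_flame + p_smoulder)


def flame_radiance_share(wavelength_um, p_flame, t_flame, p_smoulder, t_smoulder):
    """Fraction of the fire radiance at this wavelength emitted by the flame phase."""
    flame = p_flame * planck(wavelength_um, t_flame)
    smoulder = p_smoulder * planck(wavelength_um, t_smoulder)
    return flame / (flame + smoulder)


if __name__ == "__main__":
    # Hypothetical fire: equal areas of 1000 K flame and 600 K smouldering.
    p_f, p_s = 1e-3, 1e-3
    print(f"flaming ratio f = {flaming_ratio(p_f, p_s):.2f}")
    for wavelength in (1.6, 3.8, 8.5, 11.0):
        share = flame_radiance_share(wavelength, p_f, 1000.0, p_s, 600.0)
        print(f"{wavelength:5.1f} um: flame share of fire radiance = {share:.3f}")
```

For a fixed flaming ratio the flame share grows as the wavelength decreases, which is consistent with shorter wavelength pairs returning higher temperatures and smaller areas.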

MODIS has several bands located in the 4 µm spectral region. One of them (band 21) has a saturation value of 500 K, which makes this sensor especially suitable for estimating fire parameters, since saturated pixels are rarely found. By contrast, this monitoring phase can very rarely be applied to the AVHRR sensor since, although detection is possible, band 3 is very frequently saturated. Likewise, the BIRD prototype is especially suitable for obtaining fire parameters and for estimating the FRP (Fire Radiative Power) (Kaufman and Justice, 1998). By definition, the FRE (Fire Radiative Energy) is the portion of the chemical energy released during the burning of vegetation that is emitted as radiation during the combustion process. These parameters are among the goals of fire analysis, and the FRP is the most important one because it contains information both on the emissions released into the atmosphere by these events (Kaufman *et al*., 1996) and on the fire's destructive power. These authors suggested that quantifying the energy radiated during combustion could supply a measurement related to the quantity of vegetation consumed per unit of time. Consequently, it would provide a measurement of the emissions produced during fires and, therefore, valuable remote-sensing support information for climate change studies. In spite of being a qualitative measurement of great value, we must take into account that combustion is a mixture of physical processes through which the fire's energy is distributed; apart from radiation, these include the convection of the air mass above the fire and conduction towards the interior of the earth.

For the MODIS sensor, Kaufman & Justice (1998) suggested an empirical expression to determine the intensity, in MW, from the brightness temperature of the pixel affected by the fire. This expression is:

$$FRP = 4.34 \cdot 10^{-19} \left( T\_{\rm MIR}^{8} - T\_{{\rm MIR},b}^{8} \right) \tag{6}$$

where T<sub>MIR</sub> is the brightness temperature, in the 4 μm band, of the pixel affected by the fire and T<sub>MIR,b</sub> is the same temperature in the adjacent pixels. In order to validate the results obtained at sub-pixel level, that is, the fire's temperature and area, the intensity values calculated through the Stefan-Boltzmann law were compared with those of the previous formula. This was done exclusively for the MODIS sensor and for the large fires that affected Spain and Portugal during the summer of 2003. The comparison was carried out for the Terra and Aqua spacecraft and at two processing levels: individual burning pixels and averaged clusters. The results can be found in Calle *et al*. (2005), in which the two processing levels are represented separately. The almost exact agreement of the values, which is even better at cluster level, supports the reliability of the retrieved fire temperature and area. It is important to highlight that the agreement between the empirical expression and the Stefan-Boltzmann law (after applying the Dozier methodology) comes from the analysis of clusters. When results are compared at the level of individual pixels, the differences are much more noticeable, so the use of the empirical expression is recommended. When a sensor has high spatial resolution in the thermal bands, sub-pixel analysis is a useful tool to identify the direction in which the fire is advancing, that is, the flaming front. Figure 5 shows the results of applying the sub-pixel analysis to one of the active fires described. It corresponds to the superposition of the fire temperatures from the BIRD sensor over the NIR image, showing the area affected by the fire.
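The two intensity estimates compared in this validation can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the empirical result is taken in the units quoted in the text (MW) without further checking of the unit convention, and the Stefan-Boltzmann form assumes a single fire component with unit emissivity by default.

```python
SIGMA = 5.670374419e-8  # Stefan-Boltzmann constant (W m^-2 K^-4)


def frp_empirical(t_mir_k, t_mir_background_k):
    """Empirical expression of Eq. (6) from MIR brightness temperatures (K)."""
    return 4.34e-19 * (t_mir_k ** 8 - t_mir_background_k ** 8)


def frp_stefan_boltzmann(fire_temp_k, fire_area_m2, emissivity=1.0):
    """Radiated power in MW from a sub-pixel fire temperature (K) and area (m^2).

    Assumption: one fire component and a grey-body emissivity (1.0 by default).
    """
    watts = emissivity * SIGMA * fire_area_m2 * fire_temp_k ** 4
    return watts * 1e-6  # W -> MW


# Hypothetical inputs, for illustration only.
print(frp_empirical(400.0, 300.0))          # fire pixel at 400 K, background at 300 K
print(frp_stefan_boltzmann(1000.0, 1000.0)) # 1000 K fire covering 1000 m^2
```

In practice the fire temperature and area passed to the second function would come from solving the bi-spectral equations, so any disagreement between the two values also reflects the uncertainty of that retrieval.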

Remote sensing will be truly useful for the early detection of fires when the time resolution of the sensors involved is around 15 minutes or better. At present, this is only available from geostationary satellites, which have the drawback of low spatial resolution. The advantage of geostationary sensors is that they make it possible to obtain not only the FRP but also the FRE, which is the time integral of the FRP:

$$FRE = \int FRP \, dt$$

In any case, the comparison of FRP results among sensors is only valid for qualitative purposes, since in certain fires the lower spatial resolution leads to a significant underestimation of this magnitude. This happens, for example, when comparing MSG with MODIS or MODIS with BIRD. For the latter pair, Wooster *et al*. (2003) found differences of up to 46%.
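As a minimal sketch of how the FRE could be accumulated from a geostationary FRP time series, the integral of FRP over time can be approximated with the trapezoidal rule. The function name and the sample values are hypothetical; the 15-minute sampling simply mirrors the time resolution mentioned above.

```python
def fire_radiative_energy(times_s, frp_mw):
    """Trapezoidal approximation of the time integral of FRP.

    times_s: observation times in seconds (increasing)
    frp_mw:  FRP values in MW at those times
    Returns the energy in MJ (MW x s).
    """
    if len(times_s) != len(frp_mw):
        raise ValueError("times_s and frp_mw must have the same length")
    energy = 0.0
    for i in range(1, len(times_s)):
        dt = times_s[i] - times_s[i - 1]
        energy += 0.5 * (frp_mw[i] + frp_mw[i - 1]) * dt
    return energy


# Hypothetical fire observed every 15 minutes (900 s) for one hour.
times = [0, 900, 1800, 2700, 3600]
frp = [0.0, 100.0, 200.0, 100.0, 0.0]
print(fire_radiative_energy(times, frp), "MJ")
```

The trapezoidal rule is only one possible choice; with gaps caused by clouds or missed detections, the result depends on how the missing FRP values are handled.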


Fig. 5. Fire temperatures obtained by means of the Dozier methodology. At this spatial resolution, the flaming front and the spreading direction of the fire can be clearly recognized (Calle *et al.,* 2005).

#### **5. Atmospheric impact of fire emissions**

The carbon-cycle gases CO and CO2 are trace gases present in the atmosphere, mostly as a result of anthropogenic activities. Despite not being a greenhouse gas, carbon monoxide plays a significant role in the carbon cycle; it is not a direct precursor of CO2, but it essentially affects the budgets of the OH radicals and O3 present in the atmosphere (see Bergamaschi *et al*., 2000, for an extended explanation of the modelling of the global CO cycle). The anthropogenic activities that release carbon into the atmosphere can be divided into two well-defined groups: on the one hand, urban pollutant emissions from vehicles and industrial processes; on the other, fires and global biomass burning. The estimation of CO profiles and CO total column has been identified as a very important objective for improving our understanding of the global climate system. The EOS (Earth Observing System) Science Steering Committee stated: "The fate of carbon monoxide, remotely detected from space, in conjunction with a few other critical meteorological and chemical parameters, is crucial to our understanding of the chemical reaction sequences that occur in the entire troposphere and govern most of the biogeochemical trace gases" (EOS, 1987). Along the same lines, the WMO (World Meteorological Organization) stated: "Definition of trends and distributions for troposphere CO is essential. A satellite-borne CO sensor operating for extended periods could help enormously" (WMO, 1985). The global estimation of CO from satellite imagery involves a series of technical difficulties, the most important being the error associated with the measurements.

Combustion is an exothermic chemical reaction whose main products, if combustion is complete, are H2O, CO2 and N2. At high combustion temperatures, NO2 and NO are released too. However, the main cause of fire-related CO emissions is the incomplete or inefficient burning of wood, biomass and fossil fuels. For wildfires, two phases are considered: the flaming phase (in which CO2 and nitrogen are released) and the smouldering phase (in which CO and hydrocarbons are released). CO emissions can be estimated through two procedures, one direct and one indirect. The indirect method estimates the CO mass from knowledge of the biomass previously burned. This value can be obtained from satellite cartography of fire-affected areas and the vegetation index, which is the main indicator of biomass quantity on a global scale. The estimate is adjusted by introducing the combustion efficiency coefficients for this particular gas. This procedure was first proposed by Seiler and Crutzen (1980), who estimated CO emissions from the following indirect parameters: i) the burned land cover area (m<sup>2</sup>); ii) the above-ground biomass density of the burned area (kg dry matter/m<sup>2</sup>); iii) the burning efficiency of the above-ground biomass, that is, the dimensionless fraction of biomass burned; and iv) the emission factor (g of CO per kg of dry matter), which varies with the type of vegetation and ecosystem. This indirect procedure is subject to considerable error because of the uncertainty in the coefficients and, especially, in the biomass estimation, which is the main quantitative parameter.
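The indirect estimate can be written as the product of the four parameters listed above. The following sketch assumes that simple product form; the function name and all input values are hypothetical and serve only to show how the units combine.

```python
def co_emission_grams(burned_area_m2, biomass_density_kg_m2,
                      burning_efficiency, emission_factor_g_per_kg):
    """CO mass (g) = area x biomass density x burning efficiency x emission factor."""
    if not 0.0 <= burning_efficiency <= 1.0:
        raise ValueError("burning_efficiency is a fraction between 0 and 1")
    burned_dry_matter_kg = burned_area_m2 * biomass_density_kg_m2 * burning_efficiency
    return burned_dry_matter_kg * emission_factor_g_per_kg


# Hypothetical fire: 1 km^2 burned, 2 kg dry matter per m^2,
# half of the biomass consumed, 65 g of CO per kg of dry matter.
grams = co_emission_grams(1.0e6, 2.0, 0.5, 65.0)
print(grams / 1.0e6, "tonnes of CO")
```

Because the result is a plain product, a relative error in any one factor, for example the biomass density, carries over directly to the emission estimate.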

The second procedure is the direct estimation of carbon content in the atmosphere by means of remote sensing. SCIAMACHY (SCanning Imaging Absorption SpectroMeter for Atmospheric CHartographY), onboard the European satellite ENVISAT (Bovensmann *et al*., 1999), has provided more measurements of the most important trace gases than any other sensor to date. The CO total column is retrieved from a small spectral fitting window located in SCIAMACHY channel 8 (2.324-2.335 μm); the results are then adjusted according to the parameters of the trace gas. Dils *et al*. (2006) carried out a series of comparisons between SCIAMACHY measurements and ground-station data. For CO and CH4, retrieved with similar algorithms, they showed that the measurements describe the seasonal and latitudinal variability well. However, they found important discrepancies in specific cases, as well as long periods in which the algorithm does not provide any data. The MOPITT (Measurements of Pollution in the Troposphere) instrument, onboard the Terra spacecraft, has proved to be the most operational sensor for the continuous estimation of CO. Scientists from NCAR (National Center for Atmospheric Research), funded by NASA, have distributed data and results on the global distribution of CO based on MOPITT measurements (http://www.acd.ucar.edu/), which have revealed the seasonal dynamics of CO across the planet and direct correlations between large fires and increases in the CO total column measured by MOPITT. The validation of the results shows that MOPITT's spatial scale is suitable for continuously monitoring, at regional and global scales, the spatial variations of atmospheric CO. Large horizontal gradients in the distribution of CO at the synoptic scale have thus been observed. These variations in CO can be as large as 50–100% and occur over spatial scales of around 100 km. These events, which usually last several days, can span horizontal distances of 600-1000 km and can appear over a range of pressure levels from 850 to 150 hPa (Liu *et al*., 2006).


Biomass burning is a very important source of ozone and methane precursors and the main factor in CO emissions. High levels of carbon monoxide pollution are found around the world, resulting from different types of biomass burning in different locations. High levels of CO are linked to widespread fire activity, such as agricultural burning in central Africa from January through March, or in Central America from April through June. Carbon monoxide molecules can last from a few weeks to several months in the atmosphere and travel long distances, regardless of national boundaries. Emissions from biomass burning account for about one quarter of the CO released to the atmosphere, with an average of around 600 Mt CO per year (Khalil *et al.,* 1999). The occurrence of biomass burning, the size of the fires, the fire phases considered (e.g. smouldering and flaming) and the fire parameters (e.g. fire radiative power and temperature) vary greatly in time and space. Andreae and Merlet (2001) estimated that the mean CO emission from vegetation fires in savanna and tropical forests is 342 Mt CO per year, while the total CO emission for all non-tropical forest fires is 68 Mt CO per year.

The pattern of fire occurrence in Africa and Amazonia is quite different from that of other regions of the planet with higher population density. Fire occurrence in Africa and Amazonia is governed by the displacement of the ITCZ (Inter Tropical Convergence Zone). During the Northern Hemisphere winter the ITCZ, and therefore the tropical rain, is located south of the equator and over Amazonia, so fire occurrence is stronger north of the equator, and vice versa. Figure 6 shows the results of the seasonal study of CO in North equatorial Africa ([4.5N-15N] and [17W-37E]), South equatorial Africa ([22S-3S] and [10E-40E]) and Amazonia ([20S-7.5S] and [65W-50W]). The bar diagram shows fire occurrence from MODIS (Giglio *et al.,* 2003; Davies *et al*., 2009) in the period 2003-2008. In the background, in grey, the original CO total column data from MOPITT are shown. In black, the inverse Fast Fourier Transform computed from the main harmonics, those with the highest spectral energy, is shown. Finally, monthly averaged CO values from SCIAMACHY are displayed for the period 2003-2005, in order to compare the results of the MOPITT and SCIAMACHY sensors. MOPITT and SCIAMACHY CO have been compared by Buchwitz *et al.* (2007), who showed results over cities; this comparison over large fires is therefore a complementary result that helps to assess the spatial capabilities of these data sources.
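The idea of rebuilding a seasonal signal from its most energetic harmonics can be sketched as below. This is not the processing chain used for Figure 6: it uses a plain discrete Fourier transform written out in pure Python instead of an FFT library, the function names are hypothetical, and the input is a synthetic monthly series rather than MOPITT data.

```python
import cmath
import math


def dft(series):
    """Plain discrete Fourier transform; O(n^2), adequate for short monthly series."""
    n = len(series)
    return [
        sum(series[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def reconstruct_from_main_harmonics(series, n_harmonics):
    """Inverse DFT keeping the mean and the n_harmonics most energetic harmonics."""
    n = len(series)
    spectrum = dft(series)
    # Rank the positive-frequency harmonics by spectral energy |X_k|^2.
    ranked = sorted(range(1, n // 2 + 1),
                    key=lambda k: abs(spectrum[k]) ** 2, reverse=True)
    keep = {0}  # index 0 carries the mean of the series
    for k in ranked[:n_harmonics]:
        keep.add(k)
        keep.add((n - k) % n)  # conjugate partner, so the result stays real
    filtered = [spectrum[k] if k in keep else 0j for k in range(n)]
    return [
        (sum(filtered[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]


# Synthetic 3-year monthly series (arbitrary units):
# annual cycle + semi-annual cycle + a small alternating jitter.
months = range(36)
clean = [2.0 + 0.5 * math.cos(2 * math.pi * t / 12) + 0.2 * math.cos(2 * math.pi * t / 6)
         for t in months]
noisy = [value + 0.01 * (-1) ** t for t, value in zip(months, clean)]
smooth = reconstruct_from_main_harmonics(noisy, n_harmonics=2)
print(max(abs(s - c) for s, c in zip(smooth, clean)))
```

Keeping two harmonics here recovers the annual and semi-annual cycles and discards the jitter; with real data the number of harmonics retained is a judgement call.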

Concerning the results over Africa north of the equator, two main harmonics with maximum spectral energy can be observed for each year. The first maximum is located in January and February and shows a very good correlation with fire occurrence. The second, weaker maximum is located in August, exactly when fire occurrence in South equatorial Africa is strongest. Regarding the comparison between MOPITT (daily data) and SCIAMACHY, the CO total columns are similar, with the MOPITT data showing a larger amplitude. This is an expected difference, since the SCIAMACHY data are averaged values. As underlined in the previous paragraph, during the Northern Hemisphere summer the displacement of the ITCZ is responsible for stronger fire occurrence south of the equator. There, three main maxima can be observed for each year. The first maximum, shaped as a sharp peak, is located at the end of September and shows a very good correlation with the main fire occurrence of the year. The second, weaker maximum is located at the end of January, when fire occurrence in North equatorial Africa is strongest (see the discussion above). The main difference with respect to North equatorial Africa is the presence of a third harmonic, which produces an increasing trend in CO values during June-September. As can be observed in Figure 6, both geographical bands present a correlation between CO values and fire occurrence. However, the CO maxima lag the maximum fire occurrence by 15-20 days; additionally, the local maximum of the North band coincides with the main maximum of the South band, which indicates a mutual influence due to CO transport processes in the atmosphere. In any case, the influence of the North on the South is stronger. As in the North band, the MOPITT (daily data) and SCIAMACHY CO total columns are very similar. The pattern of fire occurrence in Amazonia is the same as in South equatorial Africa, due to the behaviour of the ITCZ.

Fig. 6. Results of the seasonal study of CO in Africa (north and south of the equator) and Amazonia. A comparison between CO emissions and fire occurrence is shown. The original CO total column values and the inverse FFT are highlighted. The left part of each graph contains the XCO2 evolution for 2003-05 (Calle *et al*., 2011).

### **6. Conclusion**


In the light of these results, geostationary sensors prove to be a highly efficient tool for real-time forest fire management and monitoring. Although these satellites were originally designed for meteorology rather than Earth observation, their excellent time resolution has proved useful for detecting events that are characterized by radiometric rather than spatial changes, as is the case of forest fires. The ongoing parameterization of fires has a strong influence on the subsequent treatment of forest regeneration. Major efforts are currently being made to establish fire severity, where the main magnitude involved is the FRE of large fires, used to determine intensity and to include this magnitude in atmospheric emission models. This correlation between FRE and severity was not possible with polar sensors due to their lack of continuous observation. Another important magnitude that can be established from the FRE, taking into account some characteristics of the fuel, is the flame height, which could help in the analysis of the fire front and other magnitudes linked to its advance. This is an essential magnitude, since it is used by firefighting services to determine the infrastructure needed to combat fires. Both the EOS Science Steering Committee and the WMO have identified the measurement and monitoring of carbon monoxide as a main objective within the framework of monitoring the trace gases involved in the carbon cycle. Forest fires are an important source of CO and CO2 worldwide. However, global estimates have so far been based on indirect methods, which require prior determination of the burned areas and the introduction of burning efficiency coefficients that are difficult to determine. For the direct estimation of emissions, atmospheric sensors such as MOPITT and SCIAMACHY have proven their ability to support important conclusions about carbon cycle gases at global scale.

### **7. References**


Bergamaschi, P., Hein, R., Heimann, M. and Crutzen, P. J. (2000). Inverse modelling of the global CO cycle, 1. Inversion of CO mixing ratios. *Journal of Geophysical Research*, 105:1909–1927. ISSN: 0148-0227

Berk, A., Bernstein, L.W. and Robertson, D.C. (1996). MODTRAN: A moderate resolution model for LOWTRAN 7, Philips Laboratory, Report AFGL-TR-83-0187, Hanscom AFB, MA.

Bovensmann, H., Burrows, J. P., Buchwitz, M., Frerick, J., Nöel, S., Rozanov, V. V., Chance, K. V. and Goede, A. (1999). SCIAMACHY: Mission Objectives and Measurement Modes. *Journal of Atmospheric Sciences*, 56:127–150. ISSN: 0022-4928

Briess, K., Jahn, H., Lorenz, E., Oertel, D., Skrbek, W. & Zhukov, B. (2003). Fire recognition potential of the bi-spectral detection (BIRD) satellite. *International Journal of Remote Sensing*, 24, 865-872. ISSN: 0143-1161

Buchwitz, M., Khlystova, I., Bovensmann, H. and Burrows, J.P. (2007). Three years of global carbon monoxide from SCIAMACHY: comparison with MOPITT and first results related to the detection of enhanced CO over cities. *Atmospheric Chemistry and Physics*, 7:2399–2411. ISSN: 1680-7316

Calle, A., Romo, A., Sanz, J. & Casanova, J.L. (2005). Analysis of forest fire parametres using BIRD, MODIS and MSG-SEVIRI sensors. *New Strategies for European Remote Sensing*, Millpress, Rotterdam, ISBN 90 5966 003.

Calle, A., Casanova, J.L. and Romo, A. (2006). Fire detection and monitoring using MSG Spinning Enhanced Visible and Infrared Imager (SEVIRI) data. *Journal of Geophysical Research*, 111, G04S06, doi:10.1029/2005JG000116.

Calle, A., Casanova, J.L. and Romo, A. (2009). Impact of point spread function of MSG-SEVIRI on active fire detection. *International Journal of Remote Sensing*, 30(17), 4567–4579. ISSN: 0143-1161

Calle, A., Salvador, P. and González, F. (2011). Study of the impact of wildfires emissions, through MOPITT CO total column, at different spatial scales. *International Journal of Remote Sensing* (in press). ISSN: 0143-1161

Casanova, J.L., Calle, A. and González-Alonso, F. (1998). A Forest Fire Risk Assessment obtained in real time by means of NOAA satellite images. *Forest Fire Research. III. International Conference on Forest Fire Research and 14th Conference on Fire and Forest Meteorology*. Vol I: 1169-1179. ISBN: 972-97973-0-7

Chuvieco, E., Riaño, D., Aguado, I. and Cocero, D. (2002). Estimation of fuel moisture content from multitemporal analysis of Landsat Thematic Mapper reflectance data: applications in fire danger assessment. *International Journal of Remote Sensing*, 23 (11):2145-2162. ISSN: 0143-1161

Cracknell, A.P. (1998). Review article synergy in remote sensing-what's in a pixel? *International Journal of Remote Sensing*, 19, 2025-2047. ISSN: 0143-1161

Davies, D.K., Ilavajhala, S., Wong, M.M. and Justice, C.O. (2009). Fire Information for Resource Management System: Archiving and Distributing MODIS Active Fire Data. *IEEE Trans. on Geoscience and Remote Sensing*, 47 (1):72-79. ISSN: 0196-2892

Dils, B., et al. (2006). Comparisons between SCIAMACHY and ground-based FTIR data for total columns of CO, CH4, CO2 and N2O. *Atmospheric Chemistry and Physics*, 6:1953–1976. ISSN: 1680-7316

Dozier, J. (1981). A method for satellite identification of surface temperature fields of subpixel resolution. *Remote Sensing of Environment*, 11: 221-229. ISSN: 0034-4257


Matson, M. and Dozier, J. (1981). Identification of sub-resolution high temperatures sources using a thermal IR sensor. *Photogrametric Engineering and Remote Sensing*, 47(9), 1311-1318. ISSN: 0099-1112

Nemani, R.R. and Running, S.W. (1989). Estimation of regional surface resistance to evapotranspiration from NDVI and thermal IR AVHRR data. *Journal of Applied Meteorology*, 28 (4): 276-284. ISSN: 0894-8763

Price, J. C. (1984). Land surface temperature measurements from the split window channels of the NOAA 7 AVHRR. *Journal of Geophysical Research*, D5:7231-7237. ISSN: 0148-0227

Prins, E.M. and Menzel, W.P. (1994). Trends in South American burning detected with the GOES VAS from 1983-1991. *Journal of Geophysical Research*, 99 (D8), 16719-16735. ISSN: 0148-0227

Prins, E., Govaerts, Y. and Justice, C.O. (2004). Report on the Joint GOFC/GOLD Fire and CEOS LPV Working Group Workshop on Global Geostationary Fire Monitoring Applications, GOFC/GOLD Report No. 19. 23-25 March 2004. EUMETSAT, Darmstadt, Germany.

Robinson, J.M. (1991). Fire from space: Global fire evaluation using infrared remote sensing. *International Journal of Remote Sensing*, 12: 3-24. ISSN: 0143-1161

Roldán-Zamarrón, A., Merino-de-Miguel, S., González-Alonso, F., García-Gigorro, S. and Cuevas, J. M. (2006). Minas de Riotinto (south Spain) forest fire: Burned area assessment and fire severity mapping using Landsat 5-TM, Envisat-MERIS, and Terra-MODIS postfire images. *Journal of Geophysical Research*, 111, G04S11, doi:10.1029/2005JG000136.

Seiler, W. and Crutzen, P. J. (1980). Estimates of gross and net fluxes of carbon between the biosphere and the atmosphere from biomass burning. *Climatic Change*, 2:207–247. ISSN: 0165-0009

Wooster, M.J., Zhukov, B. & Oertel, D. (2003). Fire radiative energy for quantitative study of biomass burning: derivation from the BIRD experimental satellite and comparison to MODIS fire products. *Remote Sensing of Environment*, 86, 83-107. ISSN: 0034-4257

Wooster, M.J., Roberts, G., Perry, G.L.W. and Kaufman, Y.J. (2005). Retrieval of biomass combustion rates and totals from fire radiative power observations: FRP derivation and calibration relationships between biomass consumption and fire radiative energy release. *Journal of Geophysical Research*, 110, D24311, doi:10.1029/2005JD006318

## **Ocean Reference Stations**

Meghan F. Cronin<sup>1</sup>, Robert A. Weller<sup>2</sup>, Richard S. Lampitt<sup>3</sup> and Uwe Send<sup>4</sup> *<sup>1</sup>NOAA Pacific Marine Environmental Laboratory, Seattle, WA, USA; <sup>2</sup>Woods Hole Oceanographic Institution, Woods Hole, MA, USA; <sup>3</sup>National Oceanography Centre, Southampton, UK; <sup>4</sup>Scripps Institution of Oceanography, University of California, San Diego, La Jolla, CA, USA*

#### **1. Introduction**

OceanSITES is an international network of deep ocean observatories that provide reference time-series for ocean and climate studies. While moorings form the backbone of the network, some stations comprise frequent shipboard observations. With dozens of advanced sensors on these platforms, the time-series are high quality, high resolution (hourly or better in many cases), and long (decades long in some cases). Most stations are interdisciplinary, measuring various aspects of the physical and biogeochemical environment from the sea floor to the atmosphere. All data are made publicly available, in a common format, many in near-real time. In this chapter we describe the motivation for, and the requirements and challenges of, this network. Because the network includes more than 105 stations, for practical reasons our overview of individual stations will focus on the subset of stations that make data available in near-real time. Our goal here is to provide an introduction to the network, together with information and links that will help the reader explore the network further.

#### **2. Water world**

We live on a water world. Over 70% of the Earth's surface is covered by oceans. On the remaining 30%, human population is not distributed evenly, but instead is most dense in coastal regions. The oceans can affect climate and weather by absorbing, transporting, and emitting heat and gases such as carbon dioxide (Figure 1). Without the poleward heat transport by the ocean currents, the tropics would tend to steadily warm, while the poles would steadily cool. In the high latitudes, heat loss and ice formation generate very dense water at the surface that sinks to the interior and bottom of the ocean, driving the global thermohaline circulation. The oceans also absorb CO2, thus reducing the effects of anthropogenic climate change.

Fig. 1. Mean net surface heat flux in watts per m² based on ECMWF 40-year reanalyses (ERA40) (top) and net surface CO2 flux in grams of carbon per m² per year, based on the Takahashi et al. (2009) air-sea CO2 flux climatology (bottom). A positive flux indicates a flux from the ocean to the atmosphere. Note that the color scale is inverted for the CO2 flux. At the equator, heat enters the ocean through the surface and CO2 outgasses. White contours indicate mean dynamic sea level height (Rio & Hernandez, 2004).

Because of the high heat capacity of water, the ocean temperature has much less variability than the atmosphere, particularly the atmosphere over land. While surface air temperature at a given location over land can have a range of up to 90°C, air temperature at a given location over open ocean generally has a range of less than 20°C, and the overall ocean temperature range is roughly 30°C (Figure 2). As such, the oceans typically have a moderating effect on weather and climate. Indeed, because 2.5 m of water has the same heat capacity per unit area as the whole height of the atmosphere, relatively small changes in the sea surface temperature distribution can have a significant influence on the atmosphere above, particularly in warm water regions such as the tropics, as shown in Figure 3 and discussed below. Approximately 41% of rainfall over land is of maritime origin (Oki & Kanae, 2006). Evaporation, which provides this precipitable water, is strongly dependent upon temperature.
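The 2.5 m equivalence quoted above can be checked with a back-of-the-envelope calculation. The sketch below is illustrative only: the constants are typical textbook values chosen here as assumptions (they are not taken from this chapter), and the function names are hypothetical.

```python
# Rough check: what depth of seawater has the same heat capacity per unit
# area as the full atmospheric column above it?
# All constants are assumed, typical textbook values.
P_SURFACE = 101325.0    # mean sea-level pressure, Pa
G = 9.81                # gravitational acceleration, m s^-2
CP_AIR = 1004.0         # specific heat of dry air at constant pressure, J kg^-1 K^-1
RHO_SEAWATER = 1025.0   # seawater density, kg m^-3
CP_SEAWATER = 3990.0    # specific heat of seawater, J kg^-1 K^-1


def atmosphere_heat_capacity_per_area():
    """Heat capacity of the whole atmospheric column, J m^-2 K^-1."""
    mass_per_area = P_SURFACE / G  # hydrostatic column mass, kg m^-2
    return mass_per_area * CP_AIR


def equivalent_water_depth():
    """Depth of seawater (m) with the same heat capacity per unit area."""
    return atmosphere_heat_capacity_per_area() / (RHO_SEAWATER * CP_SEAWATER)


if __name__ == "__main__":
    print(f"Equivalent seawater depth: {equivalent_water_depth():.1f} m")
```

With these assumed constants the result comes out at roughly 2.5 m, consistent with the figure quoted in the text; different choices of density or specific heat shift the answer by a few percent.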


Significantly more moisture is evaporated where the surface water is warm, fueling deep convection and precipitation (Figure 3). Small shifts in the location and temperature of very warm water can thus cause shifts in the atmospheric convection and weather patterns, both locally and globally (Ding et al., 2011; Wallace & Gutzler, 1981).

Fig. 2. Air temperature range based upon daily averaged ERA40 values in units degrees Celsius.

Because of the Earth's rotation, the direct ocean response to wind forcing is an upper ocean transport that is to the right of the wind in the Northern Hemisphere (NH) and to the left of the wind in the Southern Hemisphere (SH). The easterly trade winds and westerly jet streams, and the placement of the continents, thus tend to cause convergence and divergence patterns that result in higher sea level in the subtropics and lower sea level in the subpolar regions (Figure 1). To a certain extent, the contours of sea level height can be considered as streamlines of the surface flow. Water that would tend to flow downhill is deflected to the right in the NH and to the left in the SH, so that the adjusted flow is along the contours of anomalous sea level height. Consequently, the trade winds and jet streams result in an anticyclonic subtropical gyre in each of the ocean basins. The NH westerly jet stream also supports a cyclonic subpolar gyre in the North Pacific and North Atlantic, while in the SH, the jet stream drives an eastward flowing Antarctic Circumpolar Current. Directly at the equator, the axis of rotation is perpendicular to the vertical axis (gravity), making vertical motion near the equator much more dynamic both in the atmosphere and ocean. This, together with the effects of the warm water on precipitable water, causes the ocean and atmosphere to be much more coupled in the tropics than elsewhere. Changes in the ocean surface temperature can result in changes in the atmospheric deep convection and winds, which can in turn affect the ocean temperature structure.
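The right-of-wind and left-of-wind response described above can be illustrated with the classical steady-state Ekman balance, in which the depth-integrated mass transport is the wind stress rotated by 90° and divided by the Coriolis parameter. This is a minimal sketch under that idealized assumption; the chapter itself does not give the formula, and the function names and the near-equator cutoff are hypothetical choices.

```python
import math

OMEGA = 7.2921e-5  # Earth's rotation rate, rad s^-1


def coriolis_parameter(lat_deg):
    """Coriolis parameter f = 2 * Omega * sin(latitude), in s^-1."""
    return 2.0 * OMEGA * math.sin(math.radians(lat_deg))


def ekman_mass_transport(tau_x, tau_y, lat_deg):
    """Steady Ekman mass transport (M_x, M_y) in kg m^-1 s^-1.

    tau_x, tau_y: eastward and northward wind stress, N m^-2.
    Assumes the idealized balance M_x = tau_y / f, M_y = -tau_x / f,
    which breaks down near the equator where f approaches zero.
    """
    f = coriolis_parameter(lat_deg)
    if abs(f) < 1.0e-5:  # assumed cutoff, roughly within 4 degrees of the equator
        raise ValueError("Ekman balance is not valid this close to the equator")
    return tau_y / f, -tau_x / f


if __name__ == "__main__":
    # Eastward (westerly) wind stress of 0.1 N m^-2 in each hemisphere.
    for lat in (30.0, -30.0):
        m_x, m_y = ekman_mass_transport(0.1, 0.0, lat)
        direction = "southward" if m_y < 0 else "northward"
        print(f"lat {lat:+.0f}: transport is {direction}")
```

For an eastward wind stress, the transport is southward in the NH (to the right of the wind) and northward in the SH (to the left), which is the behavior the text describes.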

While the ocean drift in most parts of these gyres is slow (~25 cm/s), in some parts, such as the western boundary currents and the circumpolar current, speeds can be up to 2 m/s near the surface and 25 cm/s at depth, corresponding to a transport of order 100 × 10⁶ m³/s. While the analogy has its limits, these ocean currents can be considered as a conveyor belt, carrying heat, salt, and marine ecosystems. Warm currents carry heat poleward, and return currents and the deep thermohaline circulation carry cool water equatorward, resulting in a large-scale meridional overturning circulation.

Fig. 3. Mean surface temperature from ERA40 in units °C (only values greater than 0 are shown), and precipitation from the Global Precipitation Climatology Project in units mm per day (contoured).

Marine life, which lives within this dynamic environment, can be quite sensitive to the ocean temperature. Many animals reproduce, feed, or migrate only within a limited temperature range. Temperature also affects the buoyancy of the water, which can trap nutrients and dissolved inorganic carbon within the euphotic zone where photosynthesis and primary production occur. During blooms, CO2 is used in the production of both organic and inorganic biogenic particles, a portion of which sink into the deeper ocean and are regenerated into CO2 through respiration and dissolution. This export of CO2 is referred to as the "biological pump" of the carbon cycle. During respiration, oxygen is depleted. Anoxic water, devoid of dissolved O2, is generally barren of macroscopic life. Temperature also affects the solubility of dissolved gases and thus the concentrations of dissolved O2 and CO2: as the surface water cools, it can hold and absorb more CO2. Thus as the surface water cools and sinks, atmospheric CO2 is absorbed into the water and exported into the deep ocean, a process referred to as the "solubility pump" of the carbon cycle.

The distribution of CO2 within the ocean is also critical to the pH of the water and the concentration of carbonate ions, which are a basic building block of skeletons and shells for many marine organisms, including corals, shellfish, and marine plankton (Feely et al., 2004). As the ocean absorbs more anthropogenic CO2, the CO2 reacts with the seawater to form carbonic acid (H2CO3). This then dissociates to form a bicarbonate ion (HCO₃⁻) and a hydrogen ion (H⁺), which can react with carbonate ions (CO₃²⁻) to form bicarbonate (HCO₃⁻). The net effect of the increased CO2 is thus a decrease in pH and a decrease in the carbonate ion concentration, a process referred to as ocean acidification (Feely et al., 2010). The reduction in carbonate ion affects the saturation state of calcium carbonate (CaCO3) and is critically important as it directly affects the ability of some CaCO3-secreting organisms to produce their shells or skeletons. When pteropods were exposed to undersaturated water, their CaCO3 shells showed notable dissolution (Orr et al., 2005).
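The reaction sequence described in words above can be written out as follows. The three steps are those named in the text; the net reaction and the definition of the saturation state Ω are the standard textbook forms, added here as a reading aid rather than quoted from the chapter. The symbol K'_sp (the stoichiometric solubility product of the CaCO3 mineral phase) is notation introduced for this sketch.

```latex
\begin{align}
\mathrm{CO_2(aq) + H_2O} &\rightleftharpoons \mathrm{H_2CO_3} \\
\mathrm{H_2CO_3} &\rightleftharpoons \mathrm{H^+ + HCO_3^-} \\
\mathrm{H^+ + CO_3^{2-}} &\rightleftharpoons \mathrm{HCO_3^-} \\[4pt]
\text{net:}\quad \mathrm{CO_2(aq) + H_2O + CO_3^{2-}} &\rightleftharpoons \mathrm{2\,HCO_3^-} \\[4pt]
\Omega &= \frac{[\mathrm{Ca^{2+}}]\,[\mathrm{CO_3^{2-}}]}{K'_{sp}}
\end{align}
```

Water with Ω < 1 is undersaturated with respect to CaCO3, which is the condition under which shell dissolution is expected; the net reaction shows why added CO2 lowers the carbonate ion concentration and hence Ω.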


Like global weather maps of wind, barometric pressure, and atmospheric humidity and temperature properties, global maps of the ocean circulation, sea level, and temperature and salinity properties are needed to visualize, quantify, and understand the ocean physical variability (Cazenave et al., 2010; Schmitt et al., 2010; Talley et al., 2010; Wijffels et al., 2010). For understanding and quantifying the ocean and atmosphere interactions, maps of the air-sea fluxes of heat, moisture, momentum, and gases are needed (Fairall et al., 2010; Gulev et al., 2010). Likewise, for understanding and quantifying changes to the carbon cycle, maps of the atmospheric and seawater pCO2, dissolved O2 concentration, pH, and nutrients are needed (Gruber et al., 2010). Monitoring and predicting O2 concentration levels is critical for assessing the effects of the biological pump both on the carbon cycle and the ecosystem. Monitoring and mapping changes in ocean acidification is likewise critical for understanding the biological impacts of increased anthropogenic CO2 (Feely et al., 2010).

To make these maps, satellites, ships, floats, drifters, and moored buoys gather data that are routinely ingested into numerical models (Eyre et al., 2010). However, across the broad ocean, as compared to the land, observations are sparse. To validate and assess these modeled fields, as well as to assess satellite remotely sensed fields, in situ observations are needed as reference data (Send et al., 2010). Reference data are also needed for evaluating the processes and mechanisms that affect the ocean environment and ecosystems, and for developing parameterizations of processes that cannot be fully resolved within the numerical models (Lampitt et al., 2010a).

#### **3. Reference ocean data**

#### **3.1 Requirements**

Reference data, by definition, must be high quality, with quantified uncertainty that is small relative to the signal that is being measured. Uncertainties are determined by the measurement resolution, by investigation of sensor performance in the field, and through calibrations that are traceable to a standard at national metrology institutes such as the U.S. National Institute of Standards and Technology. Measurement resolution is determined by the sensor's precision and sampling frequency. If the sampling frequency is lower than the Nyquist rate of the signal (twice the highest frequency present), errors can arise due to aliasing. In particular, biases can result if the sampling frequency is identical to the signal frequency. For example, in regions such as the tropics where the diurnal cycle is large, surface temperature measurements will be biased high if the samples are taken only during the daytime. Likewise, in regions where the annual cycle is large, measurements may be biased high if they are sampled only during the summer.
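The daytime-sampling bias described above can be reproduced with a toy signal. The sketch below assumes a purely sinusoidal diurnal cycle in surface temperature with a 0.5°C amplitude peaking at 15:00 local time; these numbers, and the function names, are illustrative assumptions and not values from any OceanSITES station.

```python
import math


def synthetic_sst(hour, mean=28.0, amplitude=0.5, peak_hour=15.0):
    """Idealized surface temperature (deg C) with a 24-hour sinusoidal cycle."""
    return mean + amplitude * math.cos(2.0 * math.pi * (hour - peak_hour) / 24.0)


def mean_of(samples):
    return sum(samples) / len(samples)


def sampling_bias(sample_hours, n_days=10):
    """Mean of samples taken at the given local hours each day,
    minus the mean of the fully resolved hourly record (deg C)."""
    full_record = [synthetic_sst(h) for h in range(24 * n_days)]
    sampled = [synthetic_sst(24 * day + h)
               for day in range(n_days) for h in sample_hours]
    return mean_of(sampled) - mean_of(full_record)


if __name__ == "__main__":
    print("all hours      :", round(sampling_bias(range(24)), 3))
    print("daytime only   :", round(sampling_bias(range(6, 18)), 3))
    print("once a day, 15h:", round(sampling_bias([15]), 3))
```

Sampling every hour gives no bias, daytime-only sampling gives a warm bias, and sampling once per day at the time of the peak (sampling frequency equal to the signal frequency) turns the whole diurnal amplitude into a constant offset.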

Figure 4 shows the spectral bandwidth of various oceanographic signals that have periodicities that range from order of seconds (surface waves), to minutes (internal waves in very high stratification), hours (diurnal cycle, tides, inertial oscillations, internal waves), days (storm forced variability, hydrodynamic instabilities, mesoscale eddies), months (planetary waves, seasonal cycle), years (El Niño, gyre-scale variability), decades (gyre-scale, meridional overturning), and longer (anthropogenic forcing). For some processes that depend upon variables in a nonlinear way, variability in these parameters at one scale may affect variability in the process at another scale. Turbulence generally causes a cascade of energy from low to high frequency. However, high-frequency variability can also, in some cases, rectify into the longer scales. For example, since the efficiency of surface forcing depends upon the stratification, large diurnal co-variations in the forcing and stratification in the tropics can rectify and impact intraseasonal and longer timescales, thus affecting the coupled ocean-atmosphere interactions (Bernie et al., 2007; Guemas et al., 2011; Shinoda, 2005).

Fig. 4. Time and space scales of ocean variability (courtesy D. Chelton, Oregon State University, after Dickey (2001)).

#### **3.2 A network of open ocean reference stations**

For resolving high-frequency variability, moorings are the ideal platform, as the resolution of the moored sensors is generally limited only by constraints on the battery life and duration objectives. With the mooring refreshed at regular intervals (generally 6–12 months), these stations can provide long-term, high-resolution, accurate time-series. Moorings thus form the backbone of the global network of OceanSITES reference stations (Lampitt et al., 2010a; Send et al., 2010). The OceanSITES network, shown in Figure 5, is an element of the Global Ocean Observing System, which is a system within the Global Earth Observation System of Systems (GEOSS).

Fig. 5. OceanSITES network of reference stations, as of 2010 (figure courtesy http://www.oceansites.org/network/). Stations with near-real-time data are shown as green circles. Observatories without data telemetry are shown as blue squares. Transport stations are shown as small green squares and regular transport transects are shown as green lines. Planned stations are shown as orange diamonds; discontinued stations and transects are indicated in red.

At present, the OceanSITES network is a collection of stations operated by scientists throughout the world, supported through their national agencies, who agree to the basic requirements of data quality and open data with common formats. The vision is that the OceanSITES network would be interdisciplinary: "a worldwide system of deepwater reference stations: providing high resolution measurements, the full depth of the ocean, multi-year time scales, dozens of variables, real-time access." Indeed the OceanSITES acronym stands for OCEAN Sustained Interdisciplinary Timeseries Environment observation System. In practice, however, not every station monitors the full suite of physical and biogeochemical variables that characterize the local ocean environment. Within the array, moored buoys that carry meteorological sensors to characterize the exchanges or fluxes of heat, momentum, freshwater, and gases (e.g., carbon dioxide) across the air-sea interface are referred to as air-sea "flux" stations. These moorings also generally carry sensors on their anchor line to monitor the physical and sometimes biogeochemical environment in the upper ocean. Other moorings and frequently visited stations, referred to as "observatories," have as their primary objective monitoring the biogeochemical properties within much of the water column. Finally, the purpose of the "transport" stations is to monitor the ocean currents and transport.

#### **3.3 Data latency**

While some of the mooring stations have surface buoys that allow telemetry of near-real-time data, other mooring stations are entirely subsurface and must be recovered to obtain the data. This can introduce a delay in the data availability of more than a year. With telemetered data, analyses can begin almost immediately, thus accelerating the research. Telemetry also acts as important insurance on the data. If the mooring is lost, the telemetered data may be the only source of the data. Having telemetry also allows the operators to identify and address problems in current and future deployments, thus minimizing data gaps. Finally, in some cases, the telemetered near-real-time data are assimilated into short-term weather forecasts, for which every hour of latency costs an hour of forecast lead time.

Due to the sparse nature of oceanographic data, there is often a desire to assimilate all data possible, including reference data. Model operators often argue that an individual measurement is weighted so that it will not introduce a bullseye pattern in the fields and make the product appear falsely accurate when compared with the reference time-series. Reference data are, however, by definition supposed to be independent of the products that they are used to assess. A World Meteorological Organization (WMO) data identification number containing the digits "84" indicates that the data are reference data. Protocols are being developed to identify when reference data are being assimilated.

The delayed mode data, available after internally recording instruments are recovered and processed, also have unique value. Because of limited bandwidth and technical challenges for telemetry of ocean data, the real-time data are only a subset of the data available on the moorings. Internally recorded data may have sampling intervals of one minute or shorter, whereas only hourly data may have been telemetered. Further, the recovered instruments are post-calibrated; thus, the delayed mode data have less uncertainty associated with their accuracy. In general, the delayed mode data are the highest quality data at a reference station.
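As an illustration of how a telemetered hourly record relates to the internally recorded one, the sketch below reduces a one-minute series to hourly values by block averaging. Whether a given station telemeters hourly averages or spot samples is not specified here, so averaging is an assumption of this example, and `hourly_means` is a hypothetical helper rather than part of any OceanSITES software.

```python
def hourly_means(minute_samples):
    """Average consecutive blocks of 60 one-minute samples.

    A trailing partial hour is dropped so that every returned value
    represents a complete hour of data.
    """
    n_hours = len(minute_samples) // 60
    return [
        sum(minute_samples[i * 60:(i + 1) * 60]) / 60.0
        for i in range(n_hours)
    ]


if __name__ == "__main__":
    # Two hours of synthetic one-minute data (values 0, 1, ..., 119).
    record = list(range(120))
    print(hourly_means(record))
```

The hourly series keeps the slow variability but discards everything faster than an hour, which is why the recovered high-rate record remains valuable after the telemetered data have been used.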

#### **4. The OceanSITES network of reference moorings**

In the following section we provide an overview of individual stations within the OceanSITES network, focusing on stations that telemeter data to shore. These include all of the air-sea flux stations, many of which also serve as biogeochemical observatories or are coordinated with nearby observatory and transport stations. A few subsurface observatories also have a small surface buoy used exclusively for telemetry. Because of the complexity of the network it is not possible to describe the network in its entirety. Our purpose here is to provide an introduction and information for further exploration of the network. As a start, the reader is directed to the OceanSITES network website: http://www.oceansites.org.

At roughly \$30,000-50,000 per day, shiptime is a significant component of the overall cost of the deep ocean mooring array. These costs and the limited number of global-class research vessels have necessitated efficient use of the fleet. For example, mooring maintenance cruises are often used for long-term coordinated observations. Likewise, while the stations themselves carry a suite of sensors for monitoring multiple variables, the stations also offer opportunities for other coordinated observations. Nearly all stations have been sites of process studies, involving multiple platforms (ships, extra moorings, drifters, floats, aircraft, etc.). In the following overview, we describe some of these activities, although a full list is not feasible.

#### **4.1 Tropics**


#### **4.1.1 The Global Tropical Moored Buoy Array (GTMBA)**

The network has its densest coverage in the tropics (Figure 6). The tropical moored buoy array began in the eastern equatorial Pacific in the early 1980s and expanded to cover 8°S–8°N across the Pacific, with moorings and shiptime at present provided by the US National Oceanic and Atmospheric Administration (NOAA) and the Japan Agency for Marine-Earth Science and Technology (JAMSTEC). The NOAA portion of the Pacific array is referred to as the Tropical Atmosphere and Ocean (TAO) array and the JAMSTEC portion is referred to as the Triangle Trans-Ocean Buoy Network (TRITON) array. As discussed below, the primary purpose of the TAO/TRITON array is to observe, better understand, and predict the El Niño-Southern Oscillation (ENSO). In 1997, the array expanded into the Atlantic with moorings from NOAA and shiptime provided by Brazil, France, and the US. The primary purpose of the Atlantic array, referred to as the Prediction and Research Moored Array in the Atlantic (PIRATA), is to observe, better understand, and predict seasonal, interannual, and longer variability, including both ENSO-like and meridional modes of variability. In 2000, the array expanded into the Indian Ocean, with moorings provided by the US, Japan, India, and China, and shiptime provided by India, Indonesia, France, Japan, and the Agulhas Somali Current Large Marine Ecosystems project. The primary purpose of the Indian Ocean array, referred to as the Research Moored Array for African-Australian Monsoon Analysis and Prediction (RAMA), is to advance monsoon research and forecasting.

Fig. 6. Global Tropical Moored Buoy Array, as of July 2011. Flux reference stations are indicated by a blue square. Courtesy M. McPhaden, NOAA Pacific Marine Environmental Laboratory (PMEL).

Within the tropical Pacific, surface trade winds tend to blow from the cool waters off of South America to the warm waters off of Indonesia, where the wind converges and rises in deep convective clouds. As can be seen in Figure 7, as the warm water shifts eastward, the region of wind convergence and deep convection shifts eastward, resulting in the ENSO cycle, with teleconnections to global weather and climate patterns (Bouma et al., 1997; Diaz & Markgraf, 2000).

The standard suite of sensors on the tropical buoys includes wind speed and direction, air temperature and humidity, surface salinity, and surface and subsurface temperature. A number of the moorings, however, are enhanced with additional sensors to monitor the air-sea heat, moisture, and carbon dioxide fluxes; upper ocean salinity; and currents. The most heavily instrumented of these sites are designated as flux stations. The entire GTMBA, together with these specialized flux stations, contributes to the OceanSITES network of deep ocean reference stations (Figure 5).

Fig. 7. Zonal wind (left) and upper 300 m heat content (right) time-series along the equatorial Pacific, as measured by the Pacific Tropical Atmosphere Ocean (TAO) / Triangle Trans-Ocean Buoy Network (TRITON) array. This figure was generated using the data display webpage, courtesy of the TAO project office of NOAA PMEL: http://www.pmel.noaa.gov/tao/jsdisplay/.


Through the decades there have been several large international process studies built around the array, including the Coupled Ocean Atmosphere Response Experiment in the western tropical Pacific in 1992–1993 (Webster & Lukas, 1992), the Eastern Pacific Investigation of Climate in 2001 (Cronin et al., 2002; Raymond et al., 2004), and the GasEx 2001 study of physical, chemical, and biological factors controlling pCO2 fluxes in the eastern equatorial Pacific (Sabine et al., 2004). In the Atlantic, the African Monsoon Multidisciplinary Analyses (AMMA) took place in 2005–2007 (Lebel et al., 2011). The Cooperative INDian Ocean experiment on intraseasonal variability in the Year 2011 / Dynamics of the Madden-Julian Oscillation (CINDY/DYNAMO) experiment in the Indian Ocean is planned for 2011. Maintenance cruises for the array have also been opportunities for ship-based ancillary projects, including regular hydrographic and Acoustic Doppler Current Profiler (ADCP) sections (Johnson et al., 2002); water sample (Behrenfeld et al., 2006) and atmospheric boundary layer measurements (Fairall et al., 2008); underway surface pCO2 (Feely et al., 2006) and chlorophyll fluorescence (Behrenfeld et al., 2006); and regular deployments of Argo floats (Roemmich et al., 2009) and surface drifters (Lumpkin & Pazos, 2007), among other activities. For more information on the Pacific TAO/TRITON array, see McPhaden et al. (1998); for the Atlantic PIRATA array, see Bourlès et al. (2008); and for the Indian Ocean RAMA array, see McPhaden et al. (2009). Data and information can be accessed through the GTMBA project website: http://www.pmel.noaa.gov/tao/global/global.html.

#### **4.1.2 Stratus reference station mooring west of Chile**

The Stratus reference station mooring, located west of Chile at 20°S, 85°W in 4450 m depth water, was initiated in 2000. During the 1990s it became clear that nearly all coupled general circulation models had significant biases in the tropical Pacific that impeded their ability to properly reproduce the ENSO variability (Mechoso et al., 1995). In particular, nearly all models had too warm SST and too little stratus cloud in the eastern boundary region just west of Chile. As a result, these models tended to produce convective rainfall north and south of the equator, rather than just north of the equator as shown in Figure 3. The air-sea fluxes as well as the dynamics of the ocean and atmosphere in this data-sparse region were poorly known, and any further progress required new data from the region. Thus, in 2000, with support from NOAA, a reference surface mooring, referred to as the "Stratus" mooring, was deployed at 20°S, 85°W. The mooring provides quality surface meteorology and air-sea fluxes of heat, freshwater, momentum, and CO2. Annual cruises to maintain the buoy have provided opportunities for intensive ship-based measurements, particularly of the atmospheric boundary layer (Bretherton et al., 2004). In 2008, the international process study VAMOS Ocean-Cloud-Atmosphere-Land Study Regional Experiment (VOCALS-REx) (Wood et al., 2011) was anchored by the Stratus mooring. For more information on the Stratus reference mooring, see Colbo & Weller (2007). The project website can be found at: http://uop.whoi.edu/projects/Stratus/stratus.html.

#### **4.1.3 Northwest Tropical Atlantic Station (NTAS)**

The NTAS surface mooring was established in 4700 m depth water near 15°N, 51°W to investigate surface forcing and oceanographic response in a region of the tropical Atlantic with strong sea surface temperature (SST) anomalies and the likelihood of energetic local air–sea interaction on interannual to decadal timescales. Two modes of coupled air-sea variability are found in the tropical Atlantic, a dynamic mode similar to the Pacific ENSO and a thermodynamic mode characterized by changes in the cross-equatorial SST gradient. Forcing for these modes may be by synoptic atmospheric variability, remote forcing from ENSO, and extratropical forcing from the North Atlantic Oscillation (NAO). Relationships between tropical SST variability, the NAO, and the meridional overturning circulation, as well as between the two tropical modes, are poorly understood.

The NTAS site is co-located with the easternmost subsurface mooring of the Meridional Overturning Variability Experiment (MOVE) "transport" array, which monitors the deep southward branch of the North Atlantic meridional overturning circulation west of the Mid-Atlantic Ridge. Annual cruises to NTAS are shared with MOVE. Funding for NTAS and MOVE is primarily from NOAA. For more information see Kanzow et al. (2008). The NTAS and MOVE project websites can be found at: http://uop.whoi.edu/projects/NTAS/ntas.html, and http://mooring.ucsd.edu/index.html?/projects/move/move\_results.html.

#### **4.1.4 Tropical Eastern North Atlantic Time-Series Observatory (TENATSO) and Cape Verde Atmospheric Observatory (CVAO)**

CVAO meteorological and atmospheric chemistry measurements and TENATSO-moored physical and biogeochemical measurements in the tropical eastern North Atlantic were initiated in 2006. In 2008, routine ship visits to TENATSO were initiated to collect physical and biogeochemical measurements. CVAO is located on a small Cape Verde island (Sao Vicente) at 16.8°N, 24.9°W, while the TENATSO ocean station is located in 3600 m depth water ~93 km north of the island at 17.6°N, 24.2°W. Like other tropical stations, this is a region of intense air-sea interaction. Being downwind of the Mauritanian upwelling, the ocean and atmospheric data can be used to link biological productivity and atmospheric composition. The location is critical for climate and greenhouse gas studies and for investigating dust impacts on marine ecosystems. CVAO atmospheric reference data contribute to the Global Atmosphere Watch (GAW) program of the WMO, and TENATSO is part of the EuroSITES network (http://www.eurosites.info/), which contributes to the global OceanSITES network. CVAO and TENATSO are funded by Germany, the UK, and the EU. For more information, see Read et al. (2008). CVAO and TENATSO websites can be found at: http://ncasweb.leeds.ac.uk/capeverde/, http://tenatso.ifm-geomar.de/, and http://www.eurosites.info/tenatso.php.

#### **4.2 North Pacific**

#### **4.2.1 Kuroshio Extension observatories and JAMSTEC biogeochemical observatories K2 and S1**

The NOAA Kuroshio Extension Observatory (KEO) surface mooring is located south of the Kuroshio Extension jet at 32.3°N, 144.5°E in 5700 m depth water, and the JAMSTEC KEO (JKEO) surface mooring is located north of the jet at 38°N, 146.5°E in 5400 m depth water. KEO was initiated in 2004 during the Kuroshio Extension System Study (KESS) (Donohue et al., 2008) and JKEO was initiated in 2007. Both KEO and JKEO monitor the air-sea fluxes of heat, moisture, momentum, and carbon dioxide, as well as the upper ocean temperature, salinity, and near-surface currents in the region of very large ocean heat loss in the western North Pacific (Figure 1). The large heat fluxes occur during winter, when cold, dry continental air blows over the warm ocean current. As can be seen in Figure 1, similar regions of high ocean surface heat loss are seen in all basins (Cronin et al., 2010). This strong oceanic warming of the atmosphere can affect the surface winds, clouds, storm development, and, potentially, the storm track. The large air-sea heat fluxes also can affect the formation of water masses, or mode water. The KEO site is located in the subtropical mode water formation region and the JKEO site is located in the central mode water formation region (Oka et al., 2011a, 2011b). Mode waters are formed and modified at the surface, and, after they subduct beneath the surface layer, they generally preserve these characteristics as they circulate through the ocean (Hanawa & Talley, 2001; Oka & Qiu, 2011).

Beginning in 2011, KEO will be enhanced with additional sensors to monitor ocean acidification and the net biological production of oxygen in the surface waters. The carbon cycle and its biological pump are also being monitored at the JAMSTEC biogeochemical observatories, which include K2 in the western subarctic Pacific at 47°N, 160°E in 5200 m depth water and S1 in the western subtropical gyre at 30°N, 145°E in 5900 m water depth.

K2 was initiated in 2001 and S1 was initiated in 2010. Both the western and eastern regions of the subarctic Pacific are expected to experience significant effects of ocean acidification during the next century from the absorption of anthropogenic CO2 (Orr et al., 2005).

Routine measurements from the JAMSTEC mooring maintenance cruises have included hydrographic, atmospheric profile sounding sections, and underway meteorological and oceanographic measurements, among other activities (Tokinaga et al., 2009). All four stations are visited regularly during JAMSTEC biogeochemical cruises. For more information on the KEO array, see Cronin et al. (2008) and Konda et al. (2010). For Station K2, see Kawakami et al. (2007). Project websites can be found at: http://www.pmel.noaa.gov/keo/, http://www.jamstec.go.jp/iorgc/ocorp/ktsfg/data/jkeo/ and http://www.jamstec.go.jp/res/ress/hondam/index\_e.html.

#### **4.2.2 Hawaii Ocean Time-series (HOT)**

One of the most iconic long time-series is the famous "Keeling" curve showing the increase in atmospheric CO2 observed at Mauna Loa since 1958 (Figure 8). While the atmospheric CO2 has seasonal peak-to-peak variations of ~7 parts per million (ppm), over the past five decades, the CO2 concentration has steadily increased by more than 10 times that amount due to anthropogenic sources.

Fig. 8. Time-series of atmospheric CO2 at Mauna Loa, in parts per million volume (ppmv; red), surface ocean pCO2 (µatm; blue) and surface ocean pH (green) from the Hawaii Ocean Time-series Station ALOHA. Note that the increase in oceanic CO2 over the past 17 years is consistent with the atmospheric increase within the statistical limits of the measurements. From Doney et al. (2009), after Feely et al. (2008).

A corresponding oceanic time-series at an observatory station in 4780 m depth water, 100 km north of the island of Oahu, was initiated in 1988 through the Hawaii Ocean Time-series (HOT) program, funded primarily by the US National Science Foundation (NSF). As shown in Figure 8 (Doney et al., 2009; Feely et al., 2008) the rise in pCO2 is observed in the surface waters, and because it interacts with seawater to form carbonic acid as discussed in Section 2, this rise is also associated with a decrease in the water's pH. Essentially, the absorption of anthropogenic CO2 is causing the waters to become more corrosive.
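For reference, the link between rising pCO2 and falling pH can be summarized with the standard textbook carbonate equilibria. This is general seawater chemistry written in generic notation, not a formula taken from the HOT data set:

$$
\mathrm{CO_2(aq)} + \mathrm{H_2O} \rightleftharpoons \mathrm{H_2CO_3} \rightleftharpoons \mathrm{H^+} + \mathrm{HCO_3^-} \rightleftharpoons 2\,\mathrm{H^+} + \mathrm{CO_3^{2-}}, \qquad \mathrm{pH} = -\log_{10}[\mathrm{H^+}]
$$

Adding CO2 on the left shifts the first equilibria toward the right and raises the hydrogen ion concentration. Because pH is the negative logarithm of that concentration, the pH falls as pCO2 rises.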

As shown in Figure 8, the sea surface CO2 has rapid natural variability due to variations in the ocean temperature, mixing, upwelling, and biological processes. While much of this variability is captured in the monthly cruises to the HOT ALOHA (A Long-term Oligotrophic Habitat Assessment) observatory, there is significant variability at higher frequencies. Thus, in 2004, with funding from NOAA and additional funding from NSF, a surface reference station flux mooring was deployed at the observatory site. The mooring measures surface oceanic and atmospheric pCO2 at three-hourly intervals, and meteorological and other physical measurements even more frequently. For more information on the HOT program, see Karl et al. (2003). The project websites can be found at: http://hahana.soest.hawaii.edu/hot/hot\_jgofs.html and http://uop.whoi.edu/projects/WHOTS/whots.html.
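The contrast between monthly cruises and three-hourly mooring samples can be made concrete with a short calculation. The sketch below is illustrative only: it assumes a 365-day year and cruises spaced exactly 30 days apart, and the function names are hypothetical rather than part of any HOT or WHOTS software. It uses the usual sampling rule of thumb that the shortest period a regularly sampled series can resolve is twice the sampling interval.

```python
from datetime import timedelta


def samples_per_year(interval: timedelta) -> float:
    """Number of samples collected in a 365-day year at a fixed interval."""
    return timedelta(days=365) / interval


def shortest_resolvable_period(interval: timedelta) -> timedelta:
    """Nyquist limit: variability faster than twice the interval is aliased."""
    return 2 * interval


# Three-hourly pCO2 sampling, as stated for the mooring.
mooring_interval = timedelta(hours=3)
# Assumption: "monthly" cruises idealized as exactly 30 days apart.
cruise_interval = timedelta(days=30)

print(samples_per_year(mooring_interval))            # 2920.0
print(round(samples_per_year(cruise_interval), 1))   # 12.2
print(shortest_resolvable_period(mooring_interval))  # 6:00:00
print(shortest_resolvable_period(cruise_interval).days)  # 60
```

Under these assumptions the mooring can resolve variability with periods down to about six hours, such as the diurnal cycle, whereas monthly cruises resolve only periods of about two months or longer.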

#### **4.2.3 Station Papa in the eastern subarctic Pacific**

One of the oldest ocean time-series is Ocean Station Papa, which began in December 1949 as part of the ocean weathership program. Station Papa is located at 50°N, 145°W in the eastern subarctic Pacific in 4260 m depth water. During its first year the site was occupied by a US Coast Guard ship. For the next three decades it was occupied continuously by Canadian ships on 6-week rotations. Meteorological and oceanic measurements taken by these ships were radioed to shore and contributed to the weather forecasts during this period. With the advent of the satellite era in the early 1980s, the Canadian Weathership Program was terminated. The Line-P program funded by Canadian Fisheries and Oceans, however, continued to make shipboard measurements on transects from Victoria, Canada, to Station Papa 3–6 times per year. Standard Line-P measurements include hydrography (Crawford et al., 2007), O2 (Whitney et al., 2007), phytoplankton biomass and nutrient samples (Peña & Varela, 2007), zooplankton net tows (Mackas et al., 2007), chlorophyll, transmissivity, as well as dissolved inorganic carbon and total alkalinity (Wong et al., 2002), among other measurements. The present program samples three times per year, in February, May-June, and August-September.

Through the decades, Station Papa has been the location of numerous process studies, including, among others: the Mixed Layer Experiment in 1977 (Davis et al., 1981), Subarctic Pacific Ecosystem Research in 1984 (Miller, 1993), Storm Transfer and Response Experiment in 1980 and 1981 (Large et al., 1986), Ocean Storms in 1987 (Paduan & Niiler, 1993), and the SOLAS/SERIES iron enrichment experiment in 2003 (Boyd et al., 2004; de Baar et al., 2005). From 2007 to 2009, an NSF-funded Carbon Cycle process study included support for a flux reference station mooring at Station Papa to monitor the carbon cycle and ocean acidification, in addition to the physical and meteorological environment (Emerson et al., 2011). In order to continue the mooring station on an ongoing basis, in 2009, support for the reference station mooring was transferred to NOAA. Shiptime for annual mooring maintenance has been provided by the Canadian Line-P program. The US NSF Ocean Observatories Initiative (OOI) plans to enhance this station in the coming years with additional moorings and sensors to make Station Papa one of its four global nodes. For more information on Station Papa and Line-P, see Freeland (2007) and Peña & Bograd (2007). The project websites can be found at: http://www.pmel.noaa.gov/stnP/ and http://www.pac.dfo-mpo.gc.ca/science/oceanseng.htm.

#### **4.2.4 Monterey Bay Aquarium Research Institute (MBARI) moorings, California Current Ecosystem (CCE) moorings, and the California Oceanic Cooperative Fisheries Investigation (CalCOFI)**

The MBARI moorings, located in the California Current system at 36.7°N, 122°W in 1600 m and 36.7°N, 122.4°W in 1800 m water depth, were first deployed in 1989. The moorings carry physical, meteorological (air-sea flux), and biogeochemical sensors. Ecosystem productivity and the biogeochemical cycling of elements in the California upwelling regions are regulated by physical processes that vary on daily to multidecadal time scales. As with other observatories described here, through these concurrent measurements of physics, chemistry, and biology, changes in biological and chemical fluxes associated with the physical variability can be estimated and used to develop predictive models. These moorings are funded primarily through support from the David and Lucile Packard Foundation, with support for bio-optical measurements from NASA. For more information, see Chavez et al. (1997). The project website can be found at: http://www.mbari.org/oasis/.

With funding from NOAA, two multi-disciplinary moorings, CCE1 and CCE2, are being sustained off Point Conception at 33.5°N, 122.5°W and 34.3°N, 120.8°W in 4000 m and 800 m of water, respectively. CCE1 was initiated in 2008 and CCE2 in 2010; both carry physical, meteorological, biogeochemical, and ecosystem sensors. The moorings contrast the productive upwelling regime near the coast and the open-ocean regime in the center of the southward flowing low-salinity California Current, and are co-located with repeat stations of the CalCOFI shipboard sampling grid, and a glider repeat transect. CCE1 and CCE2 provide real-time data and connectivity to sensors along the mooring wire down to several hundred meters depth, and have spare capacity for adding and telemetering additional community-provided sensors. Ground-truthing for chemical and optical/acoustic ecosystem observations is provided by CalCOFI cruises. The CalCOFI program began in 1949 for the purpose of studying the ecological aspects of the sardine population collapse off California. Cruises were initially monthly; the present sampling consists of quarterly cruises to 75 stations in a 1.9 × 10⁵ km² grid off the coast of Southern California, providing unique long-term time-series at select locations in the southern California Current. For more information on CalCOFI, see Ohman & Venrick (2003). The CCE and CalCOFI project websites can be found at http://mooring.ucsd.edu/cce/.

#### **4.3 North Atlantic**

#### **4.3.1 Bermuda Atlantic Time-series Study (BATS)**

Biweekly ship-based observations at "Hydrostation S", in 3300 m depth water 25 km SE of Bermuda, began in 1954, making this one of the few ocean time-series that exceeds 50 years. In October 1988, monthly cruises were extended to the BATS station located in 4500 m depth water approximately 80 km SE of Bermuda. These monthly (and biweekly during spring bloom periods) BATS cruises had a broader focus on the biogeochemistry and hydrography of the Sargasso Sea ecosystem. The site is located within the North Atlantic subtropical gyre, similar to the HOT location in the center of the North Pacific subtropical gyre. From 1994 through 2007, a surface mooring at this site, referred to as the "Bermuda Testbed Mooring" (BTM), carried a suite of meteorological, physical, and biogeochemical sensors. At present the BATS observations are supported primarily through NSF research grants. Funding cuts to the BTM, however, caused this long, high-resolution time-series to be discontinued in 2007. As discussed later, one of the main challenges of the reference station network is securing sustained funding. For more information on Hydrostation S, see Phillips & Joyce (2007); for BTM and BATS, see Dickey et al. (2001). Project websites can be found at: http://www.bios.edu/research/bats.html and http://www.opl.ucsb.edu/btm.html.

#### **4.3.2 Central Irminger Sea (CIS)**

The CIS observatory, established in 2002, is located about 200 km east of the southern tip of Greenland, at 59.4°N, 39.4°W in a water depth of 2800 m. The instrumentation is optimized for resolving physical and biogeochemical processes in the mixed layer, with sensors that monitor temperature, salinity, currents, nitrate, pCO2, O2, and fluorescence, among other variables. Wintertime surface cooling can be intense and very deep mixed layer depths have been observed, indicating deep water formation. Because weather conditions have been a perpetual challenge, the mooring has a small surface element for real-time data transmission, but does not carry meteorological sensors.

The NSF OOI plans to enhance this station in the coming years with additional moorings and sensors to make the CIS station one of its four global nodes. Currently, CIS is funded by Germany and the EU. For more information, see: http://www.eurosites.info/cis.php.

#### **4.3.3 Porcupine Abyssal Plain (PAP)**

The PAP observatory is located in 4850 m depth water south of the North Atlantic Current, at 49°N, 16.5°W, in a region with a relatively flat seafloor. The mooring, equipped with sediment traps at three depths, was first deployed in 1989 to study and monitor the open ocean water column biogeochemistry, physics, and benthic biology. Capability has steadily increased to include upper ocean biogeochemical variables such as CO2, chlorophyll and nutrients in 2002. In 2009, the station was enhanced to monitor surface meteorology and thus the observatory became an air-sea flux station as well. PAP is located in a region with large ocean absorption of atmospheric CO2. Surface mixed layers are deep during winter, and during springtime the mixed layer becomes shallow, supporting a widespread phytoplankton bloom. PAP observations thus allow monitoring of the carbon cycle from the atmosphere to the abyss and its physical and biological pumps.

PAP is funded primarily by the UK Natural Environment Research Council (NERC) and the EU, and is part of the EuroSITES network. For more information on PAP see Lampitt et al. (2010b). The project webpage can be found at: http://www.noc.soton.ac.uk/pap.

#### **4.3.4 European Station for Time-series in the Ocean Canary Islands (ESTOC)**

The ESTOC observatory, located about 100 km north of the Canary Islands at 29.2°N, 15.5°W in 3610 m depth water, was initiated in 1994 with monthly ship visits to the station that included a sediment trap mooring and nearby subsurface current meter mooring. Since 2002, the station has been occupied by a surface mooring that measures upper-ocean physical and biogeochemical variables and surface meteorology. In 2007, the sediment trap mooring was terminated and in 2008, the surface mooring was upgraded to monitor air-sea fluxes. As it is windward of the Canary Islands, the station avoids wake effects of the Canary Current and northeast trade winds. It is also far enough from coasts and islands to serve as a reference for satellite images and altimetry.

Funding for ESTOC has come from the EU, the German Research Foundation (DFG), and national and regional projects from Spain and the Canary Islands. At present, funding from the governments of Spain and the Canary Islands comes primarily through the Canary Oceanic Platform (PLOCAN; http://plocan.eu). For more information on ESTOC, see Neuer et al. (2007) and González-Dávila et al. (2010). ESTOC is part of the EuroSITES network and its websites can be found at: http://www.eurosites.info/estoc.php and http://www.estoc.es/.

#### **4.4 Mediterranean Sea**

#### **4.4.1 Mediterranean Moored Multi-sensor Array (M3A) network**

The M3A network includes three reference stations which contribute to the EuroSITES and OceanSITES networks: POSEIDON/E1-M3A in the south Aegean Sea at 35.8°N, 24.93°E (initiated in 2000), E2-M3A in the Adriatic Sea at 41.84°N, 17.76°E (initiated in 2004), and the W1-M3A in the Ligurian Sea at 43.81°N, 9.12°E (also initiated in 2004). All three moorings carry suites of sensors to monitor the surface meteorology and air-sea fluxes, directional wave parameters, upper ocean temperature, salinity, currents, and biochemical parameters in the euphotic zone. Biogeochemistry within the Ligurian Sea is also monitored by the DYFAMED station described below. All three M3A stations are in water depth greater than 1200 m. The M3A network is funded by Italy, Greece, and the EU. For more information, see: http://www.eurosites.info/.

#### **4.4.2 Dynamics of the Atmospheric Fluxes in the MEDiterranean (DYFAMED) station in the Ligurian Sea**

The DYFAMED station in the Ligurian Sea at 43.42ºN, 7.87ºE was initiated in 1988 with the deployment of a mooring with sediment traps at 200 m and 1000 m, in water depth of 2350 m. Since 1991, monthly cruises have been performed as well to observe the physical and biogeochemical variability throughout the water column. In 1999, a nearby surface mooring was deployed by Météo-France to monitor the surface meteorology and wave parameters. Ocean physical parameters are also measured at present by sensors mounted on the sediment trap mooring. DYFAMED is currently funded by France and the EU. For more information, see Marty (2002) and the project websites: http://www.obs-vlfr.fr/dyfBase, and http://www.eurosites.info/dyfamed.php.

#### **4.5 Southern Ocean**

#### **4.5.1 Southern Ocean Time-Series (SOTS)**

SOTS (Trull et al., 2010) commenced in 1998 with a sediment trap mooring program (SAZ; Trull et al., 2001) located in the Sub-Antarctic Zone 650 km south of Tasmania at 46.75°S, 142°E, in 4600 m of water. The site was expanded in 2003 with the addition of the Pulse mooring, to understand biogeochemical processes in the surface ocean, and again in 2010 with the addition of the Southern Ocean Flux Station (SOFS; Schulz et al., 2011) climate mooring, autonomous drifting profilers, and gliders. The Southern Ocean "Roaring Forties" is notorious for its storms, waves, and strong currents. Its Circumpolar Current is a route by which water can be carried from the South Atlantic Ocean to the South Indian Ocean and the South Pacific. As waters formed at the surface in the Sub-Antarctic Zone sink and flow under warmer subtropical and tropical waters, they carry CO₂ into the deep ocean, out of contact with the atmosphere. Through this subduction process, oxygen and nutrients are also supplied to deep ocean ecosystems throughout much of the global ocean. Notably, this is the only OceanSITES surface mooring south of the Tropic of Capricorn. SOTS is funded through the Australian Integrated Marine Observing System (IMOS; Hill, 2010; Meyers, 2008). For more information, see: http://imos.org.au/sofs.html.

#### **5. Challenges facing the network**

#### **5.1 Long term commitment**

Obtaining long time-series requires commitment: organizational, institutional, and scientific. Funding organizations that can support a long-term project do not always exist. In many cases, these long time-series are funded through 3–5 year research grants and the time-series is vulnerable to the funding cycle. If the research proposal with a 3-year time horizon is rejected, the long time-series is discontinued, as was the case of the 13-year surface mooring time-series at the BATS observatory discussed above. Likewise, for the very long time-series, the scientists who initiated the time-series may no longer be involved. During the transition in leadership, the institution's interest in the station can play a critical role in the ultimate success of the transition. Ultimately, the value of the station is determined by how the data are used, which depends upon the scientific importance of the station, the suite of measurements and their quality, the data latency and availability, and the ease with which the data can be used (Karl, 2010).

#### **5.2 Public data and common data formats**

OceanSITES has an active data management group that developed a self-documented netCDF (network common data form) format that all station operators agree to use (for more information, see http://www.oceansites.org/data/). All station operators also agree to submit their data in this common format to Data Assembly Centers (DACS) that, in turn, forward data to two Global Data Assembly Centers (GDACS) that mirror each other: one at the NOAA National Data Buoy Center (NDBC) in the US and one at the Institut Français de Recherche pour l'exploitation de la MER (IFREMER) in France. Both GDACS can be accessed through the OceanSITES website provided above.

#### **5.3 Governance**

OceanSITES began as a volunteer group. Recently it has become an Action Group of the Data Buoy Cooperation Panel (DBCP) of the Joint WMO-Intergovernmental Oceanographic Commission (IOC) Technical Commission for Oceanography and Marine Meteorology (JCOMM) (http://www.jcomm.info/index.php?option=com\_content&task=view&id=76&Itemid=76). With support of a technical staffer at JCOMM and staff from the NOAA NDBC in the US and IFREMER in France, OceanSITES has made significant progress on developing governance. The executive committee includes representatives for each ocean, for the physical and biogeochemical communities, and from the data management panel. The OceanSITES data management team includes scientists and technical staff from the various DACS and GDACS. An emphasis has been placed on making all data openly and easily available at no cost to the user.

OceanSITES also has a scientific steering team that includes the principal investigators (station operators) from all OceanSITES reference stations. The scientific steering team is charged with developing and reviewing the network and its data requirements and data management, coordinating the implementation of the network, identifying gaps in the network and synergies with other programs, and ensuring the integration of the network into the overall global ocean observing system. While many of the stations were initiated prior to or independently from the OceanSITES network, by becoming part of the network, the stations can significantly increase their user base and thereby increase the value of the station. Admittance into the network, however, carries responsibility, particularly in terms of providing open and easy access to the data.

#### **5.4 High latitudes**

As can be seen in Figures 5 and 6, most open ocean surface moorings are located in the tropics. While this is in part because the tropical environment is much more benign (it is much easier to maintain a mooring in tropical conditions than in the "Roaring Forties"), the primary reason is that, as discussed earlier, the ocean and atmosphere are highly coupled in the tropics. The tropical oceans can thus have a strong influence on the tropical and global atmosphere. However, higher latitudes are also important to monitor: they are the source regions where various water masses form, they are living environments for important fisheries, they host the CO2 solubility pump, and they drive the downwelling limb of the thermohaline circulation. Furthermore, model studies indicate that ocean acidification will lead to the high latitude surface waters becoming undersaturated with respect to calcium carbonate biominerals (e.g. aragonite, calcite) within a matter of decades (Orr et al., 2005). This would have a detrimental effect on the high latitude ecosystems, and reference stations are needed to quantify these changes.

As we seek to use ocean and coupled ocean-atmosphere models to investigate the ocean's role in climate variability and change, there is great interest in knowing the fluxes across the ocean's surface integrated over its entire surface and in assessing whether the models accurately represent those surface integrals. Yet the high latitudes have few reference stations, and our knowledge of the regional surface meteorology, air-sea exchanges, and physical and biogeochemical dynamics is poor. It is thus a high priority to expand ocean reference stations to the high latitudes.

#### **6. The future**

OceanSITES seeks to encourage the sustained support of ocean reference stations. As discussed above, many of the stations are supported through partnerships that involve multiple scientists, institutions, agencies, and nations. Hope for expansion into the high latitudes is at hand, as is shown by the Australian site south of Tasmania, SOTS, mentioned above. The US NSF OOI is also initiating four deep-ocean, full water column, interdisciplinary reference stations. These stations, referred to as "global nodes," would be located at strategic sites within the three-dimensional circulation of the global oceans, including two currently existing OceanSITES reference stations: Station Papa in the eastern subarctic Pacific, and CIS, in the Irminger Sea southeast of Greenland. The two other global nodes are in the Argentine Basin at 42°S, 42°W, and in the Southern Ocean, southwest of Chile at 55°S, 90°W. Further contributions to the high latitude sites and continued efforts to develop common, multidisciplinary instrumentation to be deployed at each site would complete the global array of ocean reference stations.

#### **7. Conclusion**

We live on a water world. Weather and climate over land cannot be isolated from that over and within the ocean. In order to understand the global heat balance, hydrological cycle, and carbon cycle, it is necessary to observe, understand, and map the physical, chemical, and ecosystem environment with sufficient temporal, horizontal, and vertical resolution. This is the purpose of the Global Ocean Observing System (GOOS), which is a system within the GEOSS. The network of OceanSITES reference stations is an integral part of the GOOS. Data from these reference stations detect rapid changes and episodic events as well as long-term changes. These reference data are made available to the public to further our understanding of our changing world. The data are used to validate and assess satellite products and improve our ability to monitor the globe remotely. Scientific researchers are using these data to study mechanisms controlling the climate and ecosystems and to test and improve numerical models used for predicting future changes. Our ability to plan, adapt, and cope with future changes in weather, climate, and ecosystem depends crucially upon our ability to monitor and predict these changes. Through dedication and commitment, the OceanSITES network provides the baseline data for these efforts.

#### **8. Acknowledgement**

Much of the information in section 4 was provided to the OceanSITES project office by the operators of the stations. The authors thank Hester Viola for compiling this information into white papers that are available on the OceanSITES website. The authors also thank M. McPhaden, L. Carpenter, M. Honda, Y. Kawai, M. Church, B. Crawford, F. Chavez, M. Lomas, J. Karstensen, A. Cianca, V. Cardin, L. Coppola, R. Bozzano, E. Schulz, S. Bigley, K. Ronnholm, and Z. Yu for their helpful feedback on this manuscript.

#### **9. References**


Behrenfeld, M.J.; Worthington, K.; Sherrell, R.M.; Chavez, F.P.; Strutton, P.; McPhaden, M.J. & Shea, D.M. (2006). Controls on tropical Pacific ocean productivity revealed through nutrient stress diagnostics. *Nature*, Vol. 442, pp. 1025-1028

Bernie, D.J.; Guilyardi, E.; Madec, G.; Slingo, J.M. & Woolnough, S.J. (2007). Impact of resolving the diurnal cycle in an ocean-atmosphere GCM. Part 1: A diurnally forced OGCM. *Climate Dynamics*, Vol. 29, pp. 575-590

Dickey, T.; Zedler, S.; Frye, D.; Jannasch, H.; Manov, D.; Sigurdson, D.; McNeil, J.D.; Dobeck, L.; Yu, X.; Gilboy, T.; Bravo, C.; Doney, S.C.; Siegel, D.A. & Nelson, N. (2001). Physical and biogeochemical variability from hours to years at the Bermuda Testbed Mooring site: June 1994–March 1998. *Deep-Sea Research Part II: Topical Studies in Oceanography*, Vol. 48, pp. 2105-2140

Ding, Q.; Wang, B.; Wallace, J.M. & Branstator, G. (2011). Tropical-extratropical teleconnections in boreal summer: Observed interannual variability. *Journal of Climate*, Vol. 24, pp. 1874-1896, doi:10.1175/2011JCLI3621.1

Doney, S.C.; Balch, W.M.; Fabry, V.J. & Feely, R.A. (2009). Ocean acidification: A critical emerging problem for the ocean sciences. *Oceanography*, Vol. 22, No. 4, pp. 16-25

Donohue, K.A. & Co-Authors (2008). Program studies the Kuroshio Extension. *EOS Transactions AGU*, Vol. 89, No. 17, pp. 161-162

Emerson, S.; Sabine, C.; Cronin, M.F.; Feely, R.; Cullison, S. & DeGrandpre, M. (2011). Quantifying the flux of CaCO3 and organic carbon from the surface ocean using in situ measurements of O2, N2, pCO2 and pH. *Global Biogeochemical Cycles*, Vol. 25, GB3008, 12 pp., doi:10.1029/2010GB003924

Eyre, J.; Andersson, E.; Charpentier, E.; Ferranti, L.; Lafeuille, J.; Ondrá, M.; Pailleux, J.; Rabier, F. & Riishojgaard, L. (2010). Requirements of numerical weather prediction for observations of the oceans. *Proceedings of the "OceanObs'09: Sustained Ocean Observations and Information for Society" Conference (Vol. 2)*, Venice, Italy, September 2009, Hall, J.; Harrison, D.E. & Stammer, D. (Eds.), ESA Publication WPP-306, doi:10.5270/OceanObs09.cwp.26

Fairall, C.W.; Uttal, T.; Hazen, D.; Hare, J.; Cronin, M.F.; Bond, N. & Veron, D.E. (2008). Observations of cloud, radiation, and surface forcing in the equatorial eastern Pacific. *Journal of Climate*, Vol. 21, No. 4, pp. 655-673, doi:10.1175/2007JCLI1757.1

Fairall, C.W. & Co-Authors (2010). Observations to quantify air-sea fluxes and their role in climate variability and predictability. *Proceedings of the "OceanObs'09: Sustained Ocean Observations and Information for Society" Conference (Vol. 2)*, Venice, Italy, September 2009, Hall, J.; Harrison, D.E. & Stammer, D. (Eds.), ESA Publication WPP-306, doi:10.5270/OceanObs09.cwp.27

Feely, R.A.; Sabine, C.L.; Lee, K.; Berelson, W.; Kleypas, J.; Fabry, V.J. & Millero, F.J. (2004). Impact of anthropogenic CO2 on the CaCO3 system in the oceans. *Science*, Vol. 305, No. 5682, pp. 362-366

Feely, R.A.; Takahashi, T.; Wanninkhof, R.; McPhaden, M.J.; Cosca, C.E.; Sutherland, S.C. & Carr, M.-E. (2006). Decadal variability of the air-sea CO2 fluxes in the equatorial Pacific Ocean. *Journal of Geophysical Research*, Vol. 111, C08S90, doi:10.1029/2005JC003129

Feely, R.A.; Fabry, V.J. & Guinotte, J.M. (2008). Ocean acidification of the North Pacific Ocean. *PICES Press*, Vol. 16, No. 1, pp. 22-26

Feely, R.; Fabry, V.; Dickson, A.; Gattuso, J.; Bijma, J.; Riebesell, U.; Doney, S.; Turley, C.; Saino, T.; Lee, K.; Anthony, K. & Kleypas, J. (2010). An international observational network for ocean acidification. *Proceedings of the "OceanObs'09: Sustained Ocean Observations and Information for Society" Conference (Vol. 2)*, Venice, Italy, September 2009, Hall, J.; Harrison, D.E. & Stammer, D. (Eds.), ESA Publication WPP-306, doi:10.5270/OceanObs09.cwp.29

Lampitt, R.S.; Billett, D.S.M. & Martin, A.P. (2010b). The sustained observatory over the Porcupine Abyssal Plain (PAP): Insights from time series observations and process studies (preface) [In special issue: Water Column and Seabed Studies at the PAP Sustained Observatory in the Northeast Atlantic]. *Deep Sea Research Part II: Topical Studies in Oceanography*, Vol. 57, No. 15, pp. 1267-1271, doi:10.1016/j.dsr2.2010.01.003

Large, W.G.; McWilliams, J.C. & Niiler, P.P. (1986). Upper ocean thermal response to strong autumnal forcing of the northeast Pacific. *Journal of Physical Oceanography*, Vol. 16, pp. 1524-1550

Lebel, T. & Co-Authors (2011). The AMMA field campaigns: accomplishments and lessons learned. *Atmospheric Science Letters*, Vol. 12, pp. 123-128, doi:10.1002/asl.323

Lumpkin, R. & Pazos, M. (2007). Measuring surface currents with Surface Velocity Program drifters: the instrument, its data, and some recent results. (Chapter 2) In: *Lagrangian Analysis and Prediction of Coastal and Ocean Dynamics*, Griffa, A.; Kirwan, A.D.; Mariano, A.; Özgökmen, T. & Rossby, T. (Eds.), pp. 39-67, Cambridge University Press, ISBN: 978-0-521-87018-4, Cambridge, UK

Mackas, D.L.; Batten, S. & Trudel, M. (2007). Effects on zooplankton of a warmer ocean: Recent evidence from the northeast Pacific. *Progress in Oceanography*, Vol. 75, pp. 223-252

Marty, J.C. (2002). The DYFAMED time-series program (French-JGOFS). *Deep-Sea Research II*, Vol. 49, No. 11, pp. 1963-1964

McPhaden, M.J.; Busalacchi, A.J.; Cheney, R.; Donguy, J.-R.; Gage, K.S.; Halpern, D.; Ji, M.; Julian, P.; Meyers, G.; Mitchum, G.T.; Niiler, P.P.; Picaut, J.; Reynolds, R.W.; Smith, N. & Takeuchi, K. (1998). The Tropical Ocean Global Atmosphere observing system: A decade of progress. *Journal of Geophysical Research*, Vol. 103, No. C7, pp. 14,169–14,240, doi:10.1029/97JC02906

McPhaden, M.J.; Meyers, G.; Ando, K.; Masumoto, Y.; Murty, V.S.N.; Ravichandran, M.; Syamsudin, F.; Vialard, J.; Yu, L. & Yu, W. (2009). RAMA: The Research Moored Array for African-Asian-Australian Monsoon Analysis and Prediction. *Bulletin of the American Meteorological Society*, Vol. 90, pp. 459-480

Mechoso, C.R. & Co-Authors (1995). The seasonal cycle over the tropical Pacific in coupled ocean-atmosphere general circulation models. *Monthly Weather Review*, Vol. 123, pp. 2825-2838

Meyers, G. (2008). The Australian Integrated Marine Observing System. *Journal of Ocean Technology*, Vol. 3, pp. 80-81

Miller, C.B. (1993). Pelagic production processes in the Subarctic Pacific. *Progress in Oceanography*, Vol. 32, pp. 1-15

Neuer, S. & Co-Authors (2007). Biogeochemistry and hydrography in the eastern subtropical North Atlantic gyre. Results from the European time-series station ESTOC. *Progress in Oceanography*, Vol. 72, No. 1, pp. 1-29

Ohman, M.D. & Venrick, E.L. (2003). CalCOFI in a changing ocean. *Oceanography*, Vol. 16, No. 3, pp. 76-85

Oka, E. & Qiu, B. (2011). Progress of North Pacific mode water research in the past decade. *Journal of Oceanography*, doi:10.1007/s10872-011-0032-5

Oka, E.; Suga, T.; Sukigara, C.; Toyama, K.; Shimada, K. & Yoshida, J. (2011a). "Eddy-resolving" observation of the North Pacific subtropical mode water. *Journal of Physical Oceanography*, Vol. 41, pp. 666–681


Schmitt, R. & Co-Authors (2010). Salinity and global water cycle. *Proceedings of the "OceanObs'09: Sustained Ocean Observations and Information for Society" Conference (Vol. 1)*, Venice, Italy, September 2009, Hall, J.; Harrison, D.E. & Stammer, D. (Eds.), ESA Publication WPP-306, doi:10.5270/OceanObs09.pp.34

Takahashi, T. & Co-Authors (2009). Climatological mean and decadal change in surface ocean pCO2 and net sea-air CO2 flux over the global oceans. *Deep-Sea Research Part II: Topical Studies in Oceanography*, Vol. 56, pp. 554-577, doi:10.1016/j.dsr2.2008.12.009

Talley, L.; Fine, R.; Lumpkin, R.; Maximenko, N. & Morrow, R. (2010). Surface ventilation and circulation. *Proceedings of the "OceanObs'09: Sustained Ocean Observations and Information for Society" Conference (Vol. 1)*, Venice, Italy, September 2009, Hall, J.; Harrison, D.E. & Stammer, D. (Eds.), ESA Publication WPP-306, doi:10.5270/OceanObs09.pp.38

Tokinaga, H.; Tanimoto, Y.; Xie, S.-P.; Sampe, T.; Tomita, H. & Ichikawa, H. (2009). Ocean frontal effects on the vertical development of clouds over the western North Pacific: In situ and satellite observations. *Journal of Climate*, Vol. 22, pp. 4241-4260

Trull, T.W.; Sedwick, P.N.; Griffiths, F.B. & Rintoul, S.R. (2001). Introduction to special section: SAZ Project. *Journal of Geophysical Research*, Vol. 106, pp. 31,425–31,430

Trull, T.W.; Schulz, E.W.; Bray, S.G.; Pender, L.; McLaughlan, D. & Tilbrook, B. (2010). The Australian Integrated Marine Observing System Southern Ocean Time Series facility. OCEANS '10 IEEE Sydney Conference Volume, May 2010, 7 pp., doi:10.1109/OCEANSSYD.2010.5603514

Wallace, J.M. & Gutzler, D.S. (1981). Teleconnections in the geopotential height field during the northern hemisphere winter. *Monthly Weather Review*, Vol. 109, pp. 784-812

Webster, P.J. & Lukas, R. (1992). TOGA COARE: The Coupled Ocean-Atmosphere Response Experiment. *Bulletin of the American Meteorological Society*, Vol. 73, pp. 1377-1416

Whitney, F.A.; Freeland, H.J. & Robert, M. (2007). Persistently declining oxygen levels in the interior waters of the eastern subarctic Pacific. *Progress in Oceanography*, Vol. 75, pp. 179-199

Wijffels, S. & Co-Authors (2010). Progress and challenges in monitoring ocean temperature and heat content. *Proceedings of the "OceanObs'09: Sustained Ocean Observations and Information for Society" Conference (Vol. 1)*, Venice, Italy, September 2009, Hall, J.; Harrison, D.E. & Stammer, D. (Eds.), ESA Publication WPP-306, doi:10.5270/OceanObs09.pp.41

Wong, C.S.; Waser, N.A.D.; Whitney, F.A.; Johnson, W.K. & Page, J.S. (2002). Time-series study of biogeochemistry of the North East subarctic Pacific: reconciliation of the Corg/N remineralization and uptake ratios with the Redfield ratios. *Deep-Sea Research Part II: Topical Studies in Oceanography*, Vol. 49, pp. 5717-5738

Wood, R. & Co-Authors (2011). The VAMOS Ocean-Cloud-Atmosphere-Land Study Regional Experiment (VOCALS-REx): goals, platforms, and field operations. *Atmospheric Chemistry and Physics*, Vol. 11, pp. 627-654, doi:10.5194/acp-11-627-2011

### **Current Advances in Uncertainty Estimation of Earth Observation Products of Water Quality**

Mhd. Suhyb Salama

*Department of Water Resources, ITC, University of Twente, Hengelosestraat 99, 7500 AA Enschede, The Netherlands*

#### **1. Introduction**

Remote sensing data over a water body are related to the physical and biological properties of water constituents through inherent optical properties (IOPs). These IOPs characterize the absorption and scattering of the water column and are used as proxies for water quality variables. The scientific procedure to derive IOPs from ship-borne or space-borne remote sensing data can be divided into three steps: (i) *forward modeling*, which relates the radiometric data to the IOPs of the water column; (ii) *parametrization*, which defines the minimal set of IOPs whose values completely characterize the observed radiance; and (iii) *inversion*, which derives the values of the IOPs, and hence water quality variables, from radiometric data.

Reliable methods for uncertainty quantification of earth observation (EO) products of IOPs are important for sensor and algorithm validation, assessment, and operational monitoring. High accuracy in both observations and algorithms can reduce errors considerably. EO-derived IOPs, however, have an inherent stochastic component. This is due to the dynamic nature of aquatic biogeophysical quantities, intrinsic fluctuations, model approximations, correction schemes, and inversion methods. Because of the stochasticity of the measurements, as well as model approximations and inversion ambiguity, the retrieved IOPs are not the only possible set that could have caused the observed spectrum (Sydor et al., 2004). Instead, many other sets of IOPs may be derived. Each of these sets has an unknown probability of being the derived product. The probability distribution of the estimated IOPs therefore provides all the necessary information about the variability and uncertainties of the derived IOPs.

Generally, uncertainty assessment of EO data follows one of two approaches: analytical deterministic methods or stochastic methods. Deterministic methods are based on gradient techniques and have been used to assess the uncertainty of IOPs derived from EO data. Duarte et al. (2003) analyzed the sensitivity of the observed remote sensing reflectance to variable concentrations of water constituents. Maritorena & Siegel (2005) employed a deterministic technique for consistent merging of different products using their uncertainties. Wang et al. (2005) performed a detailed study of the uncertainties of model inversion related to fluctuations in each of the IOPs and their spectral shapes. Salama et al. (2009) studied the uncertainty of model inversion using the gradient-based method and found that the derived IOPs are linearly related to their errors. Lee et al. (2010) used analytical derivatives of the quasi-analytical algorithm (QAA; Lee et al., 2002) to estimate the uncertainty of IOPs derived from the QAA. Salama et al. (2011) developed a gradient-based method to estimate the accuracy of a specific model-parameterization setup. The advantage of their method is that it does not require radiometric information, although this comes at the cost of less detailed information. The main drawback of gradient-based methods is that they depend on the EO model used to derive the IOPs and on *a priori* knowledge of the radiometric uncertainty. Stochastic methods, on the other hand, are less dependent on the EO model used and can deal with non-convex functions. The basic idea of stochastic methods is to systematically partition the region of feasible solutions into smaller subregions and move between them using random search techniques. Stochastic uncertainty techniques have recently been adopted to estimate the uncertainty of EO-derived IOPs. Salama & Stein (2009) proposed a stochastic technique to quantify and separate the sources of error in IOPs derived from EO data. The main objective of this chapter is to review the two families of error-estimation methods and inter-compare their results.
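To make the contrast between the two families concrete, the following minimal sketch propagates an uncertainty through a toy function in two ways: a first-order gradient estimate and plain Monte Carlo sampling. It is an illustration under stated assumptions, not any of the cited methods: the toy function u = bb / (bb + a), the numerical values, and all function names are hypothetical, and simple random sampling stands in here for the stochastic family rather than the partition-and-search techniques described above.

```python
import math
import random

def backscatter_ratio(bb, a):
    """Toy retrieval function: u = bb / (bb + a). Chosen for illustration only."""
    return bb / (bb + a)

def gradient_sigma(bb, a, sigma_bb):
    """Gradient-based (first-order) propagation: sigma_u ~= |du/dbb| * sigma_bb."""
    dudbb = a / (bb + a) ** 2  # analytical derivative of u with respect to bb
    return abs(dudbb) * sigma_bb

def monte_carlo_sigma(bb, a, sigma_bb, n=20000, seed=0):
    """Sampling-based estimate: perturb bb with Gaussian noise and take the
    sample standard deviation of the resulting u values."""
    rng = random.Random(seed)
    samples = [backscatter_ratio(rng.gauss(bb, sigma_bb), a) for _ in range(n)]
    mean = sum(samples) / n
    var = sum((s - mean) ** 2 for s in samples) / (n - 1)
    return math.sqrt(var)

# Hypothetical values (units of 1/m); sigma is 5% of bb0.
bb0, a0, sigma = 0.01, 0.1, 0.0005
sigma_grad = gradient_sigma(bb0, a0, sigma)
sigma_mc = monte_carlo_sigma(bb0, a0, sigma)
print(f"gradient: {sigma_grad:.6f}  Monte Carlo: {sigma_mc:.6f}")
```

For a small input uncertainty the two estimates should agree closely. They are expected to diverge as the perturbation grows or the function becomes strongly nonlinear, which is where the sampling approach no longer relies on the local linearization.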

The remainder of this chapter is organized as follows. Section (2) describes the ocean color paradigm, i.e. the ocean color model used, its parametrization, and its inversion. Deterministic methods for error derivation are described in Section (3), and the principles of stochastic methods are detailed in Section (4). The results of both families (deterministic and stochastic) are inter-compared in Section (5), and Section (6) presents an exercise to decompose the different sources of uncertainty. An error propagation exercise is detailed in Section (7), followed by a discussion of the advantages and limitations of error estimation methods in Section (8). We conclude the chapter with a summary and future developments in Section (9).

#### **2. Ocean color model inversion**

The remote sensing reflectance above the water surface, *Rs*w (the ratio of radiance to irradiance), can be related to the inherent optical properties (IOPs) using the ocean color model of Gordon et al. (1988):

$$\text{Rs}\_{\text{W}}(\lambda) = \frac{t}{n\_{\text{W}}^2} \sum\_{i=1}^{2} g\_i \left( \frac{b\_b(\lambda)}{b\_b(\lambda) + a(\lambda)} \right)^i. \tag{1}$$

Here *Rs*w(*λ*) is the remote sensing reflectance leaving the water surface at wavelength *λ*; *gi* are constants taken from Gordon et al. (1988); *t* and *n*w are the sea-air transmission factor and the water index of refraction, respectively. Their values are taken from the literature (Gordon et al., 1988; Lee, 2006; Maritorena et al., 2002). The parameters *bb*(*λ*) and *a*(*λ*) are the bulk backscattering and absorption coefficients of the water column, respectively. The light field in the water column is assumed to be governed by four optically significant constituents, namely: water molecules, phytoplankton green pigment chlorophyll-a (Chl-a), colored dissolved organic matter (CDOM) and detritus/suspended particulate matter (SPM). The absorption and backscattering coefficients are modeled as the sum of absorption and backscattering from the water constituents:

$$a(\lambda) = a\_{\rm W}(\lambda) + a\_{\rm ph}(\lambda) + a\_{\rm dg}(\lambda) \tag{2}$$

$$b\_b(\lambda) = 0.5b\_{\rm W}(\lambda) + \eta b\_{\rm spm}(\lambda). \tag{3}$$

In equations (2) and (3), the subscripts on the right-hand side denote the water constituents: water *w*; phytoplankton green pigment *ph*; the lumped absorption effects of CDOM and detritus *dg*; and suspended particulate matter *spm*. *η* is the backscattering fraction; its value is estimated from Petzold's "San Diego harbor" scattering phase function as *η* ∼ 0.018 (Petzold, 1977).

The absorption and scattering coefficients of water molecules, *a*w(*λ*) and *b*w(*λ*), are assumed to be constant. Their values are obtained from Pope & Fry (1997) and Mobley (1994), respectively. The total absorption of phytoplankton pigments *a*ph(*λ*) is approximated as in Lee et al. (1998),

$$a\_{\rm ph}(\lambda) \simeq a\_0(\lambda)a\_{\rm ph}(440) + a\_1(\lambda)a\_{\rm ph}(440)\ln a\_{\rm ph}(440),\tag{4}$$

where *a*0(*λ*) and *a*1(*λ*) are statistically derived coefficients of Chl-a; their values are taken from Lee et al. (1998).

The absorption effects of detritus and colored dissolved organic matter (CDOM) are combined due to the similar spectral signature (Maritorena et al., 2002) and approximated using the model of Bricaud et al. (1981),

$$a\_{\rm dg}(\lambda) = a\_{\rm dg}(440) \exp\left[-s(\lambda - 440)\right],\tag{5}$$

where *s* is the spectral exponent of combined effects of detritus and CDOM. The scattering coefficient of SPM *b*spm(*λ*) is parameterized as a single type of particles with a spectral dependency exponent *y* (Kopelevich, 1983):

$$b\_{\rm spm}(\lambda) = b\_{\rm spm}(550) \left(\frac{550}{\lambda}\right)^y. \tag{6}$$

Equation (1) is inverted to derive five parameters from the IOCCG data set and three parameters from the NOMAD data set. The derived parameters are called the set of IOPs and expressed in a vector notation as **iop**. The exponents *s* and *y* are assumed to be unknown (Salama et al., 2009) and are derived from the IOCCG data set as:

$$\mathbf{i}\mathbf{o}\mathbf{p} = \left[ a\_{\mathrm{ph}}(440), a\_{\mathrm{dg}}(440), b\_{\mathrm{spm}}(550), s, y \right]. \tag{7}$$

The numerical inversion is carried out using the constrained Levenberg-Marquardt Algorithm (LMA) (Press et al., 2002). The constraints guarantee positive and physically meaningful values: between 0 and 100 m−<sup>1</sup> for *a*ph(440), *a*dg(440) and *b*spm(550), between 0 and 2.5 for *y*, and between 0 and 0.03 nm−<sup>1</sup> for *s*. The optimization is started from the initial values of Lee et al. (1999), with *s* = 0.021 nm−<sup>1</sup> and *y* = 1.7. The maximum number of iterations is set to 100.
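As an illustration, the forward model of equations (1) to (6), which the inversion evaluates at every iteration, might be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation used for the results in this chapter: the function name `forward_rsw`, the default values of *g*1, *g*2 and *t*/*n*w<sup>2</sup>, and the single-band inputs `aw`, `bw`, `a0` and `a1` are illustrative placeholders that should be replaced by the tabulated values from the cited literature.

```python
import math

# Illustrative defaults (assumptions, not values quoted in this chapter;
# verify them against the cited sources before use).
G1, G2 = 0.0949, 0.0794   # g_i constants, commonly attributed to Gordon et al. (1988)
T_OVER_NW2 = 0.54         # assumed value of t / n_w^2
ETA = 0.018               # backscattering fraction given in the text (Petzold, 1977)


def forward_rsw(lam, aph440, adg440, bspm550, s, y, aw, bw, a0, a1):
    """Remote sensing reflectance at wavelength lam (nm), equations (1)-(6).

    aw, bw, a0 and a1 are band-specific water and pigment coefficients that
    the caller supplies from the tabulated sources.
    """
    aph = a0 * aph440 + a1 * aph440 * math.log(aph440)   # equation (4)
    adg = adg440 * math.exp(-s * (lam - 440.0))          # equation (5)
    bspm = bspm550 * (550.0 / lam) ** y                  # equation (6)
    a = aw + aph + adg                                   # equation (2)
    bb = 0.5 * bw + ETA * bspm                           # equation (3)
    u = bb / (bb + a)
    return T_OVER_NW2 * (G1 * u + G2 * u ** 2)           # equation (1)


# Hypothetical single-band inputs, chosen only to exercise the function.
rs = forward_rsw(lam=440.0, aph440=0.05, adg440=0.03, bspm550=0.5,
                 s=0.021, y=1.7, aw=0.006, bw=0.005, a0=1.0, a1=0.0)
print(f"Rsw(440) = {rs:.5f}")
```

A constrained least-squares routine would call such a function once per band and per iteration, with the bounds listed above applied to the five unknowns.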

#### **3. Error estimation via deterministic method**

#### **3.1 Description**

The uncertainty in the derived IOPs is attributed to the infinitesimal change of radiance in equation (1) as,

$$
\Delta \text{Rs}\_{\text{W}}(\lambda) = w\_{\text{ph}}(\lambda) \Delta a\_{\text{ph}}(440) + w\_{\text{dg}}(\lambda) \Delta a\_{\text{dg}}(440) + w\_{\text{spm}}(\lambda) \Delta b\_{\text{spm}}(550), \tag{8}
$$

where Δ*Rs*w(*λ*) represents the radiometric uncertainty at wavelength *λ*, and *w*ph, *w*dg and *w*spm are the partial derivatives of *Rs*w with respect to the derived IOPs. Equation (8) represents an overdetermined linear set of equations that can only be solved if the radiometric uncertainty is known in at least *n* wavelengths, with *n* being the number of derived IOPs.

Analytical expressions of the partial derivatives in equation (8) are listed below. To simplify the notation, let us define the ratio *ω* as,

$$
\omega = \frac{b\_b(\lambda)}{c\_b^2},
\tag{9}
$$

where *cb* = *bb*(*λ*) + *a*(*λ*). The partial derivative *w*ph is,

$$w\_{\rm ph} = \frac{\partial \text{Rs}\_{\rm W}(\lambda)}{\partial a\_{\rm ph}(440)} = \frac{t}{n\_{\rm W}^2} \zeta\_{\rm ph} \sum\_{i=1}^{2} j\_i \omega^i, \tag{10}$$

where *ζ*ph is the spectral dependency of Chl-a,

$$\zeta\_{\rm ph} = a\_0 + a\_1 \left[ 1 + \ln a\_{\rm ph}(440) \right]. \tag{11}$$

The parameters *ji* are *j*1 = −*g*1 and *j*2 = −2*g*2*cb*. The term *w*dg is expressed as,

$$w\_{\rm dg} = \frac{\partial \text{Rs}\_{\rm W}(\lambda)}{\partial a\_{\rm dg}(440)} = \frac{t}{n\_{\rm W}^2} \zeta\_{\rm dg} \sum\_{i=1}^{2} j\_i \omega^i. \tag{12}$$

Here *ζ*dg is the spectral dependency of the combined detritus and CDOM absorption, which follows from equation (5) as *ζ*dg = exp[−*s*(*λ* − 440)]. The partial derivative *w*spm is expressed as,

$$w\_{\rm spm} = \frac{\partial \text{Rs}\_{\rm W}(\lambda)}{\partial b\_{\rm bspm}(550)} = \frac{t}{n\_{\rm W}^2} \sum\_{i=0}^{2} v\_i \omega^i, \tag{13}$$

where *v*0 = *g*1/*cb*, *v*1 = 2*g*2 − *g*1 and *v*2 = *j*2.
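The analytical derivatives of equations (9) to (13) can be checked numerically against central finite differences of equation (1). The sketch below is illustrative only: the constants `G1`, `G2` and `K` (standing for *t*/*n*w<sup>2</sup>) are assumed values, the spectral factors are set to one, and `w_bb` is taken here as the derivative with respect to the bulk backscattering *bb*(*λ*), which is how the *vi* coefficients read when equation (1) is differentiated directly. That reading is an assumption of this sketch.

```python
import math

# Assumed illustrative constants; replace with the values used in the model.
G1, G2 = 0.0949, 0.0794
K = 0.54  # stands for t / n_w^2


def rsw(a, bb):
    """Equation (1) written in terms of bulk absorption a and backscattering bb."""
    u = bb / (bb + a)
    return K * (G1 * u + G2 * u ** 2)


def analytical_weights(a, bb, zeta_ph, zeta_dg):
    """Partial derivatives following equations (9)-(13)."""
    cb = bb + a
    omega = bb / cb ** 2                               # equation (9)
    j1, j2 = -G1, -2.0 * G2 * cb
    d_rs_da = K * (j1 * omega + j2 * omega ** 2)       # derivative w.r.t. bulk absorption
    v0, v1, v2 = G1 / cb, 2.0 * G2 - G1, j2
    w_ph = zeta_ph * d_rs_da                           # equation (10)
    w_dg = zeta_dg * d_rs_da                           # equation (12)
    w_bb = K * (v0 + v1 * omega + v2 * omega ** 2)     # equation (13)
    return w_ph, w_dg, w_bb


def central_difference(f, x, h=1e-6):
    """Second-order finite-difference estimate of df/dx."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


a, bb = 0.05, 0.01   # hypothetical bulk coefficients in m^-1
w_ph, w_dg, w_bb = analytical_weights(a, bb, zeta_ph=1.0, zeta_dg=1.0)
num_da = central_difference(lambda x: rsw(x, bb), a)
num_dbb = central_difference(lambda x: rsw(a, x), bb)
print("analytical vs numerical, absorption:    ", w_ph, num_da)
print("analytical vs numerical, backscattering:", w_bb, num_dbb)
```

Agreement between the two columns indicates that the coefficients *ji* and *vi* are consistent with equation (1); the spectral factors *ζ*ph and *ζ*dg then only rescale the absorption derivative.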

Based on the theoretical formulation in equation (8), Lee et al. (2010) obtained the uncertainty of IOPs using the quasi-analytical algorithm (Lee et al., 2002) and prior information on the radiometric errors. Salama et al. (2011), on the other hand, proposed a method that produces a single (or ensemble) uncertainty measure for the collective errors in the derived IOPs relative to the radiometric uncertainty, without the need for model inversion or prior information on the radiometric errors. In addition, the method provides the optimum accuracy that can be achieved by a model-parametrization setup. Because the method of Salama et al. (2011) is self-contained and directly applicable to existing satellite-based IOP products, we summarize it below.

#### **3.2 Ensemble uncertainty of IOPs**

Applying a Taylor series approximation of the second moment to equation (8) gives:

$$
\sigma\_\mathrm{r}^2(\lambda) = w\_{\mathrm{ph}}^2(\lambda)\sigma\_{\mathrm{ph}}^2(440) + w\_{\mathrm{dg}}^2(\lambda)\sigma\_{\mathrm{dg}}^2(440) + w\_{\mathrm{spm}}^2(\lambda)\sigma\_{\mathrm{spm}}^2(550) \tag{14}
$$

where *σ*r<sup>2</sup>(*λ*) is the radiometric variance and *σ*ph<sup>2</sup>(440), *σ*dg<sup>2</sup>(440) and *σ*spm<sup>2</sup>(550) are the variances of the derived IOPs. The covariance terms in equation (14) are assumed to be zero, i.e. the IOPs are assumed to be mutually independent. The need for knowledge of the radiometric uncertainty is now avoided by dividing both sides of equation (14) by the radiometric variance,

$$\sum\_{i=1}^{i=n} w\_i^2(\lambda) \psi\_i^2(\lambda) = 1,\tag{15}$$

with *ψi*<sup>2</sup>(*λ*) = *σi*<sup>2</sup>(*λ*0)/*σ*r<sup>2</sup>(*λ*), where *λ*0 is the reference wavelength of the *i*-th IOP (440 or 550 nm). The ensemble uncertainty of IOPs per radiometric error, Ψ(*λ*), is derived from equation (15) by normalizing both sides by the sum of the squared partial derivatives and taking the square root:

$$\Psi(\lambda) = \left(\sum\_{i=1}^{i=n} w\_i^2(\lambda)\psi\_i^2(\lambda) / \sum\_{i=1}^{i=n} w\_i^2(\lambda)\right)^{0.5} = \left(\sum\_{i=1}^{i=n} w\_i^2(\lambda)\right)^{-0.5}.\tag{16}$$

Ψ(*λ*) represents the ensemble uncertainty of IOPs per unit error of remote sensing reflectance and has the unit of sr m−<sup>1</sup>. The advantage of this method is that it can be applied to readily available earth observation products of IOPs (water quality proxies). Fig. 1 shows the climatology of the ensemble uncertainty relative to the sum of derived IOPs. These figures were generated by applying equation (16) to the monthly mean values of GSM-derived IOPs and then averaging for each year from 1997 to 2007 (the year 1997 is not shown). Persistent patterns of high values are evident throughout the last decade in the subtropical gyres, whereas lower values are observed in most coastal areas. For the subtropical gyres these results agree with the global uncertainty maps of chlorophyll-a presented by Mélin (2010), whereas the coastal waters show the opposite pattern, i.e. very small error. The spatial distribution of the relative ensemble uncertainty largely resembles the observed values of remote sensing reflectance at 443 nm.
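Once the partial derivatives are available, equation (16) reduces to a one-line computation per wavelength. The following minimal sketch assumes the weights have already been evaluated; the function name `ensemble_uncertainty` and the numerical values are hypothetical.

```python
import math


def ensemble_uncertainty(weights):
    """Equation (16): Psi = (sum_i w_i^2) ** -0.5 for one wavelength."""
    total = sum(w * w for w in weights)
    if total == 0.0:
        raise ValueError("all partial derivatives are zero; Psi is undefined")
    return 1.0 / math.sqrt(total)


# Hypothetical partial derivatives (w_ph, w_dg, w_spm) at a single wavelength.
psi = ensemble_uncertainty([-0.18, -0.18, 0.91])
print(f"Psi = {psi:.3f}")
```

Because only the squared weights enter, the sign of each derivative is irrelevant and the largest sensitivity dominates the result.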

#### **3.3 Detailed uncertainty of IOPs**

Building on the linearization in equation (8), the method of Bates & Watts (1988) quantifies the uncertainty of each derived IOP as,

$$IOP\_{i\pm} = IOP\_i \pm \sigma \left\| \mathbf{W} \cdot \mathbf{R}^{-1} \right\| \left\| t(N - m, \alpha/2) \right\|\tag{17}$$

where *IOPi*± denotes the upper ("+") and lower ("−") bounds of the derived IOP; **W** is the matrix of partial derivatives; *σ* is the standard deviation of the residuals between the measured and model best-fit radiances; *t*(*N* − *m*, *α*/2) is the upper *α*/2 quantile of Student's *t* distribution with *N* − *m* degrees of freedom; *N* is the number of bands and *m* is the number of unknowns. **R** is the upper triangular matrix of the QR decomposition of the Jacobian matrix. Equation (17) has been widely used to estimate the error of derived IOPs (Salama et al., 2009; Van Der Woerd & Pasterkamp, 2008). The derivative term in equation (17) can be approximated by the gradient of equation (1) with respect to the derived IOPs, computed at the model best-fit to the observation. This approximation is derived as follows.

Fig. 1. Time series of ensemble-uncertainty of IOPs at 440 nm relative to the sum of derived IOPs (panels: Jan to Dec).

The observed remote sensing reflectance can be approximated as the sum of the model best-fit *Rsm*(*λ*) and its deviation from the observation, *ε*(*λ*):

$$\text{Rs}(\lambda) = \text{Rs}\_{m}(\lambda) + \varepsilon(\lambda) \tag{18}$$

The term *Rsm*(*λ*) is obtained by fitting the model in equation (1) to radiometric observations from ocean color and/or field sensors. The error *ε*(*λ*) is a lumped term that includes model goodness-of-fit, measurement noise and atmospheric noise. For simplicity this term is assumed to be nearly independent of the derived IOPs. The derivative of equation (18) with respect to the derived values can then be written as:

$$\frac{\Delta \text{Rs}(\lambda)}{\Delta \mathbf{iop}} = \frac{\Delta \text{Rs}\_{m}(\lambda)}{\Delta \mathbf{iop}} + \frac{\Delta \varepsilon(\lambda)}{\Delta \mathbf{iop}} \tag{19}$$

By definition of the least-squares minimization used to derive the model best-fit *Rsm*(*λ*), we have:

$$\frac{\Delta \varepsilon(\lambda)}{\Delta \mathbf{iop}} \approx 0\tag{20}$$

Equation (19) then reduces to:

$$\frac{\Delta \text{Rs}(\lambda)}{\Delta \mathbf{iop}} \approx \frac{\Delta \text{Rs}\_{m}(\lambda)}{\Delta \mathbf{iop}}\tag{21}$$

The simplification in equation (21) implies that the gradient of the measured remote sensing reflectance can be approximated by the gradient of the model in equation (1), which is easily computed from the analytical partial derivatives in equations (10) to (13).

#### **4. Error estimation via stochastic method**

#### **4.1 Description**

In this section we summarize the method of Salama & Stein (2009) as it is the only stochastic method published so far in the field of ocean color.

Salama and Stein used prior information to obtain plausible ranges of the IOPs. These ranges are used in a log-normal distribution to generate a first-estimate of the probability distribution (PD) of the IOPs. This first-estimate PD is called the prior PD of the IOPs. The method, explained hereafter, uses the prior PD to converge to a "posterior" probability distribution that better describes the IOPs.

Prior information is obtained from known radiometric errors in *Rs*w and from intrinsic model-inversion errors. The radiometric errors are: (i) the noise-equivalent radiance of the sensor and (ii) the error in aerosol optical thickness. The sensor noise-equivalent radiance is known from sensor specifications and post-launch calibrations. Model approximation and inversion accuracy can be quantified by evaluating the performance of the employed ocean color model against measurements and radiative transfer simulations. The atmospheric error, due to variation in aerosol optical thickness, can be evaluated from available measurements or by using standard atmospheric correction models. The error estimation algorithm follows the sequential steps detailed below.

An initial estimate of the confidence interval around the water remote sensing reflectance can be computed using the method of Bates & Watts (1988, p. 59, cf. 1.36) or from available knowledge on plausible fluctuations of the model, noise and atmospheric residual, respectively. The upper and lower bounds of this interval are then inverted to derive the corresponding two sets of IOPs, **iop**u and **iop**l. Together with **iop**obs, derived from the water remote sensing reflectance itself, these sets are hereafter called the IOP-triplet (**iop**l, **iop**obs, **iop**u) and denoted as *ω* (not to be confused with the ratio *ω* of equation (9)). The value log **iop**obs is assumed to approximate the mean of a first-estimate, i.e. prior, probability distribution (PD) of the IOPs in logarithmic space. The prior PD is first elicited using the IOP-triplet and prior knowledge of the log-normal shape of the IOPs, as explained in section (4.2). The posterior probability distribution, or our gain in information, is then inferred by maximizing the expected utility (Bernardo, 1979; Carlin & Polson, 1991), as explained in section (4.3).

#### **4.2 Prior probability distribution**

Estimating the IOP-triplet, **iop**l, **iop**obs and **iop**u, is the first step towards deriving the prior probability distribution of the IOPs. The use of flat or improper priors, e.g. a uniform distribution, may invalidate the derivation of the posterior probability (Goutis & Robert, 1998). According to the maximum entropy principle (Jaynes, 1957a;b), a proper prior probability distribution should have the maximum entropy given the information in the IOP-triplet. However, applying the maximum entropy principle to the information provided by the IOP-triplet gives the probability values of **iop**l, **iop**obs and **iop**u but not the whole probability distribution P(**iop**); for more detail one may consult Jaynes (1968). To overcome this limitation in data values, we introduce the following method to elicit the prior distribution of the IOPs, assuming that they are log-normally distributed. The log-normal assumption is based on the work of Campbell (1995), who pointed out that, in general, marine bio-geophysical quantities follow a log-normal distribution, i.e. their log transform has a Gaussian distribution.

The IOP-triplet is first transformed to the log space, allowing us to use a Gaussian distribution to simulate the PD of IOPs. Second we assume that log **iop**obs approximates the mean of the prior PD of the IOPs. The Gaussian distribution of the IOPs can be standardized to a N(0,1) distribution, i.e. normal distribution with zero mean and unity standard deviation. The standard Gaussian variate for log **iop**<sup>u</sup> is,

$$\alpha\_{\rm u} = \frac{\log \mathbf{iop}\_{\rm u} - \log \mathbf{iop}\_{\rm obs}}{\sigma}, \tag{22}$$

where *α*u is a sample drawn from N(0,1) that corresponds to **iop**u. The parameters log **iop**obs and *σ* are the expectation and the standard deviation of the population. From equation (22) and the second set in the IOP-triplet, **iop**l, we can establish the ratio,

$$r\_{\rm u,l} = \frac{\alpha\_{\rm u}}{\alpha\_{\rm l}} = \frac{\log \mathbf{iop}\_{\rm u} - \log \mathbf{iop}\_{\rm obs}}{\log \mathbf{iop}\_{\rm l} - \log \mathbf{iop}\_{\rm obs}},\tag{23}$$

and for convenience we set log **iop**u > log **iop**l. The standardization of the IOP distribution allows us to use an N(0,1) random number generator to simulate values of *α* as in equation (22). The ratios of these random values are computed and compared to the ratio of the IOP-triplet in equation (23). The best fit identifies the two values *α*u and *α*l, from which the standard deviation of the prior distribution can be computed using equation (22). The prior probability distribution of the IOPs is now known: N(log **iop**obs, *σ*), i.e. a Gaussian distribution with mean log **iop**obs and standard deviation *σ*.

Random values (1000) are generated from the N(0,1) distribution such that they satisfy an imposed acceptance-rejection condition. This condition requires that the ratio in equation (23) defines a unique ordered pair of *α*. This is to enable the use of a simple searching method with a fast convergence to the best-fit ratio. The uniqueness in this sense implies that the squared difference between the computed ratio, from the IOP-triplet, and the best-fit is a global minimum resolvable by the searching method and the used computer processor. Three look-up tables (LUTs) are then created from the generated values. These LUTs correspond to the following three scenarios:

$$\begin{aligned} \log \mathbf{iop}\_{\text{obs}} &> \log \mathbf{iop}\_{\text{u}} &> \log \mathbf{iop}\_{\text{l}}\\ \log \mathbf{iop}\_{\text{obs}} &< \log \mathbf{iop}\_{\text{l}} &< \log \mathbf{iop}\_{\text{u}}\\ \log \mathbf{iop}\_{\text{l}} &< \log \mathbf{iop}\_{\text{obs}} &< \log \mathbf{iop}\_{\text{u}} \end{aligned} \tag{24}$$

The generated N(0,1) values are first subdivided into two sets containing the positive and the negative values. The ratios of the first and second LUTs are then computed from the sets sorted in descending order as *xi*/*xi*+1. The third LUT is generated from all possible combinations of the unordered positive and negative sets. This results in ratio values between 0 and 1, larger than 1, and smaller than 0 for the first, second and third LUT, respectively. The ratio in equation (23) is first estimated from the IOP-triplet. Based on the ordering of this triplet (equation 24), a look-up table is selected and searched to find the best fit to the computed ratio (equation 23). This best fit is found either by direct search or by interpolation. One member of the corresponding pair is then used in equation (22) to compute the standard deviation of the prior PD P(**iop**).
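The bookkeeping around equations (22) to (24) can be illustrated for a single scalar IOP. The sketch below only computes the triplet ratio, selects which of the three look-up tables applies, and converts a matched variate into a standard deviation; it does not generate or search the LUTs themselves. The function names `triplet_ratio`, `select_lut` and `prior_sigma` and the example triplets are hypothetical.

```python
import math


def triplet_ratio(iop_l, iop_obs, iop_u):
    """Ratio r_u,l of equation (23) for one scalar IOP."""
    d_u = math.log(iop_u) - math.log(iop_obs)
    d_l = math.log(iop_l) - math.log(iop_obs)
    if d_l == 0.0:
        raise ValueError("iop_l equals iop_obs; the ratio is undefined")
    return d_u / d_l


def select_lut(iop_l, iop_obs, iop_u):
    """Return 1, 2 or 3 for the three orderings listed in equation (24)."""
    if not iop_l < iop_u:
        raise ValueError("expected iop_l < iop_u")
    if iop_obs > iop_u:
        return 1      # both variates negative, ratio between 0 and 1
    if iop_obs < iop_l:
        return 2      # both variates positive, ratio larger than 1
    if iop_l < iop_obs < iop_u:
        return 3      # variates of opposite sign, ratio negative
    raise ValueError("iop_obs coincides with a bound")


def prior_sigma(iop_u, iop_obs, alpha_u):
    """Standard deviation of the prior PD, rearranged from equation (22)."""
    return (math.log(iop_u) - math.log(iop_obs)) / alpha_u


# One hypothetical triplet (iop_l, iop_obs, iop_u) per scenario.
for triplet in [(1.0, 4.0, 2.0), (2.0, 1.0, 4.0), (1.0, 2.0, 4.0)]:
    print(select_lut(*triplet), triplet_ratio(*triplet))
```

The sign and magnitude of the ratio alone determine the table, which is why the three LUTs can be kept disjoint.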

#### **4.3 Posterior probability distribution**

In section (4.2) we derived a proper prior distribution of the IOPs. This first estimate is refined into a posterior distribution that better describes the IOPs using the concept of entropy. Entropy is a numerical measure of the error associated with the probability distribution of the derived IOPs, or of any hydrological parameter (Singh, 1998). For a population with *N* sets of IOPs it is expressed as the Shannon entropy (Shannon, 1948):

$$H\{\mathbf{P}(\mathbf{i}\mathbf{op})\} = -\sum\_{1}^{N} \mathbf{P}(\mathbf{i}\mathbf{op}) \cdot \log \mathbf{P}(\mathbf{i}\mathbf{op})\tag{25}$$

where P(**iop**) is the prior probability distribution (PD) of the derived set of IOPs **iop**.

If we design a function *D* that measures the information, e.g. equation (25), between the prior and the posterior PD, then we can derive the posterior PD such that it maximizes the expected information to be gained in *D* (Bernardo, 2005; Christakos, 1990). In other words, maximizing the function *D* will maximize the gained information from the posterior PD (Bernardo, 1979). The Kullback-Leibler divergence (Kullback & Leibler, 1951), or cross-entropy, belongs to this type of utility functions (Johnson & Geisser, 1985). It measures the divergence between the posterior P(**iop**|*ω*) and the prior P(**iop**) probability distribution as:

$$D\_{\rm KL}\left\{\mathbf{P}(\mathbf{i}\mathbf{op}|\omega)|\mathbf{P}(\mathbf{i}\mathbf{op})\right\} = \sum\_{1}^{N} \mathbf{P}(\mathbf{i}\mathbf{op}|\omega) \cdot \log \frac{\mathbf{P}(\mathbf{i}\mathbf{op}|\omega)}{\mathbf{P}(\mathbf{i}\mathbf{op})} \tag{26}$$

where P(**iop**|*ω*) is the posterior probability of **iop** given the IOP-triplet *ω*. Equation (26) can be rewritten in view of equation (25) as:

$$D\_{\rm KL}\{\mathbf{P}(\mathbf{i}\mathbf{op}|\omega)|\mathbf{P}(\mathbf{i}\mathbf{op})\} = H\{\mathbf{P}(\mathbf{i}\mathbf{op}|\omega), \mathbf{P}(\mathbf{i}\mathbf{op})\} - H\{\mathbf{P}(\mathbf{i}\mathbf{op}|\omega)\} \tag{27}$$

where *H*{P(**iop**|*ω*), P(**iop**)} is expressed as:

$$H\{\mathbf{P}(\mathbf{i}\mathbf{op}|\omega), \mathbf{P}(\mathbf{i}\mathbf{op})\} = -\sum\_{1}^{N} \mathbf{P}(\mathbf{i}\mathbf{op}|\omega) \cdot \log \mathbf{P}(\mathbf{i}\mathbf{op})\tag{28}$$

Maximizing the cross-entropy in equation (26), or the corresponding expression in (27), is equivalent to minimizing the entropy (uncertainty) of the posterior probability distribution, i.e. maximizing the gained information. The errors can then be estimated from the reconstructed posterior probability distribution of the IOPs, P(**iop**|*ω*).
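The relation between equations (25) to (28) can be verified on a small discrete example. The sketch below is illustrative only; the two probability vectors are hypothetical and the function names are not taken from the original implementation.

```python
import math


def shannon_entropy(p):
    """Equation (25): entropy of a discrete probability distribution."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0.0)


def cross_entropy(post, prior):
    """Equation (28): cross term between posterior and prior."""
    return -sum(pi * math.log(qi) for pi, qi in zip(post, prior) if pi > 0.0)


def kl_divergence(post, prior):
    """Equation (26): Kullback-Leibler divergence of posterior from prior."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(post, prior) if pi > 0.0)


prior = [0.25, 0.25, 0.25, 0.25]       # hypothetical discretized prior PD
posterior = [0.05, 0.15, 0.60, 0.20]   # hypothetical, more peaked posterior PD

d_kl = kl_divergence(posterior, prior)
identity_rhs = cross_entropy(posterior, prior) - shannon_entropy(posterior)
print(d_kl, identity_rhs)   # both sides of equation (27)
```

For the uniform prior used here the cross term equals log *N* regardless of the posterior, so a larger divergence corresponds directly to a lower posterior entropy; for other priors the cross term varies with the posterior as well.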

The posterior probability distribution is inferred by maximizing the utility function, i.e. the Kullback-Leibler divergence (equation 26). The maximum is found iteratively through sequential updating of the posterior using the prior parameters, mean *μ* and variance *σ*<sup>2</sup> (Rubinstein & Kroese, 2004). The corresponding log-normal mean *m* and variance *v* are computed following Kendall & Stuart (1987):

$$
m = e^{\mu} e^{0.5\sigma^2} \tag{29}
$$

$$v = e^{2\mu} e^{\sigma^2} \left( e^{\sigma^2} - 1 \right) \tag{30}$$
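As a quick numerical illustration of equations (29) and (30), the sketch below converts the parameters *μ* and *σ*<sup>2</sup> into the log-normal mean *m* and variance *v*, and back again. The function names and the input values are assumptions made for this example only.

```python
import math

def lognormal_moments(mu, sigma2):
    """Mean m and variance v of a log-normal variable, equations (29)-(30)."""
    m = math.exp(mu) * math.exp(0.5 * sigma2)
    v = math.exp(2.0 * mu) * math.exp(sigma2) * (math.exp(sigma2) - 1.0)
    return m, v

def normal_parameters(m, v):
    """Inverse mapping: recover mu and sigma^2 of log(x) from m and v."""
    sigma2 = math.log(1.0 + v / m**2)
    mu = math.log(m) - 0.5 * sigma2
    return mu, sigma2

# Illustrative values only, not taken from the data sets used in this chapter.
m, v = lognormal_moments(mu=-3.0, sigma2=0.25)
mu_back, sigma2_back = normal_parameters(m, v)
print(m, v, mu_back, sigma2_back)
```

The inverse mapping follows from dividing equation (30) by the square of equation (29), which leaves *v*/*m*<sup>2</sup> = exp(*σ*<sup>2</sup>) − 1.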

The following steps describe the algorithm, as implemented, to derive the posterior PD P(**iop**|*ω*):

1. From the water remote sensing spectrum, estimate the initial radiometric confidence interval using the method of Bates & Watts (1988, p. 59, cf. 1.36) or prior information on atmospheric and noise-induced radiometric fluctuations.
2. Invert the ocean color model in equation (1) to derive the IOPs from the water remote sensing spectrum and from its upper and lower bounds. This results in three sets of IOPs, **iop**<sub>l</sub>, **iop**<sub>obs</sub> and **iop**<sub>u</sub>: the IOP-triplet.
3. Based on the order of this IOP-triplet, allocate the suitable LUT using equation (24).
4. Search for the best-fit ratio calculated from equation (23).
5. Use equation (22) to estimate the standard deviation of the prior PD.
6. Use the standard deviation and log **iop**<sub>obs</sub> to generate the prior PD.
7. Use initial values of the mean and standard deviation to generate *n* Monte Carlo samples of the PD.
8. Select the population that has the maximum Kullback-Leibler divergence (equation 26), and update the initial values.
9. Repeat steps 7 to 8 until convergence.
10. Update the prior PD with the resulting posterior PD (from the previous step, 9), and iterate steps 7 to 10 until convergence.

Convergence is defined by a threshold as follows. Keep track of the best ten candidates that maximize equation (26). The system has converged if the variance of these ten values is less than 10<sup>−4</sup>.

#### **5. Inter-comparison between deterministic and stochastic methods**

The inter-comparison between the deterministic method, described in Section (3), and the stochastic method, described in Section (4), is carried out using two data sets. The first consists of radiative transfer simulations of synthetic IOPs obtained from the International Ocean Color Coordination Group (IOCCG), report 5 (Lee, 2006, IOCCG data set). The second consists of concurrent observations from the Sea-viewing Wide Field-of-view Sensor (SeaWiFS) and measured inherent and apparent optical properties, retrieved from the NASA bio-Optical Marine Algorithm Data set (NOMAD) Version 1.3 (Werdell & Bailey, 2005, SeaWiFS matchup data set).

#### **5.1 IOCCG**

The IOCCG data set (Lee, 2006) of synthesized IOPs and their radiative transfer simulations at 30° sun zenith angle is used to inter-compare the results of the deterministic and stochastic methods. The IOCCG simulated spectra, between 400 nm and 750 nm at 10 nm intervals, are inverted using the ocean color model in equation (1) to derive five variables. These variables are: chlorophyll-a absorption at 440 nm *a*ph(440), detritus and CDOM absorption at 440 nm *a*dg(440) and its spectral dependency *s*, SPM scattering at 550 nm *b*spm(550) and its spectral dependency *y*, as shown in equation (7).

The standard deviation of the posterior PD represents the error (confidence) of the derived value **iop**<sub>obs</sub>. The deviation of the posterior PD from the known IOPs is measured using the root-mean-square error (RMSE). These two values, RMSE and standard deviation, are related through the bias, i.e. the actual difference between derived and measured IOPs. Figure (2) shows the estimated errors, expressed as standard deviation using equation (30), against the known RMSE. The actual RMSE is estimated from the posterior PD and the known IOPs. The reproduced errors for the IOPs other than *a*ph(440) have a high accuracy, with *r*<sup>2</sup> values between 0.77 and 0.96. Estimated errors of *a*ph(440) have the lowest *r*<sup>2</sup> and *n* values. It is worth noting that the deterministic method of Bates & Watts (1988) generally underestimates the model errors of the IOPs, with lower *r*<sup>2</sup> values than the presented stochastic method. This is apparent in an almost threefold difference for the error values of *a*ph(440). On the other hand, the stochastic method has a tendency to overestimate the errors of the IOPs, but with a better fit and improved capability, in the sense that it can be applied to populations of any bio-geophysical variable.
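The link between RMSE, standard deviation and bias mentioned above is the standard decomposition RMSE<sup>2</sup> = *σ*<sup>2</sup> + bias<sup>2</sup>. The Python sketch below illustrates it on synthetic samples; the log-normal sample, the "known" value and the function name are assumptions for this example and do not reproduce the IOCCG results.

```python
import numpy as np

def error_summary(samples, known):
    """Bias, standard deviation and RMSE of samples relative to a known value."""
    samples = np.asarray(samples, dtype=float)
    bias = samples.mean() - known
    std = samples.std()                      # population form (ddof=0)
    rmse = np.sqrt(np.mean((samples - known) ** 2))
    return bias, std, rmse

# Synthetic stand-in for a posterior PD of one IOP (illustrative numbers).
rng = np.random.default_rng(0)
samples = rng.lognormal(mean=np.log(0.05), sigma=0.3, size=5000)

bias, std, rmse = error_summary(samples, known=0.05)
print(rmse**2, std**2 + bias**2)  # both sides of RMSE^2 = std^2 + bias^2
```

When the bias is small the standard deviation is close to the RMSE, which is why the two quantities can be plotted against each other as in figure (2).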

#### **5.2 NOMAD**

Due to the limited number of available visible bands in this data set, we reduced the number of unknowns to three. The first three IOPs in equation (7) are derived from SeaWiFS spectra using the ocean color model (equation 1) and the constrained LMA technique. The values of *s* and *y* are set to 0.021 nm<sup>−1</sup> and 1.7, respectively. The actual RMSE values are computed from the posterior PD and the measured IOPs. The total error on the derived IOPs is estimated by applying the stochastic method with the radiometric confidence interval of Bates & Watts (1988, p. 59, cf. 1.36). The estimated errors are expressed as standard deviation using equation (30) and plotted against RMSE values in figure (3). The reproduced total error values are strongly correlated with the known RMSE values, with *r*<sup>2</sup> between 0.67 and 0.9 and >90% of valid retrievals. Estimated errors from the deterministic technique (Bates & Watts, 1988), however, did not correspond to the actual values of RMSE.

Errors are computed for the ocean color model and the SeaWiFS visible bands centered at [412, 443, 490, 510, 555, 670] nm. The average values of the derived standard deviation are 1.7802, 1.1431 and 1.6177 m<sup>−1</sup> for *a*ph(440), *a*dg(440) and *b*spm(550), respectively.

#### **6. Uncertainty sources**

#### **6.1 Description**

The total remote sensing reflectance received at the sensor altitude can be written as the sum of several components (Gordon, 1997):

$$\mathrm{Rs}\_{\mathrm{t}}(\lambda) = \mathrm{Rs}\_{\mathrm{path}}(\lambda) + T(\lambda)\mathrm{Rs}\_{\mathrm{sfc}}(\lambda) + T(\lambda)\mathrm{Rs}\_{\mathrm{w}}(\lambda) \tag{31}$$

Fig. 2. Derived versus known errors of the IOPs estimated from the IOCCG data set for: (a) Chl-a absorption at 440 nm; (b) CDOM and detritus absorption at 440 nm; (c) SPM scattering at 550 nm; and (d) the total absorption at 440 nm. The data on the plots are log transformed. The coefficients of determination *r*<sub>s</sub><sup>2</sup> and *r*<sub>d</sub><sup>2</sup> are for the stochastic and deterministic methods, respectively.

The subscript of the remote sensing reflectance *Rs* represents the contribution from: (i) the atmosphere (*path*), i.e. air molecules and aerosol multiple scattering; (ii) sea-surface (*sfc*); and (iii) water (*w*). *T*(*λ*) is the diffuse transmittance.
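Equation (31) can be rearranged to isolate the water term once the path, surface and transmittance terms are known. The sketch below, with made-up reflectance values and a hypothetical function name, shows this rearrangement and how an error in the path term maps onto the water reflectance.

```python
import numpy as np

def water_reflectance(rs_total, rs_path, rs_sfc, transmittance):
    """Solve equation (31) for the water term: Rs_w = (Rs_t - Rs_path)/T - Rs_sfc."""
    rs_total = np.asarray(rs_total, dtype=float)
    rs_path = np.asarray(rs_path, dtype=float)
    rs_sfc = np.asarray(rs_sfc, dtype=float)
    transmittance = np.asarray(transmittance, dtype=float)
    return (rs_total - rs_path) / transmittance - rs_sfc

# Illustrative numbers for three bands; not taken from any data set.
rs_path = np.array([0.020, 0.012, 0.006])
rs_sfc = np.array([0.001, 0.001, 0.001])
rs_w_true = np.array([0.004, 0.006, 0.002])
t = np.array([0.85, 0.90, 0.93])
rs_total = rs_path + t * rs_sfc + t * rs_w_true   # forward form of equation (31)

rs_w = water_reflectance(rs_total, rs_path, rs_sfc, t)

# An overestimate of 0.001 in the path term lowers Rs_w by 0.001 / T.
rs_w_biased = water_reflectance(rs_total, rs_path + 0.001, rs_sfc, t)
error = rs_w_biased - rs_w
print(rs_w, error)
```

Because the path term is divided by the transmittance, any residual in its estimate is slightly amplified in the retrieved water reflectance.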

The contribution of air molecules, i.e. the Rayleigh scattering, to the atmospheric path is well described in terms of geometry and atmospheric pressure (Gordon et al., 1988).


Fig. 3. Derived versus known errors of the IOPs estimated from the NOMAD data set for: (a) Chl-a absorption at 440 nm; (b) CDOM and detritus absorption at 440 nm; (c) SPM scattering at 550 nm; and (d) the total absorption at 440 nm. The data on the plots are log transformed. The coefficients of determination *r*<sub>s</sub><sup>2</sup> and *r*<sub>d</sub><sup>2</sup> are for the stochastic and deterministic methods, respectively.

The contribution of sea-surface reflectance *Rs*sfc can be estimated using the probabilistic formulations of Cox & Munk (1954) and ancillary data on the wind field. Gaseous transmittance can be calculated from ancillary data on ozone and water vapor concentrations using the transmittance models of Goody (1964) and Malkmus (1967). For viewing angles < 60° the diffuse transmittance *T* is weakly dependent on aerosol and can be approximated following Gordon et al. (1983). These approximations leave essentially two unknowns: the aerosol and the water remote sensing reflectance. In other words, the errors in *Rs*w can be attributed to errors in aerosol estimation and any noise in the sensor, i.e. noise equivalent radiance (NER).

Radiometric errors in *Rs*w, in addition to the intrinsic model-inversion errors, will accumulate and propagate to the IOPs during the retrieval. The total error of the derived IOPs can therefore be decomposed into three major components, namely model-inversion error, sensor noise and error in aerosol estimation. These errors originate from various mechanisms in the processing chain of ocean color data, as explained hereafter.

Each error component, *x*, will be expressed as the variance *σ*<sub>x</sub><sup>2</sup> of the IOPs caused by this error *x*. The subscript *x* will be replaced by *inv*, *ner* and *a* to represent the contribution of model inversion, noise equivalent radiance and aerosol, respectively.

#### **6.2 Model-inversion error,** *σ*<sub>inv</sub><sup>2</sup>

The approximations employed in the forward model (equation 1) may not precisely describe the optical processes that have caused the observed signal (Zaneveld, 1994). Moreover, the numerical technique used for inversion provides an ambiguous solution, i.e. the derived IOPs are not unique (Sydor et al., 2004). These assumptions and this ambiguity generate an error that is, on the one hand, inherent to the employed ocean color forward model and, on the other hand, dependent on the accuracy of the inversion scheme, which could be related to the optical complexity of the water. Model-inversion error is quantified as a lumped sum of the errors due to the approximation in (1), the parametrization of the IOPs and the inversion, and is abbreviated as model error.

#### **6.3 Noise equivalent radiance,** *σ*<sub>ner</sub><sup>2</sup>

Noise equivalent radiance (NER) depends on sensor specifications and performance over time, i.e. sensor degradation. This fluctuation could either increase or decrease the observed remote sensing reflectance and could also be wavelength dependent or random. The effect of NER is inversely proportional to the value of the signal-to-noise ratio. Sensor degradation, i.e. sensitivity loss over time, will decrease the signal-to-noise ratio of the sensor, leading to a low signal reading. A low signal can also be observed over clear water in the near infrared part of the spectrum or over turbid water, with high CDOM, detritus and Chl-a contents, in the blue part of the spectrum. The error propagated from NER to the IOPs will therefore depend on sensor specification, sensor degradation over time, water turbidity and observing wavelength.

#### **6.4 Variations of aerosol type and optical thickness,** *σ*<sub>a</sub><sup>2</sup>

Atmospheric correction errors are generally caused by unknown aerosol type and optical thickness (AOT). The residual signals from atmospheric correction will have spectral and spatial dependency. The spectral dependency is due to the error in the aerosol type, e.g. absorbing aerosol, while the spatial dependency is, on the one hand, related to the error in AOT spatial variations and, on the other hand, to water turbidity (Hu et al., 2004). It is assumed that aerosol optical thickness has a higher spatial variability than aerosol type, so that aerosol type can be assumed to be known and homogeneous. Within the validity of this assumption, the residual signals from atmospheric correction will be caused by errors in estimating the aerosol optical thickness.

#### **6.5 Decomposition**


The total error of the derived IOPs, expressed as the variance *σ*<sub>t</sub><sup>2</sup>(*σ*<sub>inv</sub><sup>2</sup>, *σ*<sub>ner</sub><sup>2</sup>, *σ*<sub>a</sub><sup>2</sup>), is thus described as a function of the three error components *σ*<sub>inv</sub><sup>2</sup>, *σ*<sub>ner</sub><sup>2</sup> and *σ*<sub>a</sub><sup>2</sup>. Assuming that this function is continuous in its variables, we can approximate it by a first-order Taylor series as:

$$
\sigma\_\mathrm{t}^2 \approx \sigma\_\mathrm{t0}^2 + \frac{\partial \sigma\_\mathrm{t}^2}{\partial \sigma\_\mathrm{inv}^2} \sigma\_\mathrm{inv}^2 + \frac{\partial \sigma\_\mathrm{t}^2}{\partial \sigma\_\mathrm{ner}^2} \sigma\_\mathrm{ner}^2 + \frac{\partial \sigma\_\mathrm{t}^2}{\partial \sigma\_\mathrm{a}^2} \sigma\_\mathrm{a}^2 \tag{32}
$$

where *σ*<sub>t0</sub><sup>2</sup> is the value of the function *σ*<sub>t</sub><sup>2</sup>(0, 0, 0). According to the assumption that the total error is caused by three components, the value of *σ*<sub>t0</sub><sup>2</sup> is negligible, i.e. *σ*<sub>t0</sub><sup>2</sup> ≈ 0. In other words, if we have perfect measurements, accurate atmospheric correction and exact model parameterization and inversion, then the total error on the derived IOPs will be negligible. The total error of the derived IOPs can thus be approximated as a weighted sum of the individual error components as:

$$
\sigma\_\text{t}^2 \approx w\_{\text{inv}}^2 \sigma\_\text{inv}^2 + w\_{\text{ner}}^2 \sigma\_{\text{ner}}^2 + w\_\text{a}^2 \sigma\_\text{a}^2 \tag{33}
$$

where the weights, written in terms of *w*inv, *w*ner and *w*a, represent the partial derivatives in equation (32). The functional form of *σ*<sub>t</sub><sup>2</sup>, however, is commonly unknown, and it is therefore difficult to find proper estimates of the weights *w*inv, *w*ner and *w*a. An intuitive approach is to set all the weights in equation (33) to unity and check the validity of the result:

$$
\sigma\_\text{t}^2 \approx \sigma\_\text{inv}^2 + \sigma\_\text{ner}^2 + \sigma\_\text{a}^2 \tag{34}
$$

Figure (4) depicts the relationship between the sum on the right-hand side of equation (34) and the total error on the derived IOPs. The X axis shows the total error of the IOPs as calculated from all possible error sources, *σ*<sub>t</sub><sup>2</sup>. We then calculated each error component separately and plotted the sum of their variances on the Y axis: *σ*<sub>inv</sub><sup>2</sup> + *σ*<sub>ner</sub><sup>2</sup> + *σ*<sub>a</sub><sup>2</sup>. As anticipated from equation (33), there is a linear relationship between the actual variance and the linear sum of the individual variances, with *R*<sup>2</sup> values above 0.75 for the absorption coefficients of Chl-a and detritus-CDOM. The value of *R*<sup>2</sup> decreases to 0.69 for SPM scattering and 0.64 for the total absorption. The dispersion, as measured by the RMSE, is large for all IOPs. The results in figure (4) indicate that the linear sum in equation (34) is an acceptable approximation to the total variance. Due to the large values of RMSE in figure (4), the computed relative contributions should be treated with caution.
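A check of the kind shown in figure (4) can be sketched as follows. The code compares the logarithm of a total variance with the logarithm of the summed component variances; the synthetic variances, the use of the natural logarithm and the squared Pearson correlation as the coefficient of determination are assumptions of this example, not details taken from the original analysis.

```python
import numpy as np

def compare_variances(total_var, sum_var):
    """r^2 and RMSE between log(total variance) and log(sum of variances)."""
    x = np.log(np.asarray(total_var, dtype=float))
    y = np.log(np.asarray(sum_var, dtype=float))
    r2 = np.corrcoef(x, y)[0, 1] ** 2        # squared Pearson correlation
    rmse = np.sqrt(np.mean((y - x) ** 2))    # dispersion around the 1:1 line
    return r2, rmse

# Synthetic component variances (illustrative only).
rng = np.random.default_rng(1)
var_inv = rng.lognormal(-4.0, 1.0, 200)
var_ner = rng.lognormal(-5.0, 1.0, 200)
var_a = rng.lognormal(-3.0, 1.0, 200)
sum_var = var_inv + var_ner + var_a                   # right-hand side of eq. (34)
total_var = sum_var * rng.lognormal(0.0, 0.5, 200)    # synthetic departure from eq. (34)

r2, rmse = compare_variances(total_var, sum_var)
print(r2, rmse)
```

A high *r*<sup>2</sup> combined with a large RMSE, as reported for figure (4), means that the sum follows the trend of the total variance while individual points still scatter widely around the 1:1 line.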

While the model-induced error can be estimated directly with the techniques described in Brad (1974) and Bates & Watts (1988), noise- and atmospheric-induced errors have to be inferred from the available information. This information forms the prior knowledge that we will use in the following section to derive the error of the IOPs. Prior information is obtained from the known sensor noise, the variation in aerosol optical thickness, and the approximations and inversion accuracy of the ocean color model.

#### **6.5.1 IOCCG**

The noise is estimated based on NER values of the Medium Resolution Imaging Spectrometer (MERIS) (Doerffer, 2008; Hoogenboom & Dekker, 1998). The variation in aerosol optical thickness (AOT) is set to ±0.02. This value is estimated from the variation of aerosol optical thickness recorded by a newly calibrated sunphotometer (CIMEL) under cloud-free conditions (Holben et al., 2000). The values of aerosol optical thickness are obtained from sunphotometer measurements at (51.225 N, 2.925 E) on 8 June 2006. The atmospheric paths are estimated with radiative transfer computations (Vermote et al., 1997) using a maritime aerosol model with a nadir-looking sensor at 30° sun-zenith and 203° sun-azimuth angles.

Fig. 4. Sum of variances versus the total variance of the IOCCG data set for: (a) Chl-a absorption at 440 nm; (b) absorption of detritus and CDOM at 440 nm; (c) SPM scattering at 550 nm; and (d) total absorption coefficient at 440 nm.

The relative contributions of model, noise and atmospheric errors are shown in table (1) and are quantified for each derived IOP as follows. First we compute the total error, i.e. the error obtained when *Rs*w(*λ*) contains errors due to both aerosol estimation and sensor noise; the inversion error adds up during the inversion. The computation is then repeated for each error source: (i) the model error is estimated from the error-free *Rs*w(*λ*); (ii) the atmospheric-induced error *σ*<sub>a</sub><sup>2</sup> is computed from *Rs*w(*λ*) that contains errors due to aerosol estimation only; (iii) the noise error is calculated from *Rs*w(*λ*) that contains sensor noise only. Note that the model error adds up during the inversion in the last two steps. Equation (34) can then be used to estimate the relative contribution of each error component to the total error of the IOPs.
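Under the unit-weight assumption of equation (34), the relative contribution of each component is simply its share of the summed variance. The sketch below shows this bookkeeping; the function name and the three variance values are hypothetical and do not correspond to the entries of table (1).

```python
def relative_contributions(var_inv, var_ner, var_a):
    """Percentage share of each variance in the sum of equation (34)."""
    total = var_inv + var_ner + var_a
    if total <= 0.0:
        raise ValueError("the summed variance must be positive")
    return {
        "model": 100.0 * var_inv / total,
        "noise": 100.0 * var_ner / total,
        "aerosol": 100.0 * var_a / total,
    }

# Hypothetical variances for one IOP (illustrative values only).
shares = relative_contributions(var_inv=2.0e-4, var_ner=1.5e-4, var_a=6.5e-4)
print(shares)
```

By construction the three percentages add up to 100, so an increase in one share necessarily lowers the others.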

Errors due to atmospheric correction are the major source of errors in the derived IOPs. Imperfect atmospheric correction, due to the variability of aerosol optical thickness, is responsible for more than 50%, and up to 82%, of the total error. One fifth of the total error on the derived IOPs (one tenth for the SPM scattering) is attributed to noise error. Model error has the lowest contribution (≈7%) to the total error on derived *b*spm(550) values, but it has a significant contribution (≈16%) to *y*. This can be attributed to the assumed parametrization. On the one hand, the absorption of constituents other than water molecules is negligible in the near infrared (NIR), which stabilizes the derived SPM scattering coefficient (one-to-one relation) and leads to a significant contribution from the atmosphere in the NIR region. On the other hand, the error in *b*spm(*λ*) will decrease towards the NIR region due to the assumed exponential spectral dependency. In general, model-induced errors are large for the spectral shape coefficients *y* and *s*. Note that the spectral shape of chlorophyll-a absorption is embedded in the coefficient *a*ph(440).


Table 1. The average relative contribution (%) of error components on IOCCG data set.

#### **6.5.2 NOMAD**


The total error on the IOPs estimated from the NOMAD data is derived from the values presented in figure (3). Model-induced errors are subtracted from the total error using equation (33) to deduce atmospheric and noise-induced errors. The results are shown as percentages in table (2). The main uncertainty is due to atmospheric and noise-induced errors for *a*ph(440) and *b*spm(550), while model inversion is the main source of error for *a*dg(440) in this data set. These results are within the validity of the linear assumption expressed in equation (33) and the imposed values of *s* and *y*.
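The subtraction described above can be sketched in a few lines, assuming unit weights so that the residual variance is the total variance minus the model-induced variance. The function name, the clipping of negative residuals to zero and the input numbers are assumptions of this example.

```python
import numpy as np

def split_total_error(total_var, model_var):
    """Split a total variance into model and residual (atmosphere + noise) parts."""
    total_var = np.asarray(total_var, dtype=float)
    model_var = np.asarray(model_var, dtype=float)
    # Residual variance; negative differences are clipped to zero (assumption).
    residual = np.clip(total_var - model_var, 0.0, None)
    model_pct = 100.0 * np.minimum(model_var, total_var) / total_var
    return model_pct, 100.0 - model_pct, residual

# Two hypothetical retrievals; the second has a model error above the total.
model_pct, residual_pct, residual = split_total_error(
    total_var=[2.0, 1.0], model_var=[0.5, 1.5]
)
print(model_pct, residual_pct, residual)
```

The second retrieval shows the limitation of the linear assumption: when the simulated model error exceeds the observed total error, the residual share can only be reported as zero.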


Table 2. The average relative contribution (%) of error components on SeaWiFS observations in the NOMAD data set.

#### **6.5.3 Model-sensor error table**

The linear sum of individual variances in equation (34) can describe about 70% (the value of *R*<sup>2</sup>) of the total variance of the IOPs. This linearization of the total variance is a simple yet effective approach. It allows us to estimate the relative contribution of the different error components to the total error budget of the IOPs. The relative contributions of model, noise and atmospheric errors to the total error budget for the IOCCG data set are 20-40%, 10-25% and 40-80%, respectively. Model-induced errors, due to approximation and inversion, are inherent to the derived IOPs and inversely proportional to the degrees of freedom of the model inversion, while atmospheric-induced errors are the major contributor to the total error budget of the IOPs. These results hold for the assumed levels of noise and atmospheric fluctuations.

This suggests that an error table can be generated for a specific model, sensor and range of IOPs. Such a model-sensor error table can serve as a benchmark to estimate the atmospheric-induced errors in the derived IOPs. The merit of this argument rests on the fact that model- and noise-induced errors can be quantified using water radiative transfer simulations, for a specific range of IOPs, and the known NER of the sensor. The magnitudes of these errors are, in principle, known for the ocean color model and the sensor used. An example of such a table for the MERIS sensor and the ocean color model is shown in table (3). This table is computed from table (1) for the MERIS visible bands centered at [412, 443, 490, 510, 560, 620, 665, 708, 778] nm, i.e. we simply reduced the spectral bands of the IOCCG data set to fit those of MERIS. Table (3) shows that the reduced number of spectral bands for the MERIS setup has increased the model contribution to the total error approximately twofold. This reduces the noise and atmospheric contributions to the total error, since the relative contributions of all error components must sum to 1 (100%). Note that for weak radiometric signals the lower bound might take negative values, which leads to a further reduction in the number of bands (negative values are set to zero).

This approach is demonstrated for ocean color observations obtained from the NOMAD data set. Model- and noise-induced errors are simulated from the IOCCG data set and subtracted from the total error of the IOPs estimated from the NOMAD data set. The simulation is carried out simply by selecting the IOCCG wavelengths that correspond to the NOMAD spectral set-up. The simplicity of this approach can limit the accuracy of equation (34). On the one hand, the method shows that model approximation and inversion are the main contributors, ≈57%, to the total error of *a*dg(440). On the other hand, the presented stochastic method quantified these errors most efficiently. Atmospheric and noise-induced errors are significant for *a*ph(440) and *b*spm(550). This may suggest that model-induced errors are better quantified with the current method. However, errors of the SPM scattering coefficient, which are mainly due to atmospheric residuals, are reproduced with high accuracy.


| IOPs | Model | Noise | Aerosol |
|---|---|---|---|
| *a*ph(440) | 40 | 13 | 47 |
| *a*dg(440) | 41 | 13 | 46 |
| *b*spm(550) | 45 | 5 | 50 |
| *y* | 42 | 19 | 39 |
| *s* | 42 | 16 | 42 |

Table 3. The average relative contribution (%) of error components on derived IOPs using the ocean color model (equation 1) and simulated MERIS bands from IOCCG data set.

#### **7. Spectral propagation of errors and error correlation**

The presented errors of IOPs were for two wavelengths: 440 nm for the absorption coefficients and 550 nm for the scattering coefficient, as defined by equation (7). We can use the parameterizations in equations (4), (5) and (6) to derive an analytical description of error propagation to other wavelengths. Below we provide this analytical derivation together with numerical examples for two wavelengths: one in the blue, 400 nm, and the other in the red, 680 nm.

The errors of the IOPs propagate to shorter and longer wavelengths following the parameterizations in equations (4), (5) and (6). For example, the error in *b*spm(*λ*) has two main components: one from *b*spm(550) itself and the other from the spectral shape *y*. Using the parametrization in equation (6) we have:

$$
\Delta b\_{\rm spm}(\lambda) = \frac{\partial b\_{\rm spm}(\lambda)}{\partial b\_{\rm spm}(550)} \Delta b\_{\rm spm}(550) + \frac{\partial b\_{\rm spm}(\lambda)}{\partial y} \Delta y + \frac{\partial b\_{\rm spm}(\lambda)}{\partial \lambda} \Delta \lambda \tag{35}
$$

Carrying out the partial derivatives, equation (35) can be written as:

$$\begin{split} \Delta b\_{\rm spm}(\lambda) &= \left(\frac{550}{\lambda}\right)^{y} \Delta b\_{\rm spm}(550) \\ &+ b\_{\rm spm}(550) \left(\frac{550}{\lambda}\right)^{y} \ln \frac{550}{\lambda} \Delta y \\ &- \frac{y}{\lambda} b\_{\rm spm}(550) \left(\frac{550}{\lambda}\right)^{y} \Delta \lambda \end{split} \tag{36}$$

In this exercise we will neglect the error in the wavelength, i.e. Δ*λ* ≈ 0 and we will show that the derivative *∂*/*∂λ* is negligible.

Taking the two reference wavelengths, the blue at 400 nm and the red at 680 nm, and assuming *y* = 1.7, we have:

$$
\Delta b\_{\rm spm}(400) = 1.718 \Delta b\_{\rm spm}(550) + 0.547 b\_{\rm spm}(550) \Delta y \tag{37}
$$

$$
\Delta b\_{\rm spm}(680) = 0.697 \Delta b\_{\rm spm}(550) - 0.148 b\_{\rm spm}(550) \Delta y \tag{38}
$$

The wavelength variation term *∂*/*∂λ* in equations (37) and (38) is neglected. With *λ* expressed in nanometers, its magnitude is 7.3 × 10<sup>−3</sup> *b*spm(550)Δ*λ* and 1.74 × 10<sup>−3</sup> *b*spm(550)Δ*λ* for the blue and the red wavelengths, respectively.

Equations (37, 38) show that the error in SPM scattering coefficient at the blue wavelength is larger than that at the red wavelength if the relative error in the scattering coefficient satisfies the condition:

$$\frac{\Delta b\_{\rm spm}(550)}{b\_{\rm spm}(550)} > -0.681 \Delta y \tag{39}$$
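The coefficients in equations (37) and (38) and the threshold in condition (39) can be reproduced numerically. The following is a minimal sketch, assuming the power-law form *b*spm(*λ*) = *b*spm(550)(550/*λ*)<sup>*y*</sup> implied by equation (36); the function and variable names are illustrative and not part of the original method.

```python
import math


def bspm_error_coefficients(lam_nm, y, ref_nm=550.0):
    """Coefficients of equation (36), assuming b_spm(lam) = b_spm(550) * (550/lam)**y.

    Returns (c_b, c_y, c_lam) such that
    delta_b(lam) = c_b * delta_b550 + c_y * b550 * delta_y + c_lam * b550 * delta_lam.
    """
    ratio = ref_nm / lam_nm
    c_b = ratio ** y                  # d b(lam) / d b(550)
    c_y = c_b * math.log(ratio)       # d b(lam) / d y, per unit b(550)
    c_lam = -(y / lam_nm) * c_b       # d b(lam) / d lam, per unit b(550), in 1/nm
    return c_b, c_y, c_lam


blue = bspm_error_coefficients(400.0, 1.7)
red = bspm_error_coefficients(680.0, 1.7)

# The blue error exceeds the red error when
# delta_b550 / b550 > threshold * delta_y  (condition 39).
threshold = -(blue[1] - red[1]) / (blue[0] - red[0])

print("blue coefficients:", blue)
print("red coefficients:", red)
print("threshold on relative error:", threshold)
```

The third coefficient is the wavelength term that the text neglects; comparing it with the first two shows how small it is for wavelength errors of a few nanometers.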

In a similar approach we can quantify the propagated errors of *a*dg(440) to other wavelengths:

$$\begin{aligned} \Delta a\_{\rm dg}(\lambda) &= \exp\left[-s\left(\lambda - 440\right)\right] \Delta a\_{\rm dg}(440) \\ &- a\_{\rm dg}(440) \left(\lambda - 440\right) \exp\left[-s\left(\lambda - 440\right)\right] \Delta s \\ &- s \times a\_{\rm dg}(440) \exp\left[-s\left(\lambda - 440\right)\right] \Delta \lambda \end{aligned} \tag{40}$$

If we assume the value *s* = 0.021 nm<sup>−1</sup> and take our reference bands to be the blue (400 nm) and red (680 nm) wavelengths we will have, with *λ* in meters:

$$
\Delta a\_{\rm dg}(400) = 2.316 \Delta a\_{\rm dg}(440) + 92.654 \times 10^{-9} a\_{\rm dg}(440) \Delta s \tag{41}
$$

$$
\Delta a\_{\rm dg}(680) = 6.47 \times 10^{-3} \Delta a\_{\rm dg}(440) - 1.554 \times 10^{-9} a\_{\rm dg}(440) \Delta s \tag{42}
$$

The wavelength variation term *∂*/*∂λ* is also negligible. It takes the values −4.86 × 10<sup>−11</sup> *a*dg(440)Δ*λ* and −1.36 × 10<sup>−13</sup> *a*dg(440)Δ*λ* for the blue and the red wavelengths, respectively. The error at the blue will be larger than that at the red if the relative error of *a*dg(440) satisfies the condition (from equations 41 and 42):

$$\frac{\Delta a\_{\rm dg}(440)}{a\_{\rm dg}(440)} > -4.08 \times 10^{-8} \Delta s \tag{43}$$
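The same check can be made for the coefficients of equations (41) and (42). This is a minimal sketch, assuming the exponential form *a*dg(*λ*) = *a*dg(440) exp[−*s*(*λ* − 440)] that underlies equation (40); the names are illustrative. The code works in nanometers and converts the slope coefficient to meters at the end. The sign of the threshold follows directly from subtracting equation (42) from equation (41).

```python
import math

NM_TO_M = 1e-9  # conversion factor from nanometers to meters


def adg_error_coefficients(lam_nm, s_per_nm, ref_nm=440.0):
    """Coefficients of equation (40), assuming a_dg(lam) = a_dg(440) * exp(-s * (lam - 440)).

    Returns (c_a, c_s, c_lam) such that
    delta_a(lam) = c_a * delta_a440 + c_s * a440 * delta_s + c_lam * a440 * delta_lam,
    with wavelengths in nm and s in 1/nm.
    """
    decay = math.exp(-s_per_nm * (lam_nm - ref_nm))
    c_a = decay                          # d a(lam) / d a(440)
    c_s = -(lam_nm - ref_nm) * decay     # d a(lam) / d s, per unit a(440), in nm
    c_lam = -s_per_nm * decay            # d a(lam) / d lam, per unit a(440), in 1/nm
    return c_a, c_s, c_lam


blue = adg_error_coefficients(400.0, 0.021)
red = adg_error_coefficients(680.0, 0.021)

# Slope coefficients expressed with wavelength in meters, as in equations (41) and (42).
blue_cs_m = blue[1] * NM_TO_M
red_cs_m = red[1] * NM_TO_M

# The blue error exceeds the red error when
# delta_a440 / a440 > threshold_m * delta_s.
threshold_m = -(blue_cs_m - red_cs_m) / (blue[0] - red[0])

print("blue:", blue[0], blue_cs_m)
print("red:", red[0], red_cs_m)
print("threshold:", threshold_m)
```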

The parametrization of Chl-a absorption is based on the tabulated values *a*<sub>0</sub> and *a*<sub>1</sub>, see equation (4). These tabulated values are taken to be constant per wavelength, i.e. *a*ph(*λ*) is a function of *a*ph(440) only. The error in *a*ph(440) will propagate to other wavelengths following the derivative of equation (4):

$$
\Delta a\_{\rm ph}(\lambda) = \left[ a\_1(\lambda) + a\_0(\lambda) + a\_1(\lambda) \log a\_{\rm ph}(440) \right] \Delta a\_{\rm ph}(440) \tag{44}
$$

For the two reference bands, 400 nm and 680 nm, we will have:

$$
\Delta a\_{\rm ph}(400) = \left[ 0.731 + 0.012 \log a\_{\rm ph}(440) \right] \Delta a\_{\rm ph}(440) \tag{45}
$$

$$
\Delta a\_{\rm ph}(680) = \left[ 0.945 + 0.149 \log a\_{\rm ph}(440) \right] \Delta a\_{\rm ph}(440) \tag{46}
$$

From equations (45, 46) it can be shown that, for a positive error Δ*a*ph(440), the error at the blue band is larger than that at the red if the following condition is satisfied:

$$
\log a\_{\rm ph}(440) < -1.562 \tag{47}
$$
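The threshold −1.562 follows from the numerical coefficients printed in equations (45) and (46). The sketch below recomputes it under two assumptions that are not confirmed by the text: the propagation factor is the bracketed sum (*a*<sub>0</sub> + *a*<sub>1</sub>) + *a*<sub>1</sub> log *a*ph(440) multiplying Δ*a*ph(440), and "log" denotes the natural logarithm. The names are illustrative.

```python
import math

# (a0 + a1, a1) as printed in equations (45) and (46).
BLUE_400 = (0.731, 0.012)
RED_680 = (0.945, 0.149)


def aph_error_factor(coeffs, aph440):
    """Factor multiplying delta_aph(440), assuming the bracketed form of equation (44)
    and a natural logarithm (both are assumptions)."""
    const, slope = coeffs
    return const + slope * math.log(aph440)


# Value of log aph(440) at which the blue and red factors are equal.
log_threshold = (BLUE_400[0] - RED_680[0]) / (RED_680[1] - BLUE_400[1])

# Illustrative check on either side of the threshold.
clear_water = math.exp(-2.0)   # log aph(440) = -2.0, below the threshold
turbid_water = math.exp(-1.0)  # log aph(440) = -1.0, above the threshold

print("log threshold:", log_threshold)
print("blue > red below threshold:",
      aph_error_factor(BLUE_400, clear_water) > aph_error_factor(RED_680, clear_water))
print("blue > red above threshold:",
      aph_error_factor(BLUE_400, turbid_water) > aph_error_factor(RED_680, turbid_water))
```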

The analytical expressions in equations (36), (40) and (44) show that the errors are related to the absolute values of the IOPs. Therefore, the three error components are expected to be correlated with water turbidity, and hence with each other. The results of the numerical examples also demonstrate that the errors of *b*spm(*λ*) and *a*dg(*λ*) will be larger at the blue than at the red if the relative errors of *b*spm(550) and *a*dg(440) satisfy conditions (39) and (43), respectively. The error in *a*ph(440), in turn, will propagate to other wavelengths following equation (44) and will be larger at the blue if the condition in (47) is satisfied.

#### **8. Advantages and limitations of error estimation methods**

Estimated errors from the deterministic method (Bates & Watts, 1988) did not correspond to the actual values of RMSE. This is due to the atmospheric and noise radiometric fluctuations. These fluctuations are embedded in the observed signal and do not vary with the IOP values, i.e. they have a different response function. Their large fluctuations may cause an ill-conditioned Jacobian matrix that produces erroneous estimates, see (Bates & Watts, 1988, p. 59, cf. 1.36). It should, nevertheless, be emphasized that the deterministic method is a well-established technique to estimate retrieval errors. It can be used to quantify the combined accuracy of ocean color models and the parameterizations of IOPs, i.e. the model-parametrization setup. Its application produces a single (or ensemble) uncertainty measure for the collective errors in the derived IOPs relative to the radiometric uncertainty, without the need for model inversion or prior information on the radiometric errors.

The error decomposition exercise shows that atmospheric and NER-induced errors can be better quantified when prior knowledge is available. This is important for ocean color band-ratio or single-band algorithms, e.g. (Austin & Petzold, 1981; Salama et al., 2004). These algorithms are empirical in nature, i.e. a Jacobian matrix is not available, so deterministic methods to derive the error are not applicable. In contrast, the presented stochastic method is generic and can be applied to quantify the error of any derived bio-geophysical parameter regardless of the derivation method used. This holds if, besides the derived quantity, two other values are known a priori so that the IOP-triplet can be constructed.

The prior values were inferred from the quantiles of the populations. In practice this information is not available, but it could be estimated from historical measurements or from observations with high temporal frequency. The latter, high temporal sampling, can be realized using sensors on board geostationary satellites to quantify marine bio-geophysical parameters. For instance, the visible band of the Spinning Enhanced Visible and Infrared Imager (SEVIRI) on board the METEOSAT Second Generation satellite (MSG) can be used to quantify the concentrations of SPM (Neukermans et al., 2008). With the 15-minute repeat sampling cycle of MSG, the stochastic method can be applied to three consecutive acquisitions, i.e. every 45 minutes, to produce maps of SPM concentration and the related error. Such an error map provides vital input to the recently developed SPM assimilation model (Eleveld et al., 2008). Moreover, it can be used as a set of weights for merging ocean color products (Pottier et al., 2006). This generality of the presented stochastic method extends its applicability to fields other than ocean color. For example, Velde van der et al. (2008) developed a basis for Synthetic Aperture Radar (SAR)-based soil moisture downscaling methodologies.

One limitation of the presented stochastic method is the choice of the acceptance-rejection method. Although it facilitates the search for a unique pair of N(0,1) values, the derived *σ* becomes sensitive to the ratio in (23), i.e. sensitive to the lower and upper pair (**iop**<sub>u</sub>, **iop**<sub>l</sub>) in the IOP-triplet. This may have caused the 7∼10% failure to reproduce the values of the standard deviation. This failure can be attributed to small values of *α* ≪ 1, which produce large values of *σ*. These large values are further magnified by equation (30).

Using equation (34) to estimate the total error as a linear sum of all other error components is another limitation. Atmospheric or noise radiometric fluctuations can be interpreted by model inversion as high or low IOP values with a high goodness-of-fit. By the same reasoning, a bad fit to a very complex signal (turbid water with high SPM, CDOM and Chl-a contents) can be attributed to atmospheric and sensor noise errors, although the observed signal might be error-free.

Model-sensor error tables were simulated from the IOCCG data set without accounting for the sensor's band width and response function. More detailed simulations that include the band width and response function of the sensor and a specific range of the IOPs should be carried out to establish more accurate model-sensor error tables.

Although we showed that equation (34) is an acceptable approximation to the total variance, the computed relative contribution of errors should be treated with caution.
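The relative contributions discussed in this section rest on treating the total variance as a linear sum of component variances. The following minimal sketch shows how such percentages can be computed under that assumption; the component variances in `example` are hypothetical values chosen for illustration, while the rows of `TABLE3` are copied from Table 3.

```python
def relative_contributions(variances):
    """Return each component's share (%) of the summed variance.

    Assumes the total variance is the plain sum of the component variances,
    i.e. covariance terms between components are ignored.
    """
    total = sum(variances.values())
    if total <= 0:
        raise ValueError("total variance must be positive")
    return {name: 100.0 * value / total for name, value in variances.items()}


# Hypothetical component variances (arbitrary units), for illustration only.
example = {"model": 0.8, "noise": 0.3, "atmosphere": 0.9}
shares = relative_contributions(example)

# Rows of Table 3 as (model, noise, aerosol) contributions in percent.
TABLE3 = {
    "aph(440)": (40, 13, 47),
    "adg(440)": (41, 13, 46),
    "bspm(550)": (45, 5, 50),
    "y": (42, 19, 39),
    "s": (42, 16, 42),
}
row_sums = {iop: sum(row) for iop, row in TABLE3.items()}

print(shares)
print(row_sums)
```

Because covariances between the components are dropped, the shares always sum to 100% by construction, which is one reason the computed contributions should be read as indicative rather than exact.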

#### **9. Summary and future developments**

In this chapter we reviewed the recent advances in uncertainty estimation of earth observation products of water quality. Both deterministic and stochastic methods were presented and their results were intercompared. The stochastic method is more appropriate for estimating the actual errors of ocean color derived products than the deterministic methods; however, it is still limited to a few studies and, like the deterministic approach, requires prior information. The uncertainties can be decomposed only if additional information is provided a priori. Using a simple exercise it was shown that atmospheric-induced errors are major contributors to the total error of IOPs, whereas model-induced errors are inherent to the derived IOPs and depend on the derivation method used and the number of spectral bands.

The error in this chapter was estimated as the difference between ground truth measurements and satellite-derived products. Direct matching between earth observation data and field measurements taken just above the water embeds, however, an inherent scale difference. This scale difference between an *in situ* observation and a pixel of an ocean color satellite is at least three to four orders of magnitude for nadir match-up sites and much larger for off-nadir ones. This huge scale difference means that a point measurement samples a tiny fraction of the water body observed by a satellite pixel. Few studies have addressed the scale difference between point and aerospace measurements directly. Most of these studies have used re-sampling to smooth out the scale differences in the match-up sites, see (Bailey & Werdell, 2006; Bissett et al., 2004; Harding Jr. et al., 2005; Hu et al., 2000). For example, Hyde et al. (2007) applied a correction algorithm to SeaWiFS products of chlorophyll-a to overcome the mismatch, which was partially due to sampling size differences. Although this assumption of spatial homogeneity has resulted in good matches for most open ocean match-up data (Carder et al., 2004; Garcia et al., 2005; Karl & Lukas, 1996; McClain et al., n.d.), it lowers the percentage of usable match-up points considerably (Hooker & McClain, 2000; Mélin et al., 2005) and should be avoided for productive waters (Chang & Gould, 2006; Darecki & Stramski, 2004; Harding Jr. et al., 2005). Salama & Su (2010; 2011) used the differences between the earth observation products and *in situ* data to quantify the sub-pixel spatial variability using the deterministic and stochastic methods, respectively, while neglecting the error. In principle, the mismatch between earth observation derived products and *in situ* measured quantities is attributed to the scale difference and to errors due to noise, correction and retrieval accuracy.

Current uncertainty estimation methods do not consider the spatial dependency of errors and their relationship to the actual distribution of IOPs. Understanding the spatial characteristics of errors is necessary to resolve the smallest sub-scale variability of the IOPs. This aspect should be investigated in the future to define spatial thresholds of measurable physical processes based on their errors. Moreover, the dependency of both deterministic and stochastic methods on the radiometric uncertainties limits their accuracy and restricts their application to cases where such data are available with an acceptable degree of confidence. A self-consistent and operational method is still required to estimate the uncertainties of IOPs without additional inputs or assumptions on the radiometric fluctuations.

#### **10. Acknowledgment**

The authors would like to thank NASA Ocean Biology Processing Group and individual data contributors for maintaining and updating the SeaBASS database.

#### **11. References**



Goody, R. (1964). *Atmospheric radiation 1, theoretical basis*, Oxford University Press.

Gordon, H. (1997). Atmospheric correction of ocean color imagery in the earth observing system era, *Journal Of Geophysical Research* 102(D14): 17081–17106.

Gordon, H., Brown, J. & Evans, R. (1988). Exact rayleigh scattering calculation for the use with the nimbus-7 coastal zone color scanner, *Applied Optics* 27(5): 862–871.

Gordon, H., Brown, O., Evans, R., Brown, J., Smith, R., Baker, K. & Clark, D. (1988). A semianalytical radiance model of ocean color, *Journal Of Geophysical Research* 93(D9): 10,909–10,924.

Gordon, H., Clark, D., Brown, J., Brown, O., Evans, R. & Broenkow, W. (1983). Phytoplankton pigment concentrations in the Middle Atlantic Bight: Comparison of ship determinations and CZCS estimates, *Applied Optics* 22(1): 20–36.

Goutis, C. & Robert, C. (1998). Model choice in generalised linear models: A bayesian approach via kullback-leibler projections, *Biometrika* 85(1): 22–37.

Harding Jr., L., Magnuson, A. & Mallonee, M. (2005). SeaWiFS retrievals of chlorophyll in Chesapeake Bay and the Mid Atlantic Bight, *Estuarine, Coastal and Shelf Science* 62(1-2): 75–94.

Holben, B., Eck, T., Slutsker, I., Sospedra, E., Caselles, V., Coll, C., Valor, E. & Rubio, E. (2000). Validation of cloud detection algorithms, *Remote Sensing in the 21st Century: Economic and Environmental Applications* pp. 119–123.

Hoogenboom, H. & Dekker, A. (1998). The sensitivity of medium resolution imaging spectrometer (MERIS) for detecting chlorophyll and seston dry weight in coastal and inland waters, *IEEE Proceeding on Geoscience and Remote Sensing*, Vol. 1 of *IGARSS*, IEEE, pp. 183–185.

Hooker, S. & McClain, C. (2000). The calibration and validation of SeaWiFS data, *Progress In Oceanography* 45(3-4): 427–465.

Hu, C., Carder, K. & Muller-Karger, F. (2000). How precise are SeaWiFS ocean color estimates? implications of digitization-noise errors, *Remote Sensing of Environment* 76(2): 239–249.

Hu, C., Chen, Z., Clayton, T., Swarzenski, P., Brock, J. & Muller-Karger, F. (2004). Assessment of estuarine water-quality indicators using modis medium-resolution bands: Initial results from Tampa Bay, FL, *Remote Sensing of Environment* 93(3): 423–441.

Hyde, K., O'Reilly, J. & Oviatt, C. (2007). Validation of SeaWiFS chlorophyll-a in Massachusetts Bay, *Continental Shelf Research* 27(12): 1677–1691.

Jaynes, E. (1957a). Information theory and statistical mechanics, *Physical Review* 106: 620–630.

Jaynes, E. (1957b). Information theory and statistical mechanics, *Physical Review* 108: 171–190.

Jaynes, E. (1968). Prior probabilities, *IEEE Transactions On Systems Science and Cybernetics* 4(3): 227–241.

Johnson, W. & Geisser, S. (1985). Estimative influence measures of the multivariate general linear model, *Journal of Statistical Planning Inference* 11: 33–56.

Karl, D. & Lukas, R. (1996). The hawaii ocean time-series (hot) program: Background, rationale and field implementation, *Deep Sea Research Part II: Topical Studies in Oceanography* 43(1-2): 129–156.

Kendall, M. D. & Stuart, A. (1987). *The advanced theory of statistics: Distribution theory*, Vol. 1, 5th ed., Griffin, London.

Kopelevich, O. (1983). Small-parameter model of optical properties of sea waters, *in* A. Monin (ed.), *Ocean Optics*, Vol. 1 Physical Ocean Optics, Nauka, pp. 208–234.

Kullback, S. & Leibler, R. (1951). On information and sufficiency, *The Annals of Mathematical Statistics* 22(1): 79–86.


Salama, M. S., Mélin, F. & Van der Velde, R. (2011). Ensemble uncertainty of inherent optical properties, *Optics Express* 19(18): 16772–16783.

Salama, M. & Stein, A. (2009). Error decomposition and estimation of inherent optical properties, *Applied Optics* 48(26): 4947–4962.

Salama, M. & Su, Z. (2010). Bayesian model for matching the radiometric measurements of aerospace and field ocean color sensors, *Sensors* 10(8): 7561–7575.

Salama, M. & Su, Z. (2011). Resolving the subscale spatial variability of apparent and inherent optical properties in ocean color matchup sites, *IEEE Transactions On Geoscience and Remote Sensing* 49(7): 2612–2622.

Salama, S., Monbaliu, J. & Coppin, P. (2004). Atmospheric correction of advanced very high resolution radiometer imagery, *International Journal of Remote Sensing* 25(7-8): 1349–1355.

Shannon, C. (1948). A mathematical theory of communication, *The Bell System Technical Journal* 27(Reprinted with corrections): 379–423, 623–656.

Singh, V. (1998). *Entropy-based parameter estimation in Hydrology*, Vol. 30 of *Water Science and Technology Library*, Kluwer Academic Publishers, Dordrecht.

Sydor, M., Gould, R., Arnone, R., Haltrin, V. & Goode, W. (2004). Uniqueness in remote sensing of the inherent optical properties of ocean water, *Applied Optics* 43(10): 2156–2162.

Van Der Woerd, H. & Pasterkamp, R. (2008). Hydropt: A fast and flexible method to retrieve chlorophyll-a from multispectral satellite observations of optically complex coastal waters, *Remote Sensing of Environment* 112(4): 1795–1807.

Velde van der, R., Su, Z. & Ma, Y. (2008). Impact of soil moisture dynamics on asar σ° signatures and its spatial variability observed over the tibetan plateau, *Sensors* 8: 5479–5491.

Vermote, E., Tanre, D., Deuze, J., Herman, M. & Morcrette, J. (1997). Second simulation of the satellite signal in the solar spectrum, 6S: An overview, *IEEE Transactions on Geoscience and Remote Sensing* 35(3): 675–686.

Wang, P., Boss, E. & Roesler, C. (2005). Uncertainties of inherent optical properties obtained from semianalytical inversions of ocean color, *Applied Optics* 44(9): 4074–4084.

Werdell, J. & Bailey, S. (2005). An improved in-situ bio-optical data set for ocean color algorithm development and satellite data product validation, *Remote Sensing of Environment* 98(1): 122–140.

Zaneveld, R. (1994). Optical closure: from theory to measurement, *in* R. Spinrad, K. Carder & M. Perry (eds), *Ocean Optics*, Vol. 25 of *Oxford Monographs on Geology and Geophysics*, Oxford University Press, New York, p. 283.

### *Edited by Rustam B. Rustamov and Saida E. Salahova*

Today, space technology is used as an excellent instrument for Earth observation applications. Data is collected using satellites and other available platforms for remote sensing. Remote sensing data collection detects a wide range of electromagnetic energy which is emitting, transmitting, or reflecting from the Earth's surface. Appropriate detection systems are needed to implement further data processing. Space technology has been found to be a successful application for studying climate change, as current and past data can be dynamically compared. This book presents different aspects of climate change and discusses space technology applications.

Photo by RomoloTavani / iStock
