**Meet the editor**

Qiguang Miao is a professor and Ph.D. supervisor at the School of Computer Science and Technology, Xidian University. He received his doctoral degree in computer application technology from Xidian University in December 2005, and in 2012 he was supported by the Program for New Century Excellent Talents in University of the Ministry of Education. In teaching, he was twice named a Pacemaker of the Ten Excellent Teachers, in 2008 and 2011. His research interests include machine learning, intelligent image processing, and malware behavior analysis and understanding. As principal investigator, he is conducting or has completed 3 projects funded by the NSFC, 2 projects funded by the Shaanxi provincial natural science fund, and more than 10 projects funded by the National Defence Pre-research Foundation, the 863 Program, and the Weapons and Equipment fund. He has published over 50 papers in significant domestic and international journals and conferences.

Contents

**Preface VII**

Chapter 1 **Investigation of Image Fusion for Remote Sensing Application 1**
Dong Jiang, Dafang Zhuang and Yaohuan Huang

Chapter 2 **A Multi-Features Fusion of Multi-Temporal Hyperspectral Images Using a Cooperative GDD/SVM Method 19**
Selim Hemissi and Imed Riadh Farah

Chapter 3 **Multi-Frequency Image Fusion Based on MIMO UWB OFDM Synthetic Aperture Radar 37**
Md Anowar Hossain, Ibrahim Elshafiey and Majeed A. S. Alkanhal

Chapter 4 **High-Resolution and Hyperspectral Data Fusion for Classification 57**
Hina Pande and Poonam S. Tiwari

Chapter 5 **The Objective Evaluation Index (OEI) for Evaluation of Night Vision Colorization Techniques 79**
Yufeng Zheng, Wenjie Dong, Genshe Chen and Erik P. Blasch

Chapter 6 **A Trous Wavelet and Image Fusion 103**
Shaohui Chen

Chapter 7 **Image Fusion Based on Shearlets 113**
Miao Qiguang, Shi Cheng and Li Weisheng


## Preface

Image Fusion is an important branch of information fusion, and it is also an important tech‐ nology for image understanding and computer vision. The fusion process is to merging dif‐ ferent images into one to get more accurate description for the scene. The original images for image fusion are always obtained by several different image sensors, or the same sensor in different operating modes. The fused image can provide more effective information for fur‐ ther image processing, such as image segmentation, object detection and recognition. Image fusion is a new study field which combined with many different disciplines, such as sensors, signal processing, image processing, computer and artificial intelligence.

Since the 1990s, image fusion has been applied in many fields, such as remote sensing, robot vision, and medical imaging and diagnostics, and its applications have attracted increasing attention from scholars. In the past two decades a large body of research literature has appeared. This book is edited on the basis of these research results, and many scholars gave great help to this book.

This book consists of seven chapters. Chapter 1 introduces the applications of multi-spectral images in remote sensing techniques. This chapter is written by Dong Jiang, Dafang Zhuang, et al. It focuses on multi-spectral image fusion, and several traditional and improved algorithms are presented.

Chapter 2 introduces a multi-features fusion of multi-temporal hyperspectral images via a cooperative GDD/SVM method. This chapter is written by Selim Hemissi and Imed Riadh Farah. It mainly discusses feature fusion for hyperspectral data, and several feature extraction and classification algorithms are discussed.

Chapter 3 gives a novel image fusion algorithm for a wide-swath, high-resolution synthetic aperture radar (SAR) system based on a MIMO UWB-OFDM architecture. This chapter is written by Md Anowar Hossain, Ibrahim Elshafiey, et al.

Multiple-Input Multiple-Output (MIMO) radar has gradually attracted attention, yet only a few scholars have studied fusion technology for MIMO SAR. This chapter gives different fusion algorithms for multi-sensor and multi-frequency imagery to enhance the resolution of the SAR image.

Chapter 4 introduces high-resolution and hyperspectral data fusion and classification. This chapter is written by Hina Pande and Poonam S. Tiwari. A high-resolution image has a high spatial resolution, while hyperspectral data have a large number of measured wavelength bands. The purpose of the fusion is to obtain a new image which has the spatial resolution of the high-resolution image and preserves the spectral characteristics of the hyperspectral image. Several algorithms are introduced to achieve this goal, and many classification results based on the fused images are presented in this chapter.

Chapter 5 introduces a new metric for the objective evaluation of night vision colorization. This chapter is written by Yufeng Zheng. An evaluation metric judges whether a fused image is good or not. This chapter mainly focuses on how to objectively evaluate the image quality of colorized images. Some colorization techniques are introduced, and a new colorization metric, the OEI, is proposed.

Chapter 6 shows the application of the à trous wavelet in image fusion. The à trous wavelet is shift-invariant, a property the Mallat wavelet lacks, which makes it better suited to image fusion. In this chapter, the theory of the à trous wavelet is introduced, and several test experiments are shown.

Chapter 7 introduces image fusion algorithms based on shearlets. This chapter is written by Miao Qiguang, Shi Cheng, et al. The shearlet was proposed in 2005. In this chapter, the theory of shearlets is introduced. For multi-focus images, a novel fusion framework based on shearlets is proposed; for remote sensing images, a new fusion algorithm combined with PCNN is discussed.

This book focuses on the latest research achievements and, to some extent, reflects the current work of scholars in the field. Thanks to all the scholars who have contributed to this book.

> **Prof. Qiguang Miao**
> School of Computer Science and Technology
> Xidian University
> China

**Chapter 1**

## **Investigation of Image Fusion for Remote Sensing Application**

Dong Jiang, Dafang Zhuang and Yaohuan Huang

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/56946

### **1. Introduction**

Remote sensing techniques have proven to be powerful tools for monitoring the Earth's surface and atmosphere on a global, regional, and even local scale, by providing important coverage, mapping and classification of land cover features such as vegetation, soil, water and forests. The volume of remote sensing images continues to grow at an enormous rate due to advances in sensor technology for both high spatial and temporal resolution systems. Consequently, an increasing quantity of image data from airborne/satellite sensors has become available, including multi-resolution, multi-temporal, multi-frequency/spectral-band and multi-polarization images. Remote sensing information is convenient and easy to access over a large area at low cost, but owing to the impact of cloud, aerosol, solar elevation angle and bi-directional reflection, the surface energy parameters retrieved from remote sensing data are often missing; meanwhile, the seasonal variation of surface-parameter time series is also affected. To reduce such impacts, a time-composite method is generally adopted. The goal of multiple-sensor data fusion is to integrate complementary and redundant information to provide a composite image that can be used to better understand the entire scene.

#### **1.1. Definition of image fusion**

The definition of image fusion varies. For example:

**•** *Image fusion is the combination of two or more different images to form a new image by using a certain algorithm (Genderen and Pohl, 1994) [1].*

**•** *Image fusion is the process of combining information from two or more images of a scene into a single composite image that is more informative and is more suitable for visual perception or computer processing. (Guest editorial of Information Fusion, 2007) [2].*

**•** *Image fusion is a process of combining images, obtained by sensors of different wavelengths simultaneously viewing the same scene, to form a composite image. The composite image is formed to improve image content and to make it easier for the user to detect, recognize, and identify targets and increase his situational awareness. 2010. (http://www.hcltech.com/aerospace-and-defense/enhanced-vision-system/).*

© 2013 Jiang et al.; licensee InTech. This is a paper distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

2 New Advances in Image Fusion

Image fusion has proved to be an effective way to optimally utilize large volumes of imagery from multiple sources since the early 1990s. Multiple image fusion seeks to combine information from multiple sources to achieve inferences that are not feasible from a single sensor or source. It is the aim of image fusion to integrate different data in order to obtain more information than can be derived from each of the single sensor data alone [3].

This chapter focuses on multi-sensor image fusion in remote sensing. The fusion of information from sensors with different physical characteristics enhances the understanding of our surroundings and provides the basis for regional planning, decision-making, urban sprawl monitoring and land use/land cover classification, etc.

#### **1.2. Techniques and application of image fusion**

In the past decades image fusion has been applied to different fields such as pattern recognition, visual enhancement, object detection and area surveillance. In 1997, Hall and Llinas gave a general introduction to multi-sensor data fusion [4]. Another in-depth review paper on multiple sensor data fusion techniques was published in 1998 [3]. This paper explained the concepts, methods and applications of image fusion as a contribution to multi-sensor integration oriented data processing. Since then, image fusion has received increasing attention. Further scientific papers on image fusion have been published with an emphasis on improving fusion quality and finding more application areas. As a case in point, Simone *et al.* describe three typical applications of data fusion in remote sensing, such as obtaining elevation maps from synthetic aperture radar (SAR) interferometers, the fusion of multi-sensor and multi-temporal images, and the fusion of multi-frequency, multi-polarization and multi-resolution SAR images [5]. Quite a few survey papers have been published recently, providing overviews of the history, developments, and the current state of the art of image fusion in the image-based application fields [6-8], but recent development of multi-sensor data fusion in remote sensing fields has not been discussed in detail (Table 1).

| Data source | Objective | Authors | Time |
|---|---|---|---|
| SPOT HRV & ERS SAR | Automatic registration | Olivier Thepaut, Kidiyo Kpalma, Joseph Ronsin [9] | 1994 |
| Hyperspectral image & SAR image | Automatic target cueing | Tamar Peli, Mon Young, Robert Knox, Ken Ellis, Fredrick Bennet [10] | 1999 |
| Multifrequency, multipolarization SAR images | Land use classification | G. Simone, A. Farina, F.C. Morabito, S.B. Serpico, L. Bruzzone [5] | 2001 |
| Landsat ETM+ Pan band & CBERS-1 multiple spectral data | Methods comparison | Marcia L.S. Aguena, Nelson D.A. Mascarenhas [11] | 2006 |
| Landsat ETM+ & MODIS | Urban sprawl monitoring | Ying Lei, Dong Jiang, and Xiaohuan Yang [12] | 2007 |
| AVIRIS and LIDAR | Coastal mapping | Ahmed F. Elaksher [13] | 2008 |

**Table 1.** Examples of application of image fusion

#### **2. Advance in image fusion techniques**

#### **2.1. Categorization of image fusion techniques**

During the past two decades, several fusion techniques have been proposed. Most of these techniques are based on the compromise between the desired spatial enhancement and the spectral consistency. Among the hundreds of variations of image fusion techniques, the widely used methods include, but are not limited to, intensity-hue-saturation (IHS), high-pass filtering, principal component analysis (PCA), different arithmetic combinations (e.g. the Brovey transform), multi-resolution analysis-based methods (e.g. pyramid algorithms, the wavelet transform), and artificial neural networks (ANNs). This chapter will provide a general introduction to those selected methods with emphasis on new advances in the remote sensing field. In general, all the above-mentioned approaches can be divided into four different types: signal level, pixel level, feature level, and decision level image fusion [4].

**1.** Signal level fusion. In signal-based fusion, signals from different sensors are combined to create a new signal with a better signal-to-noise ratio than the original signals.

**2.** Pixel level fusion. Pixel-based fusion is performed on a pixel-by-pixel basis. It generates a fused image in which information associated with each pixel is determined from a set of pixels in the source images to improve the performance of image processing tasks such as segmentation.

**3.** Feature level fusion. Fusion at the feature level requires an extraction of objects recognized in the various data sources. It requires the extraction of salient features, which depend on their environment, such as pixel intensities, edges or textures. These similar features from the input images are fused.

**4.** Decision-level fusion consists of merging information at a higher level of abstraction, combining the results from multiple algorithms to yield a final fused decision. Input images are processed individually for information extraction. The obtained information is then combined by applying decision rules to reinforce the common interpretation.
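As a minimal numerical sketch of the signal-level idea described above (assuming NumPy is available; the sinusoidal "scene" and noise model are invented purely for illustration), averaging several co-registered noisy observations cancels independent noise and improves the signal-to-noise ratio:

```python
import numpy as np

rng = np.random.default_rng(0)
truth = np.sin(np.linspace(0.0, 2.0 * np.pi, 500))  # underlying scene signal

# Four co-registered sensors observe the same signal with independent noise.
observations = [truth + rng.normal(0.0, 0.5, truth.size) for _ in range(4)]

# Signal-level fusion by averaging: independent noise partially cancels,
# so the noise variance of the fused signal drops by roughly 1/N.
fused = np.mean(observations, axis=0)

mse_single = np.mean((observations[0] - truth) ** 2)
mse_fused = np.mean((fused - truth) ** 2)
```

With four sensors the residual noise variance falls to roughly a quarter of a single sensor's, which is the motivation for fusing at the signal level before any higher-level processing.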



#### **2.2. Advance in image fusion methods**

#### *2.2.1. Convenient image fusion methods*

The PCA transform converts inter-correlated multi-spectral (MS) bands into a new set of uncorrelated components. In this approach, the principal components of the MS image bands are first computed. The first principal component, which contains the most information of the image, is then substituted by the panchromatic image. Finally, the inverse principal component transform is performed to obtain the new RGB (Red, Green, and Blue) bands of the multi-spectral image from the principal components.
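The substitution procedure above can be sketched as follows. This is a simplified illustration assuming NumPy, with the MS and PAN images already co-registered and resampled to the same size; the function name and the histogram-matching step are our choices, not prescribed by the chapter:

```python
import numpy as np

def pca_fuse(ms, pan):
    """PCA pan-sharpening sketch: substitute the first principal
    component of the MS bands with the (statistics-matched) PAN band.

    ms:  (H, W, B) multi-spectral image; pan: (H, W) panchromatic image.
    """
    h, w, b = ms.shape
    x = ms.reshape(-1, b).astype(float)
    mean = x.mean(axis=0)
    xc = x - mean
    # Principal components from the band covariance matrix.
    eigvals, eigvecs = np.linalg.eigh(np.cov(xc, rowvar=False))
    eigvecs = eigvecs[:, np.argsort(eigvals)[::-1]]  # descending variance
    pcs = xc @ eigvecs                               # forward PCA transform
    # Match the PAN statistics to PC1, then substitute.
    p = pan.reshape(-1).astype(float)
    p = (p - p.mean()) / (p.std() + 1e-12) * pcs[:, 0].std() + pcs[:, 0].mean()
    pcs[:, 0] = p
    # Inverse transform back to the original band space.
    return (pcs @ eigvecs.T + mean).reshape(h, w, b)
```

Because the eigenvector matrix is orthonormal, the inverse transform is simply a multiplication by its transpose; if the PAN band were identical to PC1, the fusion would return the original MS image unchanged.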

The IHS fusion converts a color MS image from the RGB space into the IHS color space. Because the intensity (I) band resembles a panchromatic (PAN) image, it is replaced by a high-resolution PAN image in the fusion. A reverse IHS transform is then performed on the PAN together with the hue (H) and saturation (S) bands, resulting in an IHS fused image.
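For the common linear intensity model I = (R+G+B)/3, the forward transform, intensity substitution, and reverse transform collapse into a single detail-injection step. A sketch assuming NumPy (the function name is ours, and the linear model is only one of several IHS variants):

```python
import numpy as np

def ihs_fuse(rgb, pan):
    """IHS component-substitution sketch (linear intensity model).

    Replacing the intensity I = (R+G+B)/3 with the PAN band and
    inverting the transform reduces to adding (PAN - I) to each band.

    rgb: (H, W, 3) multi-spectral image; pan: (H, W) panchromatic image.
    """
    rgb = rgb.astype(float)
    intensity = rgb.mean(axis=2)              # the I band of the IHS model
    detail = pan.astype(float) - intensity    # spatial detail to inject
    return rgb + detail[:, :, None]
```

After fusion the intensity of the result equals the PAN band, while the band differences that encode hue and saturation are preserved.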

Different arithmetic combinations have been developed for image fusion. The Brovey transform, Synthetic Variable Ratio (SVR), and Ratio Enhancement (RE) techniques are some successful examples. The basic procedure of the Brovey transform first multiplies each MS band by the high-resolution PAN band, and then divides each product by the sum of the MS bands. The SVR and RE techniques are similar, but involve more sophisticated calculations for the MS sum to achieve better fusion quality.
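The Brovey procedure just described reduces to a per-pixel ratio. A sketch assuming NumPy; the function name and the small epsilon guard against division by zero are our additions:

```python
import numpy as np

def brovey_fuse(ms, pan, eps=1e-12):
    """Brovey transform sketch: each fused band is
    MS_band * PAN / sum(MS bands), i.e. the PAN intensity is
    redistributed according to the original spectral proportions.

    ms:  (H, W, B) multi-spectral image; pan: (H, W) panchromatic image.
    """
    ms = ms.astype(float)
    total = ms.sum(axis=2, keepdims=True) + eps  # guard against zero sums
    return ms * pan[:, :, None].astype(float) / total
```

A consequence of the formula is that the fused bands sum (almost exactly) to the PAN band at every pixel, which is why the Brovey transform sharpens spatial detail but can distort radiometry.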

The convenient fusion algorithms mentioned above have been widely used for relatively simple and time-efficient fusion schemes. However, several problems must be considered before their application: 1) these fusion algorithms generate a fused image from a set of pixels in the various sources, and such pixel-level fusion methods are very sensitive to registration accuracy, so co-registration of the input images at sub-pixel level is required; 2) one of the main limitations of the IHS and Brovey transforms is that the number of input spectral bands should be equal to or less than three at a time; 3) these image fusion methods are often successful at improving the spatial resolution, but they tend to distort the original spectral signatures to some extent [14, 15]. More recently, new techniques such as the wavelet transform seem to reduce the color distortion problem and to keep the statistical parameters invariable.

#### *2.2.2. Multi-resolution analysis-based methods*

Multi-resolution or multi-scale methods, such as pyramid transformation, have been adopted for data fusion since the early 1980s [16]. Pyramid-based image fusion methods, including the Laplacian pyramid transform, were all developed from the Gaussian pyramid transform, and have been modified and widely used [17, 18].

In 1989, Mallat put all the methods of wavelet construction into the framework of functional analysis and described the fast wavelet transform algorithm and a general method of constructing wavelet orthonormal bases. On this basis, the wavelet transform could be applied in practice to image decomposition and reconstruction [19, 20]. Wavelet transforms provide a framework in which an image is decomposed, with each level corresponding to a coarser resolution band. For example, when fusing an MS image with a high-resolution PAN image by wavelet fusion, the PAN image is first decomposed into a set of low-resolution PAN images with corresponding wavelet coefficients (spatial details) for each level. Individual bands of the MS image then replace the low-resolution PAN image at the resolution level of the original MS image. The high-resolution spatial detail is injected into each MS band by performing a reverse wavelet transform on each MS band together with the corresponding wavelet coefficients (Figure 1).

**Figure 1.** Generic flowchart of wavelet-based image fusion


In wavelet-based fusion schemes, detail information is extracted from the PAN image using wavelet transforms and injected into the MS image. Distortion of the spectral information is minimized compared to the standard methods. In order to achieve optimum fusion results, various wavelet-based fusion schemes have been tested by many researchers, and several new concepts/algorithms have been presented and discussed. Candes provided a method for fusing SAR and visible MS images using the Curvelet transform; the method proved more efficient than the wavelet transform for detecting edge information and denoising [21]. Curvelet-based image fusion has been used to merge Landsat ETM+ panchromatic and multi-spectral images; the proposed method simultaneously provides richer information in the spatial and spectral domains [22]. Donoho *et al.* presented a flexible multi-resolution, local, and directional image expansion using contour segments, the Contourlet transform, to solve the problem that the wavelet transform cannot efficiently represent linear and curved singularities in image processing [23]. The Contourlet transform provides a flexible number of directions and captures the intrinsic geometrical structure of images.

In general, as a typical feature-level fusion method, wavelet-based fusion performs evidently better than conventional methods in terms of minimizing color distortion and denoising effects. It has been one of the most popular fusion methods in remote sensing in recent years, and has become a standard module in many commercial image processing software packages, such as ENVI, PCI, and ERDAS. Problems and limitations associated with it include: (1) its computational complexity compared to the standard methods; (2) spectral content of small objects is often lost in the fused images; (3) it often requires the user to determine appropriate values for certain parameters (such as thresholds). The development of more sophisticated wavelet-based fusion algorithms (such as the Ridgelet, Curvelet, and Contourlet transforms) could improve performance, but these new schemes may incur greater complexity in computation and parameter setting.
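The detail-injection idea described above can be sketched with a single-level Haar transform written directly in NumPy. This is a minimal illustration, not the schemes cited in the text: real systems use deeper decompositions and dedicated libraries (e.g. PyWavelets), and the image names and shapes here are assumed.

```python
import numpy as np

def haar2d(img):
    """Single-level 2-D Haar transform: returns the approximation band
    and the (horizontal, vertical, diagonal) detail sub-bands."""
    a = (img[0::2, 0::2] + img[1::2, 0::2] + img[0::2, 1::2] + img[1::2, 1::2]) / 4
    h = (img[0::2, 0::2] - img[1::2, 0::2] + img[0::2, 1::2] - img[1::2, 1::2]) / 4
    v = (img[0::2, 0::2] + img[1::2, 0::2] - img[0::2, 1::2] - img[1::2, 1::2]) / 4
    d = (img[0::2, 0::2] - img[1::2, 0::2] - img[0::2, 1::2] + img[1::2, 1::2]) / 4
    return a, (h, v, d)

def ihaar2d(a, details):
    """Inverse of haar2d: rebuilds the image from the four sub-bands."""
    h, v, d = details
    rows, cols = a.shape
    img = np.empty((rows * 2, cols * 2))
    img[0::2, 0::2] = a + h + v + d
    img[1::2, 0::2] = a - h + v - d
    img[0::2, 1::2] = a + h - v - d
    img[1::2, 1::2] = a - h - v + d
    return img

def wavelet_fuse(ms_band, pan):
    """Keep the MS approximation (spectral content) and inject the PAN
    detail sub-bands (spatial content); both inputs must be co-registered
    and of the same even-sized shape."""
    a_ms, _ = haar2d(ms_band)
    _, det_pan = haar2d(pan)
    return ihaar2d(a_ms, det_pan)
```

Because only the detail sub-bands are replaced, the low-frequency (spectral) statistics of the MS band are preserved, which is exactly why these schemes distort color less than whole-band substitution.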

Investigation of Image Fusion for Remote Sensing Application
http://dx.doi.org/10.5772/56946

#### *2.2.3. Artificial neural network based fusion method*

Artificial neural networks (ANNs) have proven to be a more powerful and self-adaptive method of pattern recognition as compared to traditional linear and simple nonlinear analyses [24]. The ANN-based method employs a nonlinear response function that iterates many times in a special network structure in order to learn the complex functional relationship between input and output training data. The general schematic diagram of the ANN-based image fusion method can be seen in Figure 2.

**Figure 2.** General schematic diagram of the ANN-based image fusion method.

The input layer has several neurons, which represent the feature factors extracted and normalized from image A and image B. The function of each neuron is a sigmoid function given by [25]:

$$f(x) = \frac{1}{1 + e^{-x}} \tag{1}$$

In Figure 2, the hidden layer has several neurons and the output layer has one neuron (or more). The *i*th neuron of the input layer connects to the *j*th neuron of the hidden layer with weight *Wij*, and the weight between the *j*th neuron of the hidden layer and the *t*th neuron of the output layer is *Vjt* (in this case *t* = 1). The weighting function is used to simulate and recognize the response relationship between features of the fused image and the corresponding features of the original images (image A and image B). The ANN model is given as follows:


$$Y = \frac{1}{1 + \exp\left[-\left(\sum\_{j=1}^{q} V\_j H\_j - \gamma\right)\right]} \tag{2}$$

In equation (2), *Y* = pixel value of the fused image exported from the neural network model, *q* = number of hidden nodes (*q* ≈ 8 here), *Vj* = weight between the *j*th hidden node and the output node (in this case, there is only one output node), *γ* = threshold of the output node, and *Hj* = value exported from the *j*th hidden node:

$$H\_j = \frac{1}{1 + \exp\left[-\left(\sum\_{i=1}^{n} W\_{ij} a\_i - \theta\_j\right)\right]} \tag{3}$$

where *Wij* = weight between the *i*th input node and the *j*th hidden node, *ai* = value of the *i*th input factor, *n* = number of input nodes (*n* ≈ 5 here), and *θj* = threshold of the *j*th hidden node.
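Equations (1)–(3) amount to a single forward pass of a small feed-forward network. The sketch below is a direct transcription with *n* = 5 inputs and *q* = 8 hidden nodes as in the text; the weights are random placeholders, not trained values.

```python
import numpy as np

def sigmoid(x):                      # equation (1)
    return 1.0 / (1.0 + np.exp(-x))

def ann_fuse(a, W, theta, V, gamma):
    """Forward pass of the fusion network.
    a: (n,) input feature vector; W: (n, q) input-to-hidden weights;
    theta: (q,) hidden thresholds; V: (q,) hidden-to-output weights;
    gamma: output threshold.  Returns the fused pixel value Y."""
    H = sigmoid(a @ W - theta)       # equation (3): hidden-node outputs
    Y = sigmoid(V @ H - gamma)       # equation (2): single output node
    return Y

rng = np.random.default_rng(0)
n, q = 5, 8                          # node counts used in the text
a = rng.random(n)                    # placeholder block features
Y = ann_fuse(a, rng.random((n, q)), rng.random(q), rng.random(q), 0.5)
```

In practice *W*, *θ*, *V*, and *γ* are obtained by training (e.g. back-propagation) on the feature vectors described next.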

As the first step of ANN-based data fusion, two registered images are decomposed into several blocks of size M × N (Figure 2). Features of the corresponding blocks in the two original images are then extracted, and the normalized feature vectors fed to the neural network are constructed. The features used here to evaluate the fusion effect are normally spatial frequency, visibility, and edge. The next step is to select some vector samples to train the neural network. An ANN is a universal function approximator that directly adapts to any nonlinear function defined by a representative set of training data. Once trained, the ANN model can remember a functional relationship and be used for further calculations. For these reasons, the ANN concept has been adopted to develop strongly nonlinear models for multiple-sensor data fusion. Thomas *et al.* discussed the optimal fusion of TV and infrared images using artificial neural networks [26]. Since then, many neural network models have been proposed for image fusion, such as BP, SOFM, and ARTMAP neural networks. The BP algorithm has been used most widely; however, the convergence of BP networks is slow and the global minimum of the error space may not always be reached [27]. As an unsupervised network, the SOFM network clusters input samples through competitive learning, but the number of output neurons must be set before constructing the model [28]. The RBF neural network can approximate an objective function to any precision if enough hidden units are provided. The advantages of RBF network training include no iteration, few training parameters, high training speed, and simple processing and memory functions [29]. Hong explored the use of RBF neural networks combined with nearest-neighbor clustering, with membership weighting used for fusion; experiments show this method can obtain a better cluster-fusion effect with a proper width parameter [30].
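The block decomposition and the spatial-frequency feature mentioned above can be computed as follows, using the standard row/column-frequency definition; the block size and input image are placeholders.

```python
import numpy as np

def spatial_frequency(block):
    """SF = sqrt(RF^2 + CF^2): gradient energy along rows and columns."""
    rf = np.sqrt(np.mean(np.diff(block, axis=1) ** 2))  # row frequency
    cf = np.sqrt(np.mean(np.diff(block, axis=0) ** 2))  # column frequency
    return np.sqrt(rf ** 2 + cf ** 2)

def block_features(img, m, n):
    """Decompose a registered image into M x N blocks and return the
    spatial frequency of each block as one entry of the feature vector."""
    rows, cols = img.shape
    return np.array([
        spatial_frequency(img[i:i + m, j:j + n])
        for i in range(0, rows - m + 1, m)
        for j in range(0, cols - n + 1, n)
    ])
```

Per-block visibility and edge features would be computed analogously and stacked with the spatial frequency before normalization.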


Gail *et al.* used Adaptive Resonance Theory (ART) neural networks to form a new framework for self-organizing information fusion. The ARTMAP neural network can act as a self-organizing expert system to derive hierarchical knowledge structures from inconsistent training data [31]. ARTMAP information fusion resolves apparent contradictions in input pixel labels by assigning output classes to levels in a knowledge hierarchy. Wang *et al.* presented a feature-level image fusion method based on segmentation regions and neural networks. The results indicated that this combined fusion scheme was more efficient than traditional methods [32].

The ANN-based fusion method exploits the pattern recognition capabilities of artificial neural networks, and the learning capability of neural networks makes it feasible to customize the image fusion process. Many applications have indicated that ANN-based fusion methods have more advantages than traditional statistical methods, especially when the input multi-sensor data are incomplete or noisy. The method often serves as an efficient decision-level fusion tool owing to its self-learning character, especially in land use/land cover classification. In addition, the multiple-input, multiple-output framework makes it a possible approach for fusing high-dimensional data, such as long-term time-series data or hyper-spectral data.

#### *2.2.4. Dempster-Shafer evidence theory based fusion method*

Dempster-Shafer decision theory is considered a generalized Bayesian theory, used when the data contributing to the determination of the analysis of the images are subject to uncertainty. It allows support for a proposition to be distributed not only to the proposition itself but also to the union of propositions that include it. Huadong Wu *et al.* presented a system framework that manages information overlap and resolves conflicts, and the system provides generalizable architectural support that facilitates sensor fusion [33].

Compared with Bayesian theory, the Dempster-Shafer theory of evidence is closer to our human perception and reasoning processes. Its capability to assign uncertainty or ignorance to propositions is a powerful tool for dealing with a large range of problems that would otherwise seem intractable [33]. The Dempster-Shafer theory of evidence has been applied to image fusion using SPOT/HRV images and the NOAA/AVHRR series; the results show unambiguously the major improvement brought by such data fusion and the performance of the proposed method [34]. H. Borotschnig *et al.* compared three frameworks for information fusion and view planning using different uncertainty calculi: probability theory, possibility theory, and the Dempster-Shafer theory of evidence [35]. The results indicated that Dempster-Shafer decision-theory-based sensor fusion achieves much higher performance improvement, and it provides estimates of the imprecision and uncertainty of the information derived from different sources.
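The core operation of this framework, Dempster's rule of combination for two basic probability assignments over a frame of discernment, can be sketched as follows; the two sensors, the class labels, and the mass values are illustrative assumptions only.

```python
from itertools import product

def dempster_combine(m1, m2):
    """Combine two mass functions (dicts mapping frozenset -> mass)
    with Dempster's rule; the conflict mass K is renormalized away."""
    combined, conflict = {}, 0.0
    for (A, mA), (B, mB) in product(m1.items(), m2.items()):
        inter = A & B
        if inter:                                  # agreeing evidence
            combined[inter] = combined.get(inter, 0.0) + mA * mB
        else:                                      # conflicting evidence
            conflict += mA * mB
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    return {A: v / (1.0 - conflict) for A, v in combined.items()}

# Two sensors assigning belief over classes {urban, forest}; note that
# mass can sit on the whole frame (ignorance), unlike a Bayesian prior.
u, f = frozenset({"urban"}), frozenset({"forest"})
m1 = {u: 0.6, u | f: 0.4}                  # sensor 1: urban, some ignorance
m2 = {u: 0.5, f: 0.3, u | f: 0.2}          # sensor 2: mixed evidence
fused = dempster_combine(m1, m2)
```

The ability to place mass on the union `u | f` is exactly the "support for the union of propositions" described above.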

### **3. Applications of image fusion**

Image fusion has been widely used in many fields of remote sensing, such as object identification, classification, and change detection. The following paragraphs describe recent achievements of image fusion in more detail.

#### **3.1. Object identification**


The feature enhancement capability of image fusion is visually apparent in VIR/VIR combinations that often result in images superior to the original data. To maximize the amount of information extracted from satellite image data, useful products can be found in fused images [3]. An integrated system for automatic road mapping from high-resolution multi-spectral satellite imagery by information fusion was discussed by Jin *et al.* in 2005 [36]. Garzelli presents a solution to enhance the spatial resolution of MS images with high-resolution PAN data. The proposed method exploits the undecimated discrete wavelet transform and the vector multi-scale Kalman filter, which is used to model the injection process of wavelet details. Fusion simulations on spatially degraded data and fusion tests at the full scale reveal that accurate and reliable pan-sharpening is achieved by the proposed method [37]. A case study, which extracted artificial forest and residential areas using a high-spatial-resolution image and multi-spectral images, is shown as follows.

Forest classification and mapping provide an important basis for forest monitoring and ecological protection. Methods based on single pixels or only on spectral features cannot effectively distinguish forest types. Here we present an approach for extracting artificial forest areas using the SPOT-5 panchromatic band and multi-spectral images in Naban River National Nature Reserve, located in Jinghong City, Yunnan Province, South China. The resolution of the panchromatic band of the SPOT-5 image is 2.5 m and that of the multi-spectral bands is 10 m. The pan-sharpening fusion method is first used for panchromatic and multi-spectral data fusion of the SPOT-5 image data. Next, histogram equalization, median filtering, and the PCA method are used for spectral enhancement and denoising, so as to improve the multi-scale image segmentation effect. Compared with the original spectral data, the image textures of artificial forest after this pretreatment are regularly arranged and their visual texture features are very obvious, while the particle-size information of natural forests is significant, so forest classification can be easily achieved (Figure 3).

#### **3.2. Land use and land cover classification**

Classification of land use and land cover is one of the key tasks of remote sensing applications. The classification accuracy of remote sensing images is improved when multiple-source image data are introduced into the processing [3]. Images from microwave and optical sensors offer complementary information that helps in discriminating the different classes. As discussed in the work of Wu *et al.*, a multi-sensor decision-level image fusion algorithm based on fuzzy theory is used for classification of each sensor image, and the classification results are fused by the fusion rule. Interesting results were achieved, mainly the high-speed classification and efficient fusion of complementary information [38]. Land-use/land-cover classification has been improved using data fusion techniques such as ANN and the Dempster-Shafer theory of evidence; experimental results show excellent classification performance compared to existing classification techniques [39, 40]. Image fusion methods will lead to strong advances in land use/land cover classification by exploiting the complementarity of data with either high spatial resolution or high temporal repetitiveness.


**Figure 3.** Extracted artificial forest and residential areas using image fusion techniques; (a) Before fusion (Artificial forest), (b) Fused image (Artificial forest), (c) Before fusion (Natural forest), (d) Fused image (Natural forest and residential area).


For example, an Indian P5 panchromatic image (Figure 4b) of Yiwu City, Southeast China, with a spatial resolution of 2.18 m, acquired in 2007, was fused with the multi-spectral bands of China-Brazil CBERS data (spatial resolution: 19.2 m; Figure 4a) from the same year. The Brovey transform fusion method was used.
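The Brovey combination used for this example follows the ratio form given later in equation (4): each MS band is scaled by the PAN band and normalized by the sum of the MS bands. Below is a minimal NumPy sketch; the arrays are placeholders assumed to be co-registered and resampled to the PAN grid, and the small `eps` guard for zero-sum pixels is an implementation assumption.

```python
import numpy as np

def brovey_fuse(pan, b1, b2, b3, eps=1e-12):
    """Brovey transform: DN_fused = DN_pan * DN_bi / (DN_b1 + DN_b2 + DN_b3)
    for each multi-spectral band bi; eps avoids division by zero."""
    total = b1 + b2 + b3 + eps
    return np.stack([pan * b1 / total,
                     pan * b2 / total,
                     pan * b3 / total])

rng = np.random.default_rng(1)
pan = rng.random((4, 4))                 # placeholder high-resolution PAN
b1, b2, b3 = rng.random((3, 4, 4))       # placeholder resampled MS bands
fused = brovey_fuse(pan, b1, b2, b3)
```

Note that the per-pixel sum of the fused bands equals the PAN value, which is how the transform injects spatial detail while keeping the inter-band ratios of the MS image.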


**Figure 4.** Result of image fusion: CBERS MS and P5 PAN; (a) CBERS multiple spectral image (b) Fused image

**Figure 5.** Different land use types in fused image; (a) cultivated land (b) water (c) urban settlements

Results indicated that the accuracy of residential areas of Yiwu City derived from the fused image is much higher than that derived from the CBERS multi-spectral image (Figure 5).

#### **3.3. Change detection**


Change detection is the process of identifying differences in the state of an object or phenomenon by observing it at different times. It is an important process in monitoring and managing natural resources and urban development because it provides quantitative analysis of the spatial distribution of the population of interest [41]. Image fusion for change detection takes advantage of the different configurations of the platforms carrying the sensors. The combination of temporal images of the same place enhances information on changes that might have occurred in the observed area. Sensor image data with low temporal resolution and high spatial resolution can be fused with high-temporal-resolution data to enhance the changing information of certain ground objects. Madhavan *et al.* presented a decision-level fusion system that automatically fuses information from multi-spectral, multi-resolution, and multi-temporal high-resolution airborne data for change-detection analysis; changes are automatically detected in buildings, building structures, roofs, roof color, industrial structures, smaller vehicles, and vegetation [42]. An example of change detection using Landsat ETM+ and MODIS data is presented as follows.

Recent studies indicated that urban expansion can be efficiently monitored using satellite images with multi-temporal and multi-spatial resolution. For example, a Landsat ETM+ panchromatic image (Figure 6a) of Chongqing City, Southwest China, with a spatial resolution of 10 m, acquired in 2000, was fused with the daily-received multi-spectral bands of MODIS data (spatial resolution: 250 m) (Figure 6b) from 2006.

The Brovey transform fusion method was used:

$$DN\_{fused} = DN\_{pan} \times DN\_{b1} / \left(DN\_{b1} + DN\_{b2} + DN\_{b3}\right) \tag{4}$$


where *DNfused* is the digital number (DN) of the resulting fused image, produced from the input data in the three MODIS spectral bands (*DNb1*, *DNb2*, *DNb3*) combined with the high-resolution Landsat ETM+ Pan band (*DNpan*).
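Equation (4) can be sketched in NumPy. The array names, the toy values, and the zero-division guard are illustrative assumptions, not the authors' implementation, which would also require co-registering and resampling the 250 m MODIS bands onto the ETM+ Pan grid first:

```python
import numpy as np

def brovey_fuse(pan, b1, b2, b3, eps=1e-6):
    """Brovey-transform fusion of one multispectral band (Eq. 4).

    pan        -- high-resolution panchromatic band (2-D array)
    b1, b2, b3 -- spectral bands, already resampled to the Pan grid
    eps        -- guards against division by zero in dark pixels
    """
    pan, b1, b2, b3 = (np.asarray(a, dtype=np.float64)
                       for a in (pan, b1, b2, b3))
    total = b1 + b2 + b3
    # Each fused DN is pan * b1 / (b1 + b2 + b3), per Eq. (4)
    return pan * b1 / np.maximum(total, eps)

# Toy 2x2 example: b1 carries half of the total spectral signal,
# so every fused pixel equals pan * 0.5
pan = np.array([[120.0, 130.0], [140.0, 150.0]])
b1  = np.full((2, 2), 60.0)
b2  = np.full((2, 2), 30.0)
b3  = np.full((2, 2), 30.0)
print(brovey_fuse(pan, b1, b2, b3))  # [[60. 65.] [70. 75.]]
```

The same function would be applied once per spectral band (replacing `b1` in the numerator) to build a full fused composite.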

The building areas that remained unchanged from 2000 to 2006 appear grey-pink in the composed image (Figure 7), while the newly established buildings appear dark red and can be easily identified.

**Figure 6.** Satellite images of Chongqing City: a) ETM image, 2000; b) MODIS image, 2006

**Figure 7.** Fusion result of multiple sources images of Chongqing City


In recent years, object-oriented processing techniques have become more popular. Compared to traditional pixel-based image analysis, object-oriented change information is necessary in decision support systems and uncertainty management strategies. An in-depth paper presented by Ruvimbo *et al.* introduced the concept and applications of object-oriented change detection for urban areas [43]. In general, due to the extensive statistical and derived information available with the object-oriented approach, a number of change images can be presented depending on research objectives. In land use and land cover analysis, this level of precision is valuable, as analysis at the object level enables linkage with other GIS databases or derived socio-economic attributes.

#### **4. Discussion and conclusions**

Multi-sensor image fusion seeks to combine information from different images to obtain more inferences than can be derived from a single sensor. It is widely recognized as an efficient tool for improving overall performance in image-based applications. This chapter provides a state-of-the-art review of multi-sensor image fusion in the field of remote sensing. Below are some emerging challenges and recommendations.

#### **4.1. Improvements of fusion algorithms**

Among the hundreds of variations of image fusion techniques, the most widely used methods include IHS, PCA, the Brovey transform, the wavelet transform, and Artificial Neural Networks (ANN). For methods like IHS, PCA and the Brovey transform, which have lower complexity and faster processing time, the most significant problem is color distortion. Wavelet-based schemes perform better than those methods in terms of minimizing color distortion. The development of more sophisticated wavelet-based fusion algorithms (such as the Ridgelet, Curvelet, and Contourlet transforms) can evidently improve performance, but they often incur greater complexity in computation and parameter setting. Another challenge for existing fusion techniques is the ability to process hyper-spectral satellite sensor data. Artificial neural networks seem to be one possible approach to handle the high-dimensional nature of hyper-spectral satellite sensor data.
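The color distortion of component-substitution methods such as IHS can be seen in a minimal sketch. This is an illustrative simplification, assuming co-registered float arrays and a simple mean-of-bands intensity; a faithful IHS implementation would use the full RGB-to-IHS transform and histogram-match the Pan band to the intensity component first:

```python
import numpy as np

def ihs_like_pansharpen(rgb, pan, eps=1e-6):
    """Component-substitution sketch of IHS-style fusion.

    rgb -- (H, W, 3) low-resolution bands resampled to the Pan grid
    pan -- (H, W) high-resolution panchromatic band

    The intensity I = mean(R, G, B) is replaced by Pan; each band is
    scaled by the ratio Pan / I. Whenever Pan and I differ radiometrically,
    this ratio shifts all bands together -- the source of the spectral
    (color) distortion discussed above.
    """
    rgb = np.asarray(rgb, dtype=np.float64)
    pan = np.asarray(pan, dtype=np.float64)
    intensity = rgb.mean(axis=2)
    gain = pan / np.maximum(intensity, eps)
    return rgb * gain[..., None]

# Toy case: Pan is twice the multispectral intensity, so every band
# is doubled and the spectral content is visibly altered.
rgb = np.full((2, 2, 3), 50.0)
pan = np.full((2, 2), 100.0)
print(ihs_like_pansharpen(rgb, pan))  # all bands become 100.0
```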

#### **4.2. From image fusion to multiple algorithm fusion**

Each fusion method has its own advantages and limitations. Combining several different fusion schemes has proven to be a useful strategy that may achieve better-quality results. As a case in point, quite a few researchers have focused on incorporating the traditional IHS method into wavelet transforms, since the IHS fusion method performs well spatially while the wavelet methods perform well spectrally. However, the selection and arrangement of candidate fusion schemes is quite arbitrary and often depends on the user's experience. An optimal strategy for combining different fusion algorithms, in other words an 'algorithm fusion' strategy, is thus urgently needed. Further investigations are necessary in the following aspects: 1) design of a general framework for the combination of different fusion approaches; 2) development of new approaches that combine aspects of pixel-, feature-, and decision-level image fusion; 3) establishment of automatic quality assessment methods for the evaluation of fusion results.

#### **4.3. Establishment of an automatic quality assessment scheme**

Automatic quality assessment is highly desirable to evaluate the possible benefits of fusion, to determine an optimal setting of parameters for a certain fusion scheme, and to compare results obtained with different algorithms. Mathematical methods have been used to judge the quality of merged imagery with respect to the improvement of spatial resolution while preserving the spectral content of the data. Statistical indices, such as cross entropy, mean square error, and signal-to-noise ratio, have been used for evaluation purposes. While a few image fusion quality measures have recently been proposed, analytical studies of these measures have been lacking. The work of Chen *et al.* focused on one popular mutual-information-based quality measure and weighted averaging image fusion [44]. Zhao presented a new metric based on image phase congruency to assess the performance of image fusion algorithms [45]. However, in general, no automatic solution has been achieved that consistently produces high-quality fusion for different data sets. It is expected that fusing data from multiple independent sensors will offer the potential for better performance than can be achieved by either sensor alone, and will reduce vulnerability to sensor-specific countermeasures and deployment factors. We expect that future research will address new performance assessment criteria and automatic quality assessment methods [46].

Investigation of Image Fusion for Remote Sensing Application. http://dx.doi.org/10.5772/56946

#### **Author details**

Dong Jiang\*, Dafang Zhuang and Yaohuan Huang

\*Address all correspondence to: jiangd@igsnrr.ac.cn

State Key Lab of Resources and Environmental Information System, Institute of Geographical Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing, China

#### **References**

[1] Genderen, J. L. van, and Pohl, C. (1994). Image fusion: Issues, techniques and applications. Intelligent Image Fusion, Proceedings EARSeL Workshop, Strasbourg, France, 11 September 1994, edited by J. L. van Genderen and V. Cappellini (Enschede: ITC), pp. 18-26

[2] Guest editorial. (2007). Image fusion: Advances in the state of the art. *Information Fusion*, Vol. 8, pp. 114-118, ISSN 0018-9251

[3] Pohl, C.; Van Genderen, J.L. (1998). Multisensor image fusion in remote sensing: concepts, methods and applications. *Int. J. Remote Sens.*, Vol. 19, pp. 823-854

[4] Hall, L.; Llinas, J. (1997). An introduction to multisensor data fusion. *Proc. IEEE*, Vol. 85, pp. 6-23, ISSN 0018-9219

[5] Simone, G.; Farina, A.; Morabito, F.C.; Serpico, S.B.; Bruzzone, L. (2002). Image fusion techniques for remote sensing applications. *Information Fusion*, Vol. 3, pp. 3-15, ISSN 0018-9251

[6] Dasarathy, B.V. (2007). A special issue on image fusion: advances in the state of the art. *Information Fusion*, Vol. 8, pp. 113, ISSN 0018-9251

[7] Smith, M.I.; Heather, J.P. (2005). Review of image fusion technology in 2005. *In Proceedings of Defense and Security Symposium, Orlando, FL, USA, 2005*

[8] Blum, R.S.; Liu, Z. (2006). Multi-Sensor Image Fusion and Its Applications; special series on Signal Processing and Communications; *CRC Press: Boca Raton, FL, USA*, 2006

[9] Olivier Thépaut, Kidiyo Kpalma, Joseph Ronsin. (2000). Automatic registration of ERS and SPOT multisensor images in a data fusion context. *Forest Ecology and Management*, Vol. 123, pp. 93-100
[10] Tamar Peli, Mon Young, Robert Knox, Kenneth K. Ellis and Frederick Bennett. (1999). Feature-level sensor fusion. *Proc. SPIE*, Vol. 3719, pp. 332

[11] Marcia L.S. Aguena, Nelson D.A. Mascarenhas. (2006). Multispectral image data fusion using POCS and super-resolution. *Computer Vision and Image Understanding*, Vol. 102, pp. 178-187

[12] Ying Lei, Dong Jiang, and Xiaohuan Yang. (2007). Application of image fusion in urban expanding detection. *Journal of Geomatics*, Vol. 32, No. 3, pp. 4-5, ISSN 1007-3817

[13] Ahmed F. Elaksher. (2008). Fusion of hyperspectral images and lidar-based dems for coastal mapping. *Optics and Lasers in Engineering*, Vol. 46, pp. 493-498, ISSN 0143-8166

[14] Pouran, B. (2005). Comparison between four methods for data fusion of ETM+ multispectral and pan images. *Geo-spat. Inf. Sci.*, Vol. 8, pp. 112-122

[15] Adelson, C.H.; Bergen, J.R. (1984). Pyramid methods in image processing. *RCA Eng.*, Vol. 29, pp. 33-41

[16] Miao, Q.G.; Wang, B.S. (2007). Multi-sensor image fusion based on improved laplacian pyramid transform. *Acta Opti. Sin.*, Vol. 27, pp. 1605-1610, ISSN 1424-8220

[17] Xiang, J.; Su, X. (2009). A pyramid transform of image denoising algorithm based on morphology. *Acta Photon. Sin.*, Vol. 38, pp. 89-103, ISSN 1000-7032

[18] Mallat, S.G. (1989). A theory for multiresolution signal decomposition: the wavelet representation. *IEEE Trans. Pattern Anal. Mach. Intell.*, Vol. 11, pp. 674-693, ISSN 0162-8828

[19] Ganzalo, P.; Jesus, M.A. (2004). Wavelet-based image fusion tutorial. *Pattern Recognit.*, Vol. 37, pp. 1855-1872

[20] Ma, H.; Jia, C.Y.; Liu, S. (2005). Multisource image fusion based on wavelet transform. *Int. J. Inf. Technol.*, Vol. 11, pp. 81-91

[21] Candes, E.J.; Donoho, D.L. (2000). Curvelets: A Surprisingly Effective Nonadaptive Representation for Objects with Edges. Curves and Surfaces; *Vanderbilt University Press: Nashville, TN, USA*, pp. 105-120

[22] Choi, M.; Kim, RY.; Nam, MR. Fusion of multi-spectral and panchromatic satellite images using the Curvelet transform. *IEEE Geosci. Remote Sens. Lett.*, Vol. 2, pp. 136-140, ISSN 0196-2892

[23] Donoho, M.N.; Vetterli, M. (2002). Contourlets; *Academic Press: New York, NY, USA*, ISSN 0890-5401

[24] Dong, J.; Yang, X.; Clinton, N.; Wang, N. (2004). An artificial neural network model for estimating crop yields using remotely sensed information. *Int. J. Remote Sens.*, Vol. 25, pp. 1723-1732, ISSN 0143-1161

[25] Shutao, L.; Kwok, J.T.; Yaonan W. (2002). Multifocus image fusion using artificial neural networks. *Pattern Recognit. Lett.*, Vol. 23, pp. 985-997, ISSN 0167-8655

[26] Thomas, F.; Grzegorz, G. (1995). Optimal fusion of TV and infrared images using artificial neural networks. *In Proceedings of Applications and Science of Artificial Neural Networks, Orlando, FL, USA, April 21, 1995*; Vol. 2492, pp. 919-925

[27] Huang, W.; Jing, Z. (2007). Multi-focus image fusion using pulse coupled neural network. *Pattern Recognit. Lett.*, Vol. 28, pp. 1123-1132, ISSN 0167-8655

[28] Wu, Y.; Yang, W. (2003). Image fusion based on wavelet decomposition and evolutionary strategy. *Acta Opt. Sin.*, Vol. 23, pp. 671-676, ISSN 0253-2239

[29] Sun, Z.Z.; Fu, K.; Wu, Y.R. (2003). The high-resolution SAR image terrain classification algorithm based on mixed double hint layers RBFN model. *Acta Electron. Sin.*, Vol. 31, pp. 2040-2044

[30] Zhang, H.; Sun, X.N.; Zhao, L.; Liu, L. (2008). Image fusion algorithm using RBF neural networks. *Lect. Notes Comput. Sci.*, Vol. 9, pp. 417-424

[31] Gail, A.; Siegfried, M.; Ogas, J. (2005). Self-organizing information fusion and hierarchical knowledge discovery: a new framework using ARTMAP neural networks. *Neural Netw.*, Vol. 18, pp. 287-295

[32] Wang, R.; Bu, F.L.; Jin, H.; Li, L.H. (2007). A feature-level image fusion algorithm based on neural networks. *Bioinf. Biomed. Eng.*, Vol. 7, pp. 821-824

[33] Huadong Wu; Mel Siegel; Rainer Stiefelhagen; Jie Yang. (2002). Sensor Fusion Using Dempster-Shafer Theory. *IEEE Instrumentation and Measurement Technology Conference, Anchorage, AK, USA, 21-23 May 2002*

[34] S. Le Hégarat-Mascle, D. Richard, C. Ottlé. (2003). Multi-scale data fusion using Dempster-Shafer evidence theory. *Integrated Computer-Aided Engineering*, Vol. 10, No. 1, pp. 9-22, ISSN 1875-8835

[35] H. Borotschnig, L. Paletta, M. Prantl, and A. Pinz, Graz. (1999). A Comparison of Probabilistic, Possibilistic and Evidence Theoretic Fusion Schemes for Active Object Recognition. *Computing*, Vol. 62, pp. 293-319

[36] Jin, X.Y.; Davis, C.H. (2005). An integrated system for automatic road mapping from high-resolution multi-spectral satellite imagery by information fusion. *Inf. Fusion*, Vol. 6, pp. 257-273, ISSN 0018-9251

[37] Garzelli, A.; Nencini, F. (2007). Panchromatic sharpening of remote sensing images using a multiscale Kalman filter. *Pattern Recognit.*, Vol. 40, pp. 3568-3577, ISSN 0167-8655

[38] Wu, Y.; Yang, W. (2003). Image fusion based on wavelet decomposition and evolutionary strategy. *Acta Opt. Sin.*, Vol. 23, pp. 671-676


[39] Sarkar, A.; Banerjee, A.; Banerjee, N.; Brahma, S.; Kartikeyan, B.; Chakraborty, M.; Majumder, K.L. (2005). Landcover classification in MRF context using Dempster-Shafer fusion for multisensor imagery. *IEEE Trans. Image Processing*, Vol. 14, pp. 634-645, ISSN 1057-7149

[40] Liu, C.P.; Ma, X.H.; Cui, Z.M. (2007). Multi-source remote sensing image fusion classification based on DS evidence theory. *In Proceedings of Conference on Remote Sensing and GIS Data Processing and Applications; and Innovative Multispectral Technology and Applications, Wuhan, China, November 15–17, 2007*; Vol. 6790, part 2

[41] Rottensteiner, F.; Trinder, J.; Clode, S.; Kubik, K.; Lovell, B. (2004). Building detection by Dempster-Shafer fusion of LIDAR data and multispectral aerial imagery. *In Proceedings of the 17th International Conference on Pattern Recognition, Cambridge, UK, August 23–26, 2004*; Vol. 2, pp. 339-342, ISSN 1001-0920

[42] Madhavan, B.B.; Sasagawa, T.; Tachibana, K.; Mishra, K. (2005). A decision level fusion of ADS-40, TABI and AISA data. *Nippon Shashin Sokuryo Gakkai Gakujutsu Koenkai Happyo Ronbunshu*, Vol. 2005, pp. 163-166

[43] Ruvimbo, G.; Philippe, D.; Morgan, D. (2009). Object-oriented change detection for the city of Harare, Zimbabwe. *Exp. Syst. Appl.*, Vol. 36, pp. 571-588, ISSN 0013-8703

[44] Chen, Y.; Xue, Z.Y.; Blum, R.S. (2008). Theoretical analysis of an information-based quality measure for image fusion. *Information Fusion*, Vol. 9, pp. 161-175, ISSN 0018-9251

[45] Zhao, J.Y.; Laganiere, R.; Liu, Z. (2006). Image fusion algorithm assessment based on feature measurement. In Proceedings of the 1st International Conference on Innovative Computing, Information and Control, Beijing, China, August 30 – September 1, 2006; Vol. 2, pp. 701-704

[46] Dong Jiang; Dafang Zhuang; Yaohuan Huang; Jingying Fu. (2009). Advances in multi-sensor data fusion: algorithms and applications. *Sensors*, Vol. 9, No. 10, pp. 7771-7784, ISSN 1424-8220

**Chapter 2**

#### **A Multi-Features Fusion of Multi-Temporal Hyperspectral Images Using a Cooperative GDD/SVM Method**

Selim Hemissi and Imed Riadh Farah

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/56949

© 2013 Hemissi and Riadh Farah; licensee InTech. This is an open access chapter distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### **1. Introduction**

With the emergence of hyperspectral sensors, feature fusion has become increasingly important for image classification, indexing and retrieval. In this chapter, a cooperative fusion method, GDD/SVM (Generalized Dirichlet Distribution/Support Vector Machines), which involves heterogeneous features, is proposed for multi-temporal hyperspectral image classification. It differs from most previous approaches by incorporating the potential of generative models into a discriminative classifier. The multiple features, including 3D spectral features and textural features, can therefore be integrated in an efficient way into a unified robust framework. The experimental results on a series of Hyperion images show a precision of 92.64% and a recall of 91.87%. Experiments on the AVIRIS dataset also confirm the improved performance and show that this cooperative fusion approach is consistent across different test datasets.

#### **2. Problem statement**

The semantic categorization of remote-sensing images requires the analysis of many features of the images, such as texture, spectral profiles, etc. Current feature fusion approaches commonly concatenate different features. This generally gives good results, and several approaches have been proposed using this scheme. However, most of them are subject to various conditional constraints, such as noise and imperfection, which may leave such systems operating with degraded performance. Moreover, how to fuse heterogeneous features in a flexible way is still an open research question.
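The concatenation baseline criticized above can be sketched as follows; the array names and dimensions are illustrative stand-ins, not the chapter's actual spectral and textural descriptors:

```python
import numpy as np

# Hypothetical per-pixel descriptors for n samples:
n = 100
spectral = np.random.rand(n, 220)   # e.g., one reflectance value per band
texture  = np.random.rand(n, 8)     # e.g., a few texture statistics

# Naive fusion: stack heterogeneous features into one long vector.
# The two blocks live on different scales, so without normalization the
# larger block dominates any distance-based classifier -- one of the
# conditional constraints noted above.
fused = np.hstack([spectral, texture])
print(fused.shape)  # (100, 228)
```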


Similarly, in the area of Supervised Machine Learning (SML), diversity with respect to the errors committed by component classifiers has received much attention. Generative and discriminative approaches are two distinct schools of probabilistic machine learning. It has been shown that discriminative approaches such as SVM [1] outperform model-based approaches due to their flexibility in estimating decision boundaries. Conversely, since discriminative methods are concerned with boundaries, all the classes need to be estimated jointly [2]. Complementarily, one interesting characteristic that generative models have over discriminative ones is that they are learnt independently for each class. Moreover, owing to their modeling power, generative models are able to deal with missing data. An ideal fusion method should combine these two approaches in order to improve the classification accuracy.
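The generative-into-discriminative data flow can be illustrated schematically in NumPy. This sketch substitutes diagonal Gaussians for the chapter's GDD models and an argmax for the SVM stage, purely to show how independently fitted per-class generative scores become the input of a discriminative classifier:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two toy classes; each generative model is fit independently per class
# (the property noted above), here a diagonal Gaussian standing in for
# the Generalized Dirichlet Distribution of the actual method.
X0 = rng.normal(loc=0.0, scale=1.0, size=(50, 3))
X1 = rng.normal(loc=3.0, scale=1.0, size=(50, 3))

def fit_gaussian(X):
    return X.mean(axis=0), X.var(axis=0) + 1e-6

def log_likelihood(X, params):
    mu, var = params
    return -0.5 * (np.log(2 * np.pi * var) + (X - mu) ** 2 / var).sum(axis=1)

models = [fit_gaussian(X0), fit_gaussian(X1)]

# The generative scores define the input space of the discriminative
# stage: each sample is re-described by its per-class log-likelihoods,
# and an SVM would then be trained on these score vectors.
X = np.vstack([X0, X1])
scores = np.column_stack([log_likelihood(X, m) for m in models])
pred = scores.argmax(axis=1)    # argmax stands in for the SVM here
truth = np.repeat([0, 1], 50)
print((pred == truth).mean())   # near-perfect on this well-separated toy data
```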

#### **3. Theoretical background**

#### **3.1. Generalized Dirichlet distribution**

Priors based on Dirichlet location-scale mixture of normals are widely used to model densities as mixtures of normal kernels. A random density *f* arising from such a prior can be expressed as

$$f(y) = (\phi * P)(y) = \int \frac{1}{\sigma} \phi\left(\frac{y-\theta}{\sigma}\right) dP(\theta, \sigma), \tag{1}$$

where *φ*(·) is the standard normal density and the mixing distribution *P* follows a Dirichlet process.

[3] initiated a theoretical study of these priors for the problem of density estimation. They showed that if a density *f*0 satisfies certain conditions, then a Dirichlet location mixture of normals achieves posterior consistency at *f*0. Their conditions can be best summarized as *f*0 having a moment generating function on an open interval containing [−1, 1]. Ghosal and van der Vaart (2001) extended these results to rate calculations for the more general Dirichlet location-scale mixture prior. However, they restricted the scale parameter *σ* to a compact interval [*σ*, *σ̄*] ⊂ (0, ∞).

#### *3.1.1. Preliminaries*

To make this chapter relatively self-contained, we recall the definitions of posterior consistency in the context of density estimation and regression. These definitions formalize the concept that, in order to achieve consistency, the posterior should concentrate on arbitrarily small neighborhoods of the true model as more observations become available.

**Posterior consistency for density estimation:** Suppose *X*1, *X*2, ··· are independent and identically distributed according to an unknown density *f*0. We take the parameter space as F, a set of probability densities on the space of the observations, and consider a prior distribution Π on F. Then the posterior distribution Π(·|*X*1, ··· , *Xn*) given a sample *X*1, ··· , *Xn* is obtained as

$$\Pi(A|X_1,\cdots,X_n) = \frac{\int_A \prod_{i=1}^n f(X_i)\, d\Pi(f)}{\int_{\mathcal{F}} \prod_{i=1}^n f(X_i)\, d\Pi(f)}.$$

We say that the posterior achieves weak (or strong) posterior consistency at *f*0 if for any weak (or *L*1) neighborhood *U* of *f*0, Π(*U*|*X*1, *X*2, ··· , *Xn*) → 1 almost surely as *n* → ∞.

**Posterior consistency for regression:** Suppose one observes *Y*1, *Y*2, ··· from the model *Yi* = *α*0 + *β*0*xi* + *ǫi*, where the *xi* are known non-random covariate values and the *ǫi* are independent and identically distributed with an unknown symmetric density *f*0. The regression coefficients *α*0, *β*0 are also unknown. Here, it is appropriate to consider the parameter space as Θ = F∗ × **R** × **R**, where F∗ is a set of symmetric probability densities on **R**, with a prior Π on Θ. The posterior distribution Π(·|*Y*1, ··· , *Yn*) is then computed as

$$\Pi(A|Y_1,\cdots,Y_n) = \frac{\int_A \prod_{i=1}^n f(Y_i - \alpha - \beta x_i)\, d\Pi(f, \alpha, \beta)}{\int_{\Theta} \prod_{i=1}^n f(Y_i - \alpha - \beta x_i)\, d\Pi(f, \alpha, \beta)}.$$

We say that the posterior achieves weak consistency at (*f*0, *α*0, *β*0) if for any weak neighborhood *U* of *f*0 and any *δ* > 0,

$$\Pi\big((f, \alpha, \beta) : f \in U,\ |\alpha - \alpha_0| < \delta,\ |\beta - \beta_0| < \delta \,\big|\, Y_1, Y_2, \cdots, Y_n\big) \to 1$$

almost surely as *n* → ∞.

#### *3.1.2. Density estimation: weak consistency*

We start with weak posterior consistency for the problem of density estimation. Our main tool is the following theorem due to Schwartz (1965): a prior Π achieves weak posterior consistency at a density *f*0 if

$$\forall \epsilon > 0, \quad \Pi\left(f \in \mathcal{F} : \int f_0(x) \log \frac{f_0(x)}{f(x)}\, dx < \epsilon \right) > 0. \tag{2}$$

We use the notation *f*0 ∈ *KL*(Π) to indicate that a density *f*0 satisfies (2).

**General mixture priors:** First consider the case when the mixing distribution *P* in (1) follows some general distribution Π̃, not necessarily a Dirichlet process. It is *reasonable* to assume that the weak support of Π̃ contains all probability measures on **R** × **R**+ that are compactly supported. The next lemma reveals the implication of this property.

Consider an *f*0 ∈ F such that ∫ *x*² *f*0(*x*)*dx* < ∞. Suppose *f̃* = *φ* ∗ *P̃* is such that *P̃*((−*a*, *a*) × (*σ*, *σ̄*)) = 1 for some *a* > 0, 0 < *σ* < *σ̄*. Then for any *ǫ* > 0, there exists a weak neighborhood *W* of *P̃* such that for any *f* = *φ* ∗ *P* with *P* ∈ *W*,

$$\int f_0(x) \log \frac{\tilde{f}(x)}{f(x)}\, dx < \epsilon. \tag{3}$$

*X*1, ··· , *Xn* is obtained as,

*3.1.1. Preliminaries*

available.

process.

**3. Theoretical background**

**3.1. Generalized dirichlet distribution**

Similarly, in the area of Supervised Machine Learning (SML), diversity with respect to the errors committed by component classifiers has received much attention. Generative and discriminative approaches are two distinct schools of probabilistic machine learning. It has shown that discriminative approaches such as SVM [1] outperform model based approaches due to their flexibility in decision boundaries estimation. Conversely, since that discriminative methods are concerned with boundaries, all the classes need to be estimated conjointly [2]. Complementary, one of the interesting characteristics, that generative models have over discriminative ones, is that they are learnt independently for each class. Moreover, following their modeling power, generative models are able to deal with missing data. An ideal fusion method should combine these two approaches in order to improve the

Priors based on Dirichlet location-scale mixture of normals are widely used to model densities as mixtures of normal kernels. A random density *f* arising from such a prior

> 1 *σ φ*

where *φ*(·) is the standard normal density and the mixing distribution *P* follows a Dirichlet

[3] initiated a theoretical study of these priors for the problem of density estimation. They showed that if a density *f*<sup>0</sup> satisfies certain conditions, then a Dirichlet location mixture of normals achieves posterior consistency at *f*0. Their conditions can be best summarized as *f*<sup>0</sup> having a moment generating function on an open interval containing [−1, 1]. Ghosal and van der Vaart (2001) extended these results to rate calculations for the more general Dirichlet location-scale mixture prior. However, they restricted the scale parameter *σ* to a compact

To make this chapter relatively self-contained, we recall the definitions of posterior consistency in the context of density estimation and regression. These definitions formalize the concept that in order to achieve consistency, the posterior should concentrate on arbitrarily small neighborhoods of the true model when more observations are made

**Posterior consistency for density estimation:** Suppose *X*1, *X*2, ··· are independent and identically distributed according to an unknown density *f*0. We take the parameter space as F - a set of probability densities on the space of the observations and consider a prior distribution Π on F. Then the posterior distribution Π(·|*X*1, ··· , *Xn*) given a sample

 *y* − *θ σ*

*dP*(*θ*, *σ*), (1)

*f*(*y*)=(*φ* ∗ *P*)(*y*) =
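For readers who prefer a computational view, a random density of the form (1) can be simulated by truncated stick-breaking. The sketch below is illustrative only: the base measure (standard normal locations, scales drawn from a compact interval, echoing the restriction of Ghosal and van der Vaart (2001)) and the truncation level `n_atoms` are assumptions of ours, not part of the model specification above.

```python
import numpy as np

def draw_density(alpha=1.0, n_atoms=200, seed=0):
    """Draw f = phi * P, with P an approximate (truncated) Dirichlet process sample."""
    rng = np.random.default_rng(seed)
    # Stick-breaking weights: v_j ~ Beta(1, alpha), w_j = v_j * prod_{k<j} (1 - v_k)
    v = rng.beta(1.0, alpha, size=n_atoms)
    w = v * np.concatenate(([1.0], np.cumprod(1.0 - v)[:-1]))
    w = w / w.sum()  # renormalize the truncated weights
    # Atoms (theta_j, sigma_j) drawn from an assumed base measure G0
    theta = rng.normal(0.0, 1.0, size=n_atoms)   # locations
    sigma = rng.uniform(0.5, 2.0, size=n_atoms)  # scales kept in a compact interval
    def f(y):
        y = np.atleast_1d(np.asarray(y, dtype=float))[:, None]
        k = np.exp(-0.5 * ((y - theta) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))
        return (w * k).sum(axis=1)
    return f
```

Each call with a different seed yields a different random density; integrating any draw over a wide grid gives total mass approximately 1.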

#### *3.1.2. Density estimation: weak consistency*

We start with weak posterior consistency for the problem of density estimation. Our main tool is the following theorem due to Schwartz (1965).

A prior Π achieves weak posterior consistency at a density *f*0, if

$$\forall \epsilon > 0, \ \Pi\left( f \in \mathcal{F} : \int f_0(x) \log \frac{f_0(x)}{f(x)}\, dx < \epsilon \right) > 0 \tag{2}$$

We will use the notation *f*<sup>0</sup> ∈ *KL*(Π) to indicate that a density *f*<sup>0</sup> satisfies (2).
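Condition (2) asks that every Kullback–Leibler neighborhood of *f*<sup>0</sup> receive positive prior mass. As a concrete numerical illustration of the quantity involved (our own example, not part of the original argument), the divergence $\int f_0 \log(f_0/f)\, dx$ can be evaluated on a grid for a standard normal *f*<sup>0</sup> and a nearby two-component mixture *f*:

```python
import numpy as np

# Numerical KL divergence int f0 log(f0/f) dx via a fine Riemann sum.
x = np.linspace(-12.0, 12.0, 240001)
dx = x[1] - x[0]
phi = lambda t: np.exp(-0.5 * t ** 2) / np.sqrt(2.0 * np.pi)

f0 = phi(x)                                   # true density: N(0, 1)
f = 0.5 * phi(x + 1.0) + 0.5 * phi(x - 1.0)   # candidate: equal mixture at -1 and +1

kl = np.sum(f0 * np.log(f0 / f)) * dx         # small but strictly positive
```

For this pair one can check analytically that kl = 1/2 − E[log cosh *X*] with *X* ~ N(0, 1), so it lies strictly between 0 and 1/2; condition (2) requires the prior to charge densities *f* making this quantity smaller than any given *ǫ*.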

**General mixture priors.** First consider the case when the mixing distribution *P* in (1) follows some general distribution $\tilde{\Pi}$, not necessarily a Dirichlet process. It is *reasonable* to assume that the weak support of $\tilde{\Pi}$ contains all probability measures on **R** × **R**<sup>+</sup> that are compactly supported. The next lemma reveals the implication of this property.

Consider an *f*<sup>0</sup> ∈ F such that $\int x^2 f_0(x)\, dx < \infty$. Suppose $\tilde{f} = \phi * \tilde{P}$ is such that $\tilde{P}((-a, a) \times (\underline{\sigma}, \overline{\sigma})) = 1$ for some $a > 0$, $0 < \underline{\sigma} < \overline{\sigma}$. Then for any *ǫ* > 0, there exists a weak neighborhood *W* of $\tilde{P}$ such that for any $f = \phi * P$ with $P \in W$,

$$\int f\_0(\mathbf{x}) \log \frac{\tilde{f}(\mathbf{x})}{f(\mathbf{x})} d\mathbf{x} < \varepsilon \tag{3}$$

The proof of this lemma is similar to the proof of Theorem 3 of Ghosal *et al.* (1999) and we present it in the appendix. Here we state and prove the main result.


http://dx.doi.org/10.5772/56949


A Multi-Features Fusion of Multi-Temporal Hyperspectral Images Using a Cooperative GDD/SVM Method

**Theorem 3.1.2.** Let *f*0(*x*) be a continuous density on **R** satisfying:

1. *f*<sup>0</sup> is nowhere zero and bounded above by *M* < ∞.

2. $\left|\int_{\mathbb{R}} f_0(x) \log f_0(x)\, dx\right| < \infty$.

3. $\int_{\mathbb{R}} f_0(x) \log \frac{f_0(x)}{\psi_1(x)}\, dx < \infty$, where $\psi_1(x) = \inf_{t \in [x-1, x+1]} f_0(t)$.

4. ∃ *η* > 0 such that $\int_{\mathbb{R}} |x|^{2(1+\eta)} f_0(x)\, dx < \infty$.

Then, *f*<sup>0</sup> ∈ *KL*(Π).

Assumption *4* provides the important moment condition on *f*0. Assumption *2* is satisfied by most of the common densities and assumption *3* can be viewed as a regularity condition. The interval [*x* − 1, *x* + 1] that appears in assumption *3* can be replaced by [*x* − *a*, *x* + *a*] for any *a* > 0.
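For a concrete instance (our own check, not from the text), all four assumptions can be verified numerically for *f*<sup>0</sup> = N(0, 1): there $\psi_1(x) = \phi(|x|+1)$, and the integral in assumption *3* works out to $\mathbb{E}|X| + 1/2 = \sqrt{2/\pi} + 1/2$.

```python
import numpy as np

phi = lambda t: np.exp(-0.5 * t ** 2) / np.sqrt(2.0 * np.pi)
x = np.linspace(-12.0, 12.0, 240001)
dx = x[1] - x[0]
f0 = phi(x)

# Assumption 1: f0 > 0 everywhere, bounded by M = phi(0).
# Assumption 2: |int f0 log f0| is finite.
a2 = abs(np.sum(f0 * np.log(f0)) * dx)
# Assumption 3: psi_1(x) = inf of phi over [x-1, x+1] = phi(|x| + 1).
psi1 = phi(np.abs(x) + 1.0)
a3 = np.sum(f0 * np.log(f0 / psi1)) * dx
# Assumption 4: moment of order 2(1 + eta), here with eta = 1/2.
a4 = np.sum(np.abs(x) ** 3.0 * f0) * dx
```

All three numerical integrals are finite, and `a3` matches the closed-form value above.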

*Proof of Theorem 3.1.2.* Note that,

$$\int f\_0(\mathbf{x}) \log \frac{f\_0(\mathbf{x})}{f(\mathbf{x})} d\mathbf{x} = \int f\_0(\mathbf{x}) \log \frac{f\_0(\mathbf{x})}{\tilde{f}(\mathbf{x})} d\mathbf{x} + \int f\_0(\mathbf{x}) \log \frac{\tilde{f}(\mathbf{x})}{f(\mathbf{x})} d\mathbf{x}.\tag{4}$$

Therefore, the result would follow if for any *ǫ* > 0, we can find an $\tilde{f}$ which makes $\int f_0 \log \frac{f_0}{\tilde{f}}\, dx < \epsilon/2$ and also satisfies the condition of Lemma 3.1.2. Next we show how to construct such an $\tilde{f}$.

Consider the densities *fn* = *φ* ∗ *Pn*, *n* ≥ 1, with *Pn*'s constructed as,

$$dP\_n(\theta, \sigma) = t\_n I\_{\{\theta \in [-n, n]\}} f\_0(\theta) \delta\_{\sigma\_n}(\sigma) \tag{5}$$

where $\sigma_n = n^{-\eta}$, $t_n = \left(\int_{-n}^{n} f_0(y)\, dy\right)^{-1}$, $I_A$ is the indicator function of a set $A$ and $\delta_x$ is the point mass at a point $x$. Note that $f_n$ can be simply written as,

$$f_n(x) = t_n \int_{-n}^{n} \frac{1}{\sigma_n} \phi\left(\frac{x - \theta}{\sigma_n}\right) f_0(\theta)\, d\theta. \tag{6}$$

Find a positive constant *ξ* such that $\int_{-\xi}^{\xi} \phi(t)\, dt > 1 - \epsilon$. Now fix an *x* ∈ **R**. For sufficiently large *n* such that [*x* − *ξσn*, *x* + *ξσn*] ⊂ [−*n*, *n*], one obtains,

$$\inf_{y \in (x - \xi\sigma_n,\, x + \xi\sigma_n)} f_0(y)(1 - \epsilon) < \frac{f_n(x)}{t_n} < \sup_{y \in (x - \xi\sigma_n,\, x + \xi\sigma_n)} f_0(y) + M\epsilon \tag{7}$$

Since *tn* → 1 and *σ<sup>n</sup>* → 0, (7) would imply that *fn*(*x*) → *f*0(*x*) as *n* → ∞ by continuity of *f*0. Therefore one can conclude,


$$\log \frac{f\_0(\mathbf{x})}{f\_n(\mathbf{x})} \to 0 \quad \text{for all } \mathbf{x} \in \mathbb{R} \tag{8}$$

Since *tn* is a decreasing sequence and *f*0(*θ*) < *M* for all *θ* ∈ **R**, one can readily see that for all *n* ≥ 1 and all *x* ∈ **R**,

$$f_n(x) = t_n \int_{-n}^{n} \frac{1}{\sigma_n} \phi\left(\frac{x - \theta}{\sigma_n}\right) f_0(\theta)\, d\theta \le M t_n \le M t_1. \tag{9}$$

Now, fix an *x* ∈ **R**. Since, |*x* − *θ*|≤|*x*| + *n* for all *θ* ∈ [−*n*, *n*] and *tn* ≥ 1, it follows that for all *n* ≤ |*x*|,

$$f_n(x) \ge \frac{1}{\sigma_n} \phi\left(\frac{|x| + n}{\sigma_n}\right) = n^\eta \phi(n^\eta(|x| + n)) \ge |x|^\eta \phi(2|x|^{1 + \eta}). \tag{10}$$

The last inequality follows from the fact that $\tau^\eta \phi(\tau^\eta(|x| + \tau))$ is decreasing in $\tau$ for $\tau \ge 1$.

Let $\psi_n(x) = \inf_{t \in [x - \sigma_n, x + \sigma_n]} f_0(t)$. It may be noted that the function $\psi_1(x)$ of assumption *3* is consistent with this definition. Let $A_n = [-n, n] \cap [x - \sigma_n, x + \sigma_n]$ and $c = \int_0^1 \phi(t)\, dt < 1$. Observe that for all $n > |x|$,

$$f_n(x) \ge t_n \int_{A_n} \frac{1}{\sigma_n} \phi\left(\frac{x - \theta}{\sigma_n}\right) f_0(\theta)\, d\theta \ge t_n \psi_n(x) \int_{A_n} \frac{1}{\sigma_n} \phi\left(\frac{x - \theta}{\sigma_n}\right) d\theta \tag{11}$$

Since $t_n \ge 1$, $\psi_n(x) \ge \psi_1(x)$ and $\int_{A_n} \frac{1}{\sigma_n} \phi\left(\frac{x - \theta}{\sigma_n}\right) d\theta \ge \int_0^1 \phi(t)\, dt = c$ for all $n \ge 1$ and all $x \in \mathbb{R}$, it follows from (11) that $f_n(x) \ge c\psi_1(x)$ for all $n > |x|$. Therefore,

$$f_n(x) \ge \begin{cases} c\psi_1(x) & |x| < 1 \\ \min(|x|^\eta \phi(2|x|^{1+\eta}),\, c\psi_1(x)) & |x| \ge 1 \end{cases} \tag{12}$$

A little algebraic manipulation with (9) and (12) yields, for all *n* ≥ 1,

$$\left| \log \frac{f\_0(\mathbf{x})}{f\_n(\mathbf{x})} \right| \le \log \frac{Mt\_1}{f\_0(\mathbf{x})} + \log \frac{f\_0(\mathbf{x})}{c\psi\_1(\mathbf{x})} + I\_{\{|\mathbf{x}|>1\}} \log \frac{f\_0(\mathbf{x})}{|\mathbf{x}|^\eta \phi(2|\mathbf{x}|^{1+\eta})} \tag{13}$$

From the assumptions of Theorem 3.1.2, it can be easily verified that the function on the right hand side of the above display is *f*<sup>0</sup> integrable. Therefore an application of the dominated convergence theorem (DCT) to (8) implies that,

$$\lim\_{n \to \infty} \int f\_0(\mathbf{x}) \log \frac{f\_0(\mathbf{x})}{f\_n(\mathbf{x})} d\mathbf{x} = 0. \tag{14}$$

Therefore we can simply choose $\tilde{f} = f_{n_0}$ for some large enough $n_0$.
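The construction (5)–(6) can also be explored numerically. The sketch below is our own check, with *f*<sup>0</sup> = N(0, 1) and *η* = 1/2 as assumed inputs; it approximates $f_n$ on a grid and confirms that the Kullback–Leibler gap in (14) shrinks as *n* grows.

```python
import numpy as np

ETA = 0.5
grid = np.linspace(-8.0, 8.0, 1601)
dg = grid[1] - grid[0]

def f0(t):
    # true density: standard normal (satisfies assumptions 1-4)
    return np.exp(-0.5 * t ** 2) / np.sqrt(2.0 * np.pi)

def f_n(x, n):
    """f_n(x) = t_n * int_{-n}^{n} (1/sigma_n) phi((x - theta)/sigma_n) f0(theta) dtheta."""
    sigma = float(n) ** (-ETA)                 # sigma_n = n^(-eta)
    theta = np.linspace(-n, n, 4001)
    dth = theta[1] - theta[0]
    w = f0(theta)
    t_n = 1.0 / (w.sum() * dth)                # t_n = (int_{-n}^{n} f0)^(-1)
    x = np.atleast_1d(x)[:, None]
    kern = np.exp(-0.5 * ((x - theta) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))
    return t_n * (kern * w).sum(axis=1) * dth

def kl_to_f0(n):
    fn = f_n(grid, n)
    return np.sum(f0(grid) * np.log(f0(grid) / fn)) * dg
```

For this choice, $f_n$ is close to an N(0, 1 + *n*<sup>−2*η*</sup>) density, and the divergence is already well below 0.01 by *n* = 10.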

#### **3.2. Dirichlet mixture of normals**

Next we consider $\tilde{\Pi} = Dir(\alpha G_0)$, a Dirichlet process with parameter $\alpha G_0$. Here *α* is a positive constant and *G*<sup>0</sup> is a probability measure on **R** × **R**<sup>+</sup>.

**Lemma 3.2.** Suppose *f*<sup>0</sup> ∈ F satisfies the following property: for any 0 < *τ* < 1 and *ǫ* > 0, there exist a set $\mathcal{A}$ and a positive number *x*<sup>0</sup> such that $\tilde{\Pi}(\mathcal{A}) > 1 - \tau$ and for any $f = \phi * P$ with $P \in \mathcal{A}$,

$$\int\_{|\mathbf{x}| > \mathbf{x}\_0} f\_0(\mathbf{x}) \log \frac{f\_0(\mathbf{x})}{f(\mathbf{x})} d\mathbf{x} < \epsilon. \tag{15}$$

Then, *f*<sup>0</sup> ∈ *KL*(Π).

Note that the moment condition of Theorem 3.1.2 is substantially reduced.

Let *f*<sup>0</sup> be a density on **R** satisfying

1. $\int_{\mathbb{R}} f_0(x) \log f_0(x)\, dx < \infty$.

2. ∃ *η* ∈ (0, 1) such that $\int_{\mathbb{R}} |x|^{\eta} f_0(x)\, dx < \infty$.


Further assume that there exist *σ*<sup>0</sup> > 0, 0 < *β* < *η*, *γ* > *β* and *b*1, *b*<sup>2</sup> > 0 such that for large *x* > 0

$$\begin{split} \text{3. } \max \left( \mathcal{G}\_{0} \left( \left[ \mathbf{x} - \sigma\_{0} \mathbf{x}^{\frac{\eta}{2}}, \infty \right) \times [\sigma\_{0}, \infty) \right), \mathcal{G}\_{0} \left( \left[ 0, \infty \right) \times \left( \mathbf{x}^{1 - \frac{\eta}{2}}, \infty \right) \right) \right) \geq b\_{1} \mathbf{x}^{-\beta}, \\ \text{4. } \mathcal{G}\_{0} \left( \left( -\infty, \mathbf{x} \right) \times \left( 0, e^{|\mathbf{x}|^{\eta} - \frac{1}{2}} \right) \right) \geq 1 - b\_{2} |\mathbf{x}|^{ - \gamma}. \end{split}$$

and for large *x* < 0,

$$\begin{split} & \text{3'. } \max \left( \mathcal{G}\_{\mathbf{0}} \left( \left( -\infty, \mathbf{x} + \sigma\_{\mathbf{0}} |\mathbf{x}|^{\frac{\eta}{2}} \right) \times [\sigma\_{\mathbf{0}}, \infty) \right), \mathcal{G}\_{\mathbf{0}} \left( (-\infty, \mathbf{0}] \times (|\mathbf{x}|^{1 - \frac{\eta}{2}}, \infty) \right) \right) \geq b\_{1} |\mathbf{x}|^{-\beta}, \\ & \text{4'. } \mathcal{G}\_{\mathbf{0}} \left( (\mathbf{x}, \infty) \times (\mathbf{0}, e^{|\mathbf{x}|^{\eta} - \frac{1}{2}}) \right) > 1 - b\_{2} |\mathbf{x}|^{-\gamma}. \end{split}$$

then *f*<sup>0</sup> ∈ *KL*(Π). Other than the important moment condition on *f*0, this theorem also requires some regularity in the tail of the base measure *G*0. For example, assumptions *3, 3'* require the tail of *G*<sup>0</sup> not to decay faster than a polynomial rate for the scale parameter *σ*. This condition seems very reasonable since the Cauchy density itself can be written as a scale mixture of normals with the mixing density having a polynomial decay towards infinity.
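The scale-mixture remark can be verified directly: a standard Cauchy variable is N(0, 1/*V*) with precision *V* ~ Gamma(1/2, rate 1/2). The Monte Carlo sketch below (our own check, not from the text) recovers the Cauchy density pointwise from the mixture representation:

```python
import numpy as np

rng = np.random.default_rng(1)
# Precision V ~ Gamma(shape=1/2, rate=1/2); numpy parameterizes by scale = 1/rate.
v = rng.gamma(shape=0.5, scale=2.0, size=1_000_000)

def scale_mixture_density(x):
    """f(x) = E_V[ sqrt(V) * phi(sqrt(V) * x) ], a normal scale mixture."""
    return np.mean(np.sqrt(v) * np.exp(-0.5 * v * x ** 2)) / np.sqrt(2.0 * np.pi)

# Compare to the standard Cauchy density 1 / (pi (1 + x^2)) at a few points.
vals = {x0: scale_mixture_density(x0) for x0 in (0.0, 1.0, 2.0)}
```

Up to Monte Carlo error, `vals[x0]` agrees with the Cauchy density at each point.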

A standard choice for *G*<sup>0</sup> is the conjugate normal-inverse gamma distribution (see Escobar and West 1995), under which $\theta|\sigma \sim N(0, \xi\sigma^2)$ and $\sigma^{-2} \sim Gamma(r, \lambda)$, for some $\xi, r, \lambda > 0$. For such a *G*<sup>0</sup> with *r* ∈ (1/2, 1), one can show that the conditions of Theorem 3.2 hold true with *η* ∈ (2*r*/(1 + *r*), 1), *β* = *r*(2 − *η*) and *γ* = 2*r*. For example, the conditions in *Assumptions 3, 3'* are satisfied since,

$$G_0\left([0, \infty) \times (x^{1-\frac{\eta}{2}}, \infty)\right) = \frac{1}{2}\Pr(\sigma^{-2} \le x^{-(2-\eta)}) = c \int_0^{x^{-(2-\eta)}} v^{r-1} e^{-\lambda v}\, dv \le c'\, x^{-r(2-\eta)},$$

for some positive constants *c*, *c*′. To see that the conditions of *Assumptions 4, 4'* also hold, note that,

$$1 - G_0\left((-\infty, x) \times (0, e^{|x|^\eta - \frac{1}{2}})\right) \le \Pr(\theta > x) + \Pr(\sigma^{-2} < e^{-2|x|^\eta + 1}).$$

An argument similar to the one provided above shows that the second term, namely $\Pr(\sigma^{-2} < e^{-2|x|^\eta + 1})$, is bounded by a constant times $e^{-2r|x|^\eta + r}$. Therefore, this term can be made smaller than $c|x|^{-\gamma}$ for a suitable constant *c*. Now, using the inequality $1 - \Phi(x) \le (1/x)\phi(x)$, where Φ(·) and *φ*(·) are the standard normal distribution and density functions, we obtain

$$\Pr(\theta > x) \le \frac{c}{x} \int_0^\infty v^{r - 1/2 - 1} e^{-(\frac{x^2}{2\xi} + \lambda)v}\, dv = \frac{c'}{x\left(\frac{x^2}{2\xi} + \lambda\right)^{r - 1/2}} \le \frac{c''}{x^{2r}}$$

for some positive constants *c*, *c*′, *c*′′. The desired inequality follows from these two bounds. Therefore, such a choice of *G*<sup>0</sup> would lead to posterior consistency, for example, when *f*<sup>0</sup> is a Cauchy density.
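The Mills-ratio bound $1 - \Phi(x) \le \phi(x)/x$ invoked above is elementary but easy to sanity-check numerically; the snippet below (our own, using the identity $1 - \Phi(x) = \operatorname{erfc}(x/\sqrt{2})/2$) does so at a few points:

```python
import math

def normal_tail(x):
    """1 - Phi(x), computed via the complementary error function."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def mills_bound(x):
    """phi(x)/x, the upper bound used in the text (valid for x > 0)."""
    return math.exp(-0.5 * x * x) / (math.sqrt(2.0 * math.pi) * x)
```

The bound holds for every *x* > 0 and becomes sharp as *x* grows.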

*Proof of Theorem 3.2.* We simply need to show that such an *f*<sup>0</sup> satisfies the condition of Lemma 3.2. Let $w(x) = \exp(-x^\eta)$, $x \ge 0$. Define a class of subsets of **R** × **R**<sup>+</sup> indexed by *x* ∈ **R**, as follows:

$$K_x = \left\{ (\theta, \sigma) \in \mathbb{R} \times \mathbb{R}^+ : \frac{1}{\sigma} \phi\left(\frac{x - \theta}{\sigma}\right) \ge \frac{1}{\sqrt{2\pi}} w(|x|) \right\} \tag{16}$$

These sets are of particular interest, since for *f* = *φ* ∗ *P*,


$$\int_{|x| > x_0} f_0(x) \log \frac{f_0(x)}{f(x)}\, dx \le \int_{|x| > x_0} f_0(x) \log \frac{f_0(x)}{\int_{K_x} \frac{1}{\sigma}\phi\left(\frac{x - \theta}{\sigma}\right) dP(\theta, \sigma)}\, dx$$

$$\le \int_{|x| > x_0} f_0(x) \log \frac{f_0(x)}{\frac{1}{\sqrt{2\pi}}\, w(|x|)\, P(K_x)}\, dx$$

$$\le \int_{|x| > x_0} f_0(x) \left\{ \log f_0(x) + |x|^\eta + \log \frac{\sqrt{2\pi}}{P(K_x)} \right\} dx. \tag{17}$$

By the assumptions of the theorem, this quantity can be made arbitrarily small for a suitably large *x*<sup>0</sup> if we can show that $P(K_x) > c_1 \exp(-c_2|x|^\eta)$ for all $|x| > x_0$ for some fixed constants $c_1, c_2 > 0$. Therefore it suffices to prove that: for any *τ* > 0 there exist an *x*<sup>0</sup> > 0 and a set $\mathcal{A}$ with $\tilde{\Pi}(\mathcal{A}) > 1 - \tau$ such that $P \in \mathcal{A} \Rightarrow P(K_x) \ge (1/2)\exp(-2|x|^\eta/b_1)$ for all $|x| > x_0$.

The proof of this Lemma is fairly technical. It makes an extensive use of the tail behavior of a random probability *P* arising from a Dirichlet process. For clarity of reading, we present details of the proof in the Appendix.

#### **4. Density estimation: strong consistency**

We establish *L*1-consistency of a Dirichlet location-scale mixture of normal prior Π by verifying the conditions of Theorem 8 of Ghosal *et al.* (1999). This theorem is reproduced below.

Let Π be a prior on F such that *f*<sup>0</sup> ∈ *KL*(Π). Suppose that for every *ǫ* > 0 there exist a *δ* < *ǫ*/4, constants *c*1, *c*<sup>2</sup> > 0, a *β* < *ǫ*<sup>2</sup>/8 and F*<sup>n</sup>* ⊆ F such that for all *n* large,

$$\begin{aligned} \text{1. } &\Pi(\mathcal{F}_n^c) < c_1 e^{-nc_2} \\ \text{2. } &J(\delta, \mathcal{F}_n) < n\beta. \end{aligned}$$

then Π achieves strong posterior consistency at *f*0.

Here *J*(*δ*, G) denotes the logarithm of the covering number of G by *L*<sup>1</sup> balls of radii *δ*.

We first show how to calculate *J*(*δ*, G) for certain types of sets G. For some *a* > 0, *u* > *l* > 0 define

$$\mathcal{F}\_{a,l,u} = \{ f = \phi \* P : P\left( (-a, a] \times (l, u] \right) = 1 \} \tag{18}$$


(23)

A Multi-Features Fusion of Multi-Temporal Hyperspectral Images Using a Cooperative GDD/SVM Method

Then,

$$J(2\kappa, \mathcal{F}\_{a,l,u}) \le b\_0 \left( b\_1 \frac{a}{l} + b\_2 \log \frac{u}{l} + 1 \right). \tag{19}$$

where *b*0, *b*<sup>1</sup> and *b*<sup>2</sup> depend upon *κ* but not on *a*, *l* or *u*.

*Proof.* Let *φθ*,*<sup>σ</sup>* denote the normal density with mean *θ* and standard deviation *σ*. For *σ*<sup>2</sup> > *σ*<sup>1</sup> > *σ*2/2, it can be shown that,

$$\begin{split} \left\| \phi\_{\theta\_1,\sigma\_1} - \phi\_{\theta\_2,\sigma\_2} \right\| &\leq \left\| \phi\_{\theta\_1,\sigma\_2} - \phi\_{\theta\_2,\sigma\_2} \right\| + \left\| \phi\_{\theta\_2,\sigma\_1} - \phi\_{\theta\_2,\sigma\_2} \right\| \\ &\leq \sqrt{\frac{2}{\pi}} \frac{|\theta\_2 - \theta\_1|}{\sigma\_2} + 3 \frac{\sigma\_2 - \sigma\_1}{\sigma\_1} . \end{split} \tag{20}$$
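As a numerical sanity check (not part of the original argument), the bound (20) can be verified for particular parameter values; the choices below are arbitrary, subject only to *σ*2 > *σ*1 > *σ*2/2:

```python
import numpy as np

# Numerical check of the L1 bound (20) between two normal densities.
# theta/sigma values are arbitrary choices with sigma2 > sigma1 > sigma2/2.
def phi(x, theta, sigma):
    return np.exp(-(x - theta) ** 2 / (2 * sigma ** 2)) / (sigma * np.sqrt(2 * np.pi))

theta1, sigma1 = 0.0, 0.8
theta2, sigma2 = 0.5, 1.0

x = np.linspace(-12.0, 12.0, 48001)
dx = x[1] - x[0]
l1 = np.abs(phi(x, theta1, sigma1) - phi(x, theta2, sigma2)).sum() * dx
bound = np.sqrt(2 / np.pi) * abs(theta2 - theta1) / sigma2 + 3 * (sigma2 - sigma1) / sigma1

print(l1 <= bound)  # True: the computed L1 distance stays below the bound
assert l1 <= bound
```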

Let *ζ* = min(*κ*/6, 1). Define *σm* = *l*(1 + *ζ*)<sup>*m*</sup>, *m* ≥ 0. Let *M* be the smallest integer such that *σM* = *l*(1 + *ζ*)<sup>*M*</sup> ≥ *u*. This implies *M* ≤ log(*u*/*l*)/ log(1 + *ζ*) + 1. For 1 ≤ *j* ≤ *M*, let *Nj* = ⌈√(32/*π*) *a*/(*κσj*−1)⌉. For 1 ≤ *i* ≤ *Nj*, 1 ≤ *j* ≤ *M*, define

$$E\_{ij} = \left( -a + \frac{2a(i-1)}{N\_j} \; , \; -a + \frac{2ai}{N\_j} \right] \times (\sigma\_{j-1}, \sigma\_j] . \tag{21}$$

Then, (*θ*, *σ*), (*θ*′, *σ*′) ∈ *Eij* ⇒ ‖*φθ*,*σ* − *φθ*′,*σ*′‖ < *κ*. Take *N* = *N*1 + ··· + *NM* and let

$$\mathcal{P}\_N = \left\{ (P\_{11}, \dots, P\_{N\_1 1}, \dots, P\_{1M}, \dots, P\_{N\_M M}) : P\_{ij} \ge 0, \ \sum\_{ij} P\_{ij} = 1 \right\} \tag{22}$$

be the *N*-dimensional probability simplex and P<sup>∗</sup>*N* be a *κ*-net in P*N*. Let the *σj*'s be as before and *θij* = −*a* + 2*a*(*i* − 1/2)/*Nj*, 1 ≤ *i* ≤ *Nj*, 1 ≤ *j* ≤ *M*, so that (*θij*, *σj*) ∈ *Eij* for all *i*, *j*. It can be shown, by following an argument similar to the one presented in the proof of Lemma 1 of Ghosal *et al.* (1999), that

$$\mathcal{F} = \left\{ \sum\_{j=1}^{M} \sum\_{i=1}^{N\_j} P\_{ij}^\* \phi\_{\theta\_{ij}\sigma\_j} : P^\* \in \mathcal{P}\_N^\* \right\} \tag{23}$$

is a 2*κ*-net in F*a*,*l*,*u* and consequently, *J*(2*κ*, F*a*,*l*,*u*) ≤ *J*(*κ*, P*N*) ≤ *N*(1 + log((1 + *κ*)/*κ*)). But,

$$\begin{split} N &\leq \sum\_{j=1}^{M} \left( \sqrt{\frac{32}{\pi}} \frac{a}{\sigma\_{j-1} \kappa} + 1 \right) = \sqrt{\frac{32}{\pi}} \frac{a}{l\kappa} \sum\_{j=0}^{M-1} (1 + \zeta)^{-j} + M \\ &\leq \sqrt{\frac{32}{\pi}} \frac{a}{l} \frac{1 + \zeta}{\kappa \zeta} + \frac{\log(u/l)}{\log(1 + \zeta)} + 1 \\ &= b\_1 \frac{a}{l} + b\_2 \log \frac{u}{l} + 1 . \end{split} \tag{24}$$

From this the result follows with *b*0 = 1 + log((1 + *κ*)/*κ*).
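The counting argument above can be reproduced numerically. The sketch below uses assumed illustrative values of *κ*, *a*, *l*, *u*, computes *N* as in the proof, and checks it against *b*1 *a*/*l* + *b*2 log(*u*/*l*) + 1 with *b*1 = √(32/*π*)(1 + *ζ*)/(*κζ*) and *b*2 = 1/ log(1 + *ζ*):

```python
import math

# Reproduces the covering-number count from the proof; kappa, a, l, u are
# assumed illustrative values, not quantities fixed by the text.
kappa, a, l, u = 0.5, 2.0, 0.1, 5.0
zeta = min(kappa / 6, 1.0)

# smallest M with l * (1 + zeta)^M >= u
M = math.ceil(math.log(u / l) / math.log(1 + zeta))
sigma = [l * (1 + zeta) ** m for m in range(M + 1)]

c = math.sqrt(32 / math.pi)
# N = sum_j N_j with N_j = ceil(sqrt(32/pi) * a / (kappa * sigma_{j-1}))
N = sum(math.ceil(c * a / (kappa * sigma[j - 1])) for j in range(1, M + 1))

b1 = c * (1 + zeta) / (kappa * zeta)
b2 = 1 / math.log(1 + zeta)
bound = b1 * a / l + b2 * math.log(u / l) + 1

print(N <= bound)  # True: N is dominated by b1*a/l + b2*log(u/l) + 1
assert N <= bound
```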

8 Image Fusion

**4. Density estimation: strong consistency**

Let F<sup>κ</sup><sub>a,l,u</sub> = { *f* = *φ* ∗ *P* : *P*((−*a*, *a*] × (*l*, *u*]) ≥ 1 − *κ*}. Then *J*(3*κ*, F<sup>κ</sup><sub>a,l,u</sub>) ≤ *J*(*κ*, F*a*,*l*,*u*).

*Proof.* Let *f* = *φ* ∗ *P* ∈ F<sup>κ</sup><sub>a,l,u</sub>. Consider the probability measure defined by *P*<sup>∗</sup>(*A*) = *P*(*A* ∩ (−*a*, *a*] × (*l*, *u*])/*P*((−*a*, *a*] × (*l*, *u*]). The density *f*<sup>∗</sup> = *φ* ∗ *P*<sup>∗</sup> clearly belongs to F*a*,*l*,*u* and satisfies ‖ *f* − *f*<sup>∗</sup>‖ < 2*κ*. Hence a *κ*-net of F*a*,*l*,*u* is a 3*κ*-net of F<sup>κ</sup><sub>a,l,u</sub>, which proves the lemma.


Suppose that for each *κ* > 0 and *β* > 0 there exist sequences of positive numbers *an*, *un* ↑ ∞ and *ln* ↓ 0 with *ln* < *un*, and a constant *β*0, all depending on *κ* and *β*, such that

$$\begin{aligned} \text{1. } &\tilde{\Pi} \left( \{ P : P((-a\_n, a\_n] \times (l\_n, u\_n]) < 1 - \kappa \} \right) < e^{-n \beta\_0}, \\ \text{2. } &a\_n/l\_n < n\beta, \ \log(u\_n/l\_n) < n\beta; \end{aligned}$$

then *f*<sup>0</sup> ∈ *KL*(Π) implies that Π achieves strong posterior consistency at *f*0.

*Proof.* Take F*n* = F<sup>κ</sup><sub>an,ln,un</sub>. Then the conditions of Theorem 4 are easily verified using Lemma 4 for a suitable choice of *κ* > 0.

If Π̃ = *Dir*(*αG*0), verification of conditions 1 and 2 becomes particularly simple. For example, if *G*0 is the product of a normal distribution on *θ* and an inverse gamma distribution on *σ*<sup>2</sup>, then the conditions of Theorem 4 are satisfied if *an* = *O*(√*n*), *ln* = *O*(1/√*n*) and *un* = *O*(*e<sup>n</sup>*).
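For illustration, a draw from such a Dirichlet location-scale mixture of normals prior can be simulated by stick-breaking truncation; the base measure below (standard normal on *θ*, inverse gamma on *σ*<sup>2</sup>) and all numeric values are assumptions made for the sketch:

```python
import numpy as np

# Draw from a Dirichlet location-scale mixture of normals prior via
# stick-breaking truncation. The base measure (standard normal on theta,
# inverse gamma(2, 1) on sigma^2), alpha, and K are assumed values.
rng = np.random.default_rng(0)
alpha, K = 1.0, 200

v = rng.beta(1, alpha, size=K)                           # stick proportions
w = v * np.concatenate(([1.0], np.cumprod(1 - v)[:-1]))  # mixture weights
theta = rng.normal(0.0, 1.0, size=K)                     # component locations
sigma2 = 1.0 / rng.gamma(2.0, 1.0, size=K)               # component variances

def density(x):
    # f = phi * P: a (truncated) countable mixture of normal densities
    x = np.atleast_1d(x)[:, None]
    return (w * np.exp(-(x - theta) ** 2 / (2 * sigma2))
            / np.sqrt(2 * np.pi * sigma2)).sum(axis=1)

grid = np.linspace(-60.0, 60.0, 24001)
mass = density(grid).sum() * (grid[1] - grid[0])
print(0.95 < mass <= 1.001)  # True: the draw integrates to ~1 (up to truncation)
assert 0.95 < mass <= 1.001
```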

#### **4.1. Support vector machines**

We give, in this section, a very brief presentation of Support Vector Machines (SVMs) that is needed for the definition of their functional versions. We refer the reader to e.g. [4] for a more comprehensive presentation. As stated in section **??**, X denotes an arbitrary Hilbert space. Our presentation of SVM departs from the standard introduction because it assumes that the observations belong to X rather than to ℝ<sup>d</sup>. This makes clear that the definition of SVM on arbitrary Hilbert spaces is not the difficult part in the construction of functional SVM. We will discuss problems related to the functional nature of the data in section 4.1.5.


Our goal is to classify data into two predefined classes. We assume we are given a learning set, i.e., *N* examples (*x*1, *y*1), ..., (*xN*, *yN*), which are i.i.d. realizations of the random variable pair (*X*, *Y*), where *X* has values in X and *Y* in {−1, 1}; i.e., *Y* is the class label for the observation *X*.

#### *4.1.1. Hard margin SVM*

The principle of SVM is to perform an affine discrimination of the observations with maximal margin, that is to find an element *w* ∈ X with a minimum norm and a real value *b*, such that *yi*(�*w*, *xi*� + *b*) ≥ 1 for all *i*. To do so, we have to solve the following quadratic programming problem:

$$(P\_0) \min\_{w, b} \langle w, w \rangle, \text{ subject to } y\_i(\langle w, x\_i \rangle + b) \ge 1, \ 1 \le i \le N.$$

The classification rule associated to (*w*, *b*) is simply *x* ↦ sign(⟨*w*, *x*⟩ + *b*). In this situation (called hard margin SVM), we require the rule to have zero error on the learning set.

#### *4.1.2. Soft margin SVM*

In practice, the solution provided by problem (*P*0) is not very satisfactory. Firstly, perfectly linearly separable problems are quite rare, partly because non linear problems are frequent, but also because noise can turn a linearly separable problem into a non separable one. Secondly, choosing a classifier with maximal margin does not prevent overfitting, especially in very high dimensional spaces (see e.g. [5] for a discussion about this point).

A first step to solve this problem is to allow some classification errors on the learning set. This is done by replacing (*P*0) by its soft margin version, i.e., by the problem:

$$\begin{array}{c} (P\_C) \min\_{w,b,\xi} \langle w, w \rangle + C \sum\_{i=1}^{N} \xi\_i, \\ \text{subject to } y\_i(\langle w, x\_i \rangle + b) \ge 1 - \xi\_i, \ 1 \le i \le N, \\ \xi\_i \ge 0, \ 1 \le i \le N. \end{array}$$

Classification errors are allowed thanks to the slack variables *ξi*. The *C* parameter acts as an inverse regularization parameter. When *C* is small, the cost of violating the hard margin constraints, i.e., the cost of having some *ξ<sup>i</sup>* > 0 is small and therefore the constraint on *w* dominates. On the contrary, when *C* is large, classification errors dominate and (*PC*) gets closer to (*P*0).

#### *4.1.3. Non linear SVM*


As noted in the previous section, some classification problems don't have a satisfactory linear solution but have a non linear one. Non linear SVMs are obtained by transforming the original data. Assume given a Hilbert space H (and denote by ⟨·, ·⟩H the corresponding inner product) and a function *φ* from X to H (this function is called a *feature map*). A linear SVM in H can be constructed on the data set (*φ*(*x*1), *y*1), ..., (*φ*(*xN*), *yN*). If *φ* is a non linear mapping, the classification rule *x* ↦ sign(⟨*w*, *φ*(*x*)⟩H + *b*) is also non linear.

In order to obtain the linear SVM in H one has to solve the following optimization problem:

$$\begin{array}{c} (P\_{C,\mathcal{H}}) \min\_{w,b,\xi} \langle w, w \rangle\_{\mathcal{H}} + C \sum\_{i=1}^{N} \xi\_i, \\ \text{subject to } y\_i(\langle w, \phi(x\_i) \rangle\_{\mathcal{H}} + b) \ge 1 - \xi\_i, \ 1 \le i \le N, \\ \xi\_i \ge 0, \ 1 \le i \le N. \end{array}$$

It should be noted that this feature mapping makes it possible to define SVMs on almost arbitrary input spaces.

#### *4.1.4. Dual formulation and Kernels*

Solving problems (*PC*) or (*PC*,H) might seem very difficult at first, because X and H are arbitrary Hilbert spaces and can therefore have very high or even infinite dimension (when X is a functional space for instance). However, each problem has a dual formulation. More precisely, (*PC*) is equivalent to the following optimization problem (see [6]):

$$\begin{array}{c} (D\_C) \max\_{\alpha} \sum\_{i=1}^{N} \alpha\_i - \sum\_{i=1}^{N} \sum\_{j=1}^{N} \alpha\_i \alpha\_j y\_i y\_j \langle x\_i, x\_j \rangle, \\ \text{subject to } \sum\_{i=1}^{N} \alpha\_i y\_i = 0, \\ 0 \le \alpha\_i \le C, \ 1 \le i \le N. \end{array}$$

This result applies to the original problem in which data are not mapped into H, but also to the mapped data, i.e., (*PC*,H) is equivalent to a problem (*DC*,H) in which the *xi* are replaced by *φ*(*xi*) and in which the inner product of H is used. This leads to:

$$\begin{array}{c} (D\_{C,\mathcal{H}}) \max\_{\alpha} \sum\_{i=1}^{N} \alpha\_i - \sum\_{i=1}^{N} \sum\_{j=1}^{N} \alpha\_i \alpha\_j y\_i y\_j \langle \phi(x\_i), \phi(x\_j) \rangle\_{\mathcal{H}}, \\ \text{subject to } \sum\_{i=1}^{N} \alpha\_i y\_i = 0, \\ 0 \le \alpha\_i \le C, \ 1 \le i \le N. \end{array}$$

Solving (*DC*,H) rather than (*PC*,H) has two advantages. The first positive aspect is that (*DC*,H) is an optimization problem in *N* rather than in H which can have infinite dimension (the same is true for X ).

The second important point is linked to the fact that the optimal classification rule can be written *x* ↦ sign(∑*i* *αiyi*⟨*φ*(*xi*), *φ*(*x*)⟩H + *b*). This means that both the optimization problem and the classification rule do not make direct use of the transformed data, i.e., of the *φ*(*xi*). All the calculations are done through the inner product in H, more precisely through the values ⟨*φ*(*xi*), *φ*(*xj*)⟩H. Therefore, rather than choosing H and *φ* directly, one can provide a so-called *kernel function K* such that *K*(*xi*, *xj*) = ⟨*φ*(*xi*), *φ*(*xj*)⟩H for a given pair (H, *φ*).

In order that *K* corresponds to an actual inner product in a Hilbert space, it has to fulfill some conditions. *K* has to be symmetric and positive definite, that is, for every *N*, *x*1, ..., *xN* in X and *α*1, ..., *αN* in ℝ, ∑*i* ∑*j* *αiαjK*(*xi*, *xj*) ≥ 0. If *K* satisfies those conditions then, according to the Moore-Aronszajn theorem [**?** ], there exists a Hilbert space H and a feature map *φ* such that *K*(*xi*, *xj*) = ⟨*φ*(*xi*), *φ*(*xj*)⟩H.
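As an illustration of the positive-definiteness condition, the sketch below builds a Gaussian RBF kernel matrix (a standard kernel choice, used here only as an example; the data and bandwidth are assumptions) and checks that it is symmetric with nonnegative eigenvalues:

```python
import numpy as np

# Checks sum_i sum_j a_i a_j K(x_i, x_j) >= 0 for the Gaussian RBF kernel
# via the eigenvalues of the kernel matrix K.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 3))      # 30 random points in R^3 (assumed data)
gamma = 0.5                       # assumed bandwidth parameter

sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-gamma * sq)           # K(x_i, x_j) = exp(-gamma * ||x_i - x_j||^2)

assert np.allclose(K, K.T)        # symmetric
eigs = np.linalg.eigvalsh(K)
print(eigs.min() > -1e-8)         # True: positive semi-definite up to round-off
assert eigs.min() > -1e-8
```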



#### *4.1.5. The case of functional data*

The short introduction to SVM proposed in the previous section has clearly shown that defining linear SVM for data in a functional space is as easy as for data in ℝ<sup>d</sup>, because we only assumed that the input space was a Hilbert space. By the dual formulation of the optimization problem (*PC*), a software implementation of linear SVM on functional data is even possible, by relying on numerical quadrature methods to calculate the requested integrals (inner product in *L*<sup>2</sup>(*µ*), cf. section **??**).

However, the functional nature of the data has some effects. It should first be noted that in infinite dimensional Hilbert spaces, the hard margin problem (*P*0) always has a solution when the input data are in general position, i.e., when *N* observations span an *N*-dimensional subspace of X. A very naive solution would therefore consist in avoiding soft margins and non linear kernels. This would not give very interesting results in practice because of the lack of regularization (see [5] for some examples in very high dimensional spaces, as well as section **??**).

Moreover, the linear SVM with soft margin can also lead to bad performances. It is indeed well known (see e.g. [7]) that problem (*PC*) is equivalent to the following unconstrained optimization problem:

$$(R\_{\lambda}) \min\_{w, b} \frac{1}{N} \sum\_{i=1}^{N} \max\left(0, 1 - y\_i(\langle w, x\_i \rangle + b)\right) + \lambda \langle w, w \rangle,$$

with *λ* = 1/(*CN*). This way of viewing (*PC*) emphasizes the regularization aspect (see also [8–10]) and links the SVM model to ridge regression [**?** ]. As shown in [11], the penalization used in ridge regression behaves poorly with functional data. Of course, the loss function used by SVM (the *hinge loss*, i.e., *h*(*u*, *v*) = max(0, 1 − *uv*)) is different from the quadratic loss used in ridge regression, and therefore no conclusion can be drawn from the experiments reported in [11]. However, they suggest that we might expect bad performances with the linear SVM applied directly to functional data. We will see in sections **??** and **??** that the efficiency of the ridge regularization seems to be linked with the actual dimension of the data: it does not behave very well when the number of discretization points is very large and thus leads to approximating the ridge penalty by a dot product in a very high dimensional space (see also section **??**).
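Before turning to kernels, the unconstrained form (*Rλ*) can be illustrated directly: the sketch below minimizes the regularized hinge loss by subgradient descent on a toy two-class data set (all data and hyperparameters are assumptions made purely for illustration):

```python
import numpy as np

# Minimal sketch: minimizing the regularized hinge loss (R_lambda) by
# subgradient descent. Toy data, lambda, learning rate, and iteration
# count are assumed values, not choices prescribed by the text.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-2, 0.5, (20, 2)), rng.normal(2, 0.5, (20, 2))])
y = np.array([-1] * 20 + [1] * 20)

lam, lr, N = 0.01, 0.1, len(y)
w, b = np.zeros(2), 0.0
for _ in range(500):
    margins = y * (X @ w + b)
    active = margins < 1                       # points violating the margin
    grad_w = 2 * lam * w - (y[active, None] * X[active]).sum(0) / N
    grad_b = -y[active].sum() / N
    w -= lr * grad_w
    b -= lr * grad_b

pred = np.sign(X @ w + b)
acc = (pred == y).mean()
print(acc)  # should reach 1.0 on this well-separated toy set
assert acc == 1.0
```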

It is therefore interesting to consider non linear SVM for functional data, by introducing adapted kernels. As pointed out in e.g. [10], (*PC*,H) is equivalent to

$$(R\_{\lambda,\mathcal{H}}) \min\_{f \in \mathcal{H}} \frac{1}{N} \sum\_{i=1}^{N} \max\left(0, 1 - y\_i f(x\_i)\right) + \lambda \langle f, f \rangle\_{\mathcal{H}}.$$

Using a kernel therefore amounts both to replacing a linear classifier by a non linear one and to replacing the ridge penalization by a penalization induced by the kernel, which might be more adapted to the problem (see [9] for links between regularization operators and kernels). The applications presented in **??** illustrate this fact.

### **5. Proposed approach**


#### **5.1. Overview of the proposed fusion scheme**

In this chapter, we propose a new technique for remote-sensing image classification by fusing heterogeneous representations. The proposed approach involves several steps: pre-processing, features extraction, features fusion, and matching and classification stages. The block diagram of the proposed technique is shown in Fig. 1. In our previous work [12], we proposed a novel 3D model which represents the spectral signature as a three-dimensional function of time, reflectance, and wavelength band (Equation 1). For each pixel, we generated a surface (3D mesh) which generalizes the usual signature by adding a time dimension. We call this new representation the *multi-temporal spectral signature*. Interested readers can refer to [12].

**Figure 1.** General workflow of the proposed approach

#### **5.2. Images pre-processing and features extraction**

In this study, multi-temporal hyperspectral images constitute the source data. Spectral and textural features are the foundational data for this kind of images. The 3D spectral features are extracted from the mesh associated with a given pixel (its multi-temporal spectral signature), while the textural ones are derived directly from the images. Mainly, two feature vectors are generated for each pixel as follows:

**Heat kernel signature (HKS) :** The HKS is a signature computed only from the intrinsic geometry of an object. Suppose (*M*, *g*) is a complete Riemannian manifold, where *g* is the Riemannian metric, and let Δ be the Laplace-Beltrami operator. The eigenvalues {*λn*} and eigenfunctions {*φn*} of Δ satisfy Δ*φ<sup>n</sup>* = *λnφn*, where each *φ<sup>n</sup>* is normalized to be orthonormal in *L*2(*M*). The Laplace spectrum is given by 0 = *λ*<sup>0</sup> < *λ*<sup>1</sup> ≤ *λ*<sup>2</sup> ≤ ..., *λ<sup>n</sup>* → ∞. As a local shape descriptor, Sun et al. [**?** ] defined the heat kernel signature (HKS) by :

$$h(\mathbf{x}, t) = K\_t(\mathbf{x}, \mathbf{x}) = \sum\_{i=0}^{\infty} e^{-\lambda\_i t} \phi\_i^2(\mathbf{x}) \tag{25}$$


http://dx.doi.org/10.5772/56949


A Multi-Features Fusion of Multi-Temporal Hyperspectral Images Using a Cooperative GDD/SVM Method


where *λ*0, *λ*1, ··· ≥ 0 are the eigenvalues and *φ*0, *φ*1, . . . the corresponding eigenfunctions of the Laplace-Beltrami operator, satisfying Δ*φ<sup>i</sup>* = *λiφi*. Let's denote this vector by *Y*.
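Equation (25) can be sketched numerically. The chapter computes the HKS from the Laplace-Beltrami operator of a 3D mesh; as a simplifying stand-in, the sketch below uses a combinatorial graph Laplacian on a toy four-vertex path graph and samples *h*(*x*, *t*) at three time values to form the vector *Y* for each vertex (sizes and time samples are illustrative assumptions):

```python
import numpy as np

def heat_kernel_signature(L, times):
    """h(x, t) = sum_i exp(-lambda_i * t) * phi_i(x)^2  (Eq. 25),
    from the eigendecomposition of a symmetric (graph) Laplacian L."""
    lam, phi = np.linalg.eigh(L)               # ascending eigenvalues, orthonormal eigenvectors
    # rows: points x, columns: time samples t
    return (phi ** 2) @ np.exp(-np.outer(lam, times))

# Toy stand-in for a mesh: the path graph on 4 vertices.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
L = np.diag(A.sum(axis=1)) - A                 # combinatorial graph Laplacian
hks = heat_kernel_signature(L, times=np.array([0.1, 1.0, 10.0]))
print(hks.shape)                               # (4, 3): one HKS vector Y per vertex
```

For large *t* every signature converges to *φ*0²(*x*) (here 1/4), since only the zero eigenvalue survives; the small-*t* samples capture local geometry.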

**Spatio-temporal Gabor filters:** Texture is one of the important characteristics used in identifying objects or regions of interest. It contains important information about the structural arrangement of surfaces. Fusing texture with 3D spectral information is conducive to the interpretation of remote sensing images [13]. We use a method for dynamic texture modeling based on spatio-temporal Gabor filters. Briefly, the sequence of images is convolved with a bank of spatio-temporal Gabor filters and a feature vector is constructed with the energy of the responses as components. Let's denote this vector by *Y*′ .
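As a rough illustration of this step (the chapter does not specify its filter bank, so the kernel size, scales, and frequencies below are assumed values), a small bank of 3D Gabor kernels can be convolved with a toy image sequence and the response energies collected into *Y*′:

```python
import numpy as np
from scipy.ndimage import convolve

def st_gabor_kernel(size, sigma, tau, u, v, w):
    """Spatio-temporal Gabor kernel: Gaussian envelope times a cosine carrier
    with spatial frequencies (u, v) and temporal frequency w."""
    r = np.arange(size) - size // 2
    t, y, x = np.meshgrid(r, r, r, indexing="ij")
    envelope = np.exp(-(x**2 + y**2) / (2 * sigma**2) - t**2 / (2 * tau**2))
    return envelope * np.cos(2 * np.pi * (u * x + v * y + w * t))

rng = np.random.default_rng(0)
seq = rng.standard_normal((16, 16, 16))        # toy image sequence (T, H, W)

# Small bank: two static spatial orientations and one moving filter.
bank = [st_gabor_kernel(7, sigma=2.0, tau=2.0, u=fu, v=fv, w=fw)
        for fu, fv, fw in [(0.1, 0.0, 0.0), (0.0, 0.1, 0.0), (0.1, 0.1, 0.1)]]
Y_prime = np.array([np.sum(convolve(seq, k, mode="nearest") ** 2) for k in bank])
print(Y_prime.shape)                           # (3,): one energy per filter
```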

#### **5.3. Multi-Features fusion based on a cooperative GDD/SVM classifier**

In this section, we present an approach that combines an SVM classifier [1] with a generatively trained GDD model and thus profits from the advantages of both techniques. The key idea is to concatenate the extracted features into one vector and to project it into a new space. First, a straightforward feature combination approach is used to concatenate the feature vectors (*Y* and *Y*′ ) into a single feature vector *X* = (*Xi*1,..., *Xidim*). The size *dim* may differ from one pixel to another, making fusion and classification challenging tasks. To overcome this limitation, we use the Generalized Dirichlet Distribution (GDD) model [14] to map each feature vector into its Fisher score. The Fisher kernel function derived from the GDD then replaces the Gaussian kernel in the classical SVM.

Let (*X*1,..., *XN*) denote a collection of *N* multi-temporal hyperspectral pixels. Each sample *Xi* is assumed to have size *dim*, *Xi* = (*Xi*1,..., *Xidim*), and to be drawn from the following finite mixture model :

$$p(X\_i/\theta) = \sum\_{j=1}^{M} p(X\_i/j, \theta\_j) P(j) \tag{26}$$

where *M* is the number of components, the *P*(*j*), with 0 < *P*(*j*) < 1 and $\sum\_{j=1}^{M} P(j) = 1$, are the mixing proportions, and *p*(*X*/*j*, *θj*) is the probability density function (PDF). *θ* is the set of parameters to be estimated : *θ* = (*α*1,..., *αM*, *P*(1),..., *P*(*M*)).

If the random vector *X* = (*Xi*1,..., *Xidim*) follows a Dirichlet distribution, the joint density function is given by :

$$p(X\_{i1}, \dots, X\_{i\dim}) = \frac{\Gamma(|\alpha|)}{\prod\_{i=1}^{\dim + 1} \Gamma(\alpha\_i)} \prod\_{i=1}^{\dim + 1} X\_i^{\alpha\_i - 1} \tag{27}$$

where $|\alpha| = \sum\_{i=1}^{\dim+1} \alpha\_i$ and $X\_{\dim+1} = 1 - \sum\_{i=1}^{\dim} X\_i$.
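As a quick numerical check of Equation (27) (using a plain Dirichlet with assumed parameters, not the generalized Dirichlet mixture of [14]), the log-density can be evaluated directly and compared against SciPy's implementation:

```python
import numpy as np
from scipy.special import gammaln
from scipy.stats import dirichlet

def dirichlet_logpdf(x, alpha):
    """log of Eq. (27): Gamma(|alpha|) / prod Gamma(alpha_i) * prod x_i^(alpha_i - 1),
    computed in log space for numerical stability."""
    return (gammaln(alpha.sum()) - gammaln(alpha).sum()
            + np.sum((alpha - 1) * np.log(x)))

alpha = np.array([2.0, 3.0, 4.0])   # dim + 1 = 3 parts (assumed values)
x = np.array([0.2, 0.3, 0.5])       # proportions summing to 1
same = np.isclose(dirichlet_logpdf(x, alpha), dirichlet.logpdf(x, alpha))
print(same)                          # True
```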

Since each feature vector *X* may have an arbitrary dimension, the proposed method defines the fusion as a projection from one feature vector space (spectral bands) to another with a fixed dimensionality. Accordingly, the feature-level fusion is done by projecting the combined vector *X* into a single vector in the Fisher space. Thus, the generative model influences the final classification result through the projection of the extracted features into this new space.

An SVM classifier is then used to classify the fused features and the multi-temporal dataset of images. Given the generative model obtained by the GDD with parameters *θ*, we compute for each sample *X* the Fisher score $U\_X = \nabla\_{\theta} \log P(X \mid \theta)$ (the gradient of the log-likelihood of *X* for model *θ*). The Fisher kernel operates in the gradient space of the generative model and provides a natural similarity measure between data samples. For each sample, this score is a vector of fixed dimensionality. Using this score, the Fisher information matrix is defined as $\mathbf{I} = E\_X [U\_X^{T} U\_X]$. After Fisher score normalization, we compute the Fisher kernel function on the basis of the scores of the new sample and the training samples :
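The score, information matrix, and kernel of Equation (28) can be sketched numerically. The sketch below uses a plain Dirichlet as a stand-in generative model, finite-difference gradients in place of analytic ones, and synthetic samples; all parameter values are assumptions:

```python
import numpy as np
from scipy.stats import dirichlet

def fisher_score(x, alpha, eps=1e-5):
    """U_x = grad_theta log p(x | theta), by central finite differences."""
    g = np.zeros_like(alpha)
    for k in range(alpha.size):
        d = np.zeros_like(alpha)
        d[k] = eps
        g[k] = (dirichlet.logpdf(x, alpha + d)
                - dirichlet.logpdf(x, alpha - d)) / (2 * eps)
    return g

rng = np.random.default_rng(1)
alpha = np.array([2.0, 3.0, 4.0])
X = rng.dirichlet(alpha, size=50)                    # synthetic training samples
U = np.array([fisher_score(x, alpha) for x in X])    # Fisher scores, one row per sample
I = (U[:, :, None] * U[:, None, :]).mean(axis=0)     # empirical Fisher information E[U^T U]
K = U @ np.linalg.inv(I) @ U.T                       # Fisher kernel matrix (Eq. 28)
print(K.shape)                                       # (50, 50)
```

The resulting matrix is symmetric and positive semi-definite, so it can be fed to any kernel classifier.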

$$K(X, X') = U\_X \, \mathbf{I}^{-1} U\_{X'}^{T} \tag{28}$$

In the second stage, suppose our training set *S* consists of labeled input vectors (*Xi*, *zi*), *i* = 1, . . . , *m*, where *Xi* ∈ **R***<sup>n</sup>* and *zi* ∈ {±1}. Given a kernel matrix and a label *zi* for each sample, the SVM proceeds to learn a classifier of the form,

$$z(X) = \operatorname{sign}\left(\sum\_{i} \alpha\_i z\_i K(X\_i, X)\right) \tag{29}$$

where the coefficients *α<sup>i</sup>* are determined by solving a constrained quadratic program which aims to maximize the margin between the classes. In our experiments we used the LIBSVM package. Our research deals with a multi-class problem, so the One-vs-One approach is adopted to extend the proposed approach to multi-temporal hyperspectral classification.
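This second stage can be sketched with scikit-learn's `SVC` (a LIBSVM wrapper) and a precomputed Fisher-style kernel; the "scores" below are synthetic two-class stand-ins, and all sizes and thresholds are assumptions:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(2)

# Synthetic stand-ins for Fisher scores: two separated blobs in score space.
U_train = np.vstack([rng.normal(0, 1, (30, 3)), rng.normal(3, 1, (30, 3))])
z_train = np.array([0] * 30 + [1] * 30)
U_test = np.vstack([rng.normal(0, 1, (10, 3)), rng.normal(3, 1, (10, 3))])
z_test = np.array([0] * 10 + [1] * 10)

I_inv = np.linalg.inv((U_train[:, :, None] * U_train[:, None, :]).mean(axis=0))

# K(X, X') = U_X I^{-1} U_X'^T (Eq. 28), passed to the SVM as a precomputed kernel.
K_train = U_train @ I_inv @ U_train.T              # (n_train, n_train)
K_test = U_test @ I_inv @ U_train.T                # (n_test, n_train)

# decision_function_shape="ovo" mirrors LIBSVM's One-vs-One multi-class scheme.
clf = SVC(kernel="precomputed", decision_function_shape="ovo").fit(K_train, z_train)
acc = clf.score(K_test, z_test)
print(acc)
```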

#### **6. Results and discussion**

14 Image Fusion


The proposed approach was tested on two different data sets involving several types of information, with dimensions ranging from 176 to 183 bands. The first dataset, *Hyperion*, contains vegetation-type data, is divided into five classes, has 183 spectral bands and a pixel size of 30 *m*. The second set is from an airborne sensor (*AVIRIS*), is divided into 7 classes, has 176 spectral bands and a pixel size of 18 *m*. First, we present experiments that assess the classification accuracy of the proposed approach (PA). We also included the direct SVM fusion and a probabilistic fusion approach in our comparison as baselines. Figure 2 summarizes the results obtained. At each level of label noise we carry out four experiments, and the figures show the mean performance. The strength of this approach is that it combines the rich modeling power of the GDD with the discriminative power of the SVM algorithm.

(a) Overall accuracy of the EKFD [Both two sets]

(b-1) Map of ground truth

(b-2) Result of classification with EKFD [First set]

(c) Overall accuracy of the EKFD [Two sets]

**Figure 2.** Experimental results.

#### **Author details**

Selim Hemissi and Imed Riadh Farah

RIADII-SIIVT, Tunisia

#### **References**

[1] Christopher J. C. Burges. A tutorial on support vector machines for pattern recognition. *Data Mining and Knowledge Discovery*, 2(2):121–167, June 1998.

[2] Ilkay Ulusoy and Christopher M. Bishop. Comparison of generative and discriminative techniques for object detection and classification. In *Toward Category-Level Object Recognition*, pages 173–195, 2006.

[3] John Paisley and Lawrence Carin. Dirichlet process mixture models with multiple modalities. In *IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)*, pages 1613–1616, 2009.

[4] Nello Cristianini and John Shawe-Taylor. *An Introduction to Support Vector Machines*. Cambridge University Press, Cambridge, UK, 2000.

[5] Trevor Hastie, Saharon Rosset, Robert Tibshirani, and Ji Zhu. The entire regularization path for the support vector machine. *Journal of Machine Learning Research*, 5:1391–1415, October 2004.

[6] Chih-Jen Lin. Formulations of support vector machines: a note from an optimization point of view. *Neural Computation*, 13(2):307–317, 2001.

[7] Trevor Hastie, Robert Tibshirani, and Jerome Friedman. *The Elements of Statistical Learning: Data Mining, Inference, and Prediction*. Springer-Verlag, 2001.

[8] Alexander Smola and Bernhard Schölkopf. On a kernel-based method for pattern recognition, regression, approximation and operator inversion. *Algorithmica*, 22(1–2):211–231, 1998.

[9] Alexander Smola, Bernhard Schölkopf, and Klaus-Robert Müller. The connection between regularization operators and support vector kernels. *Neural Networks*, 11:637–649, 1998.

[10] T. Evgeniou, M. Pontil, and T. Poggio. Regularization networks and support vector machines. *Advances in Computational Mathematics*, 13(1):1–50, 2000.

[11] T. Hastie, A. Buja, and R. Tibshirani. Penalized discriminant analysis. *Annals of Statistics*, 23:73–102, 1995.

[12] Imed Riadh Farah, Selim Hemissi, Karim Saheb Ettabaa, and Bassel Souleiman. Multi-temporal hyperspectral images unmixing and classification based on 3D signature model and matching. *PIERS Online*, 6:480–484, 2010.

[13] Y. Wang and C. Chua. Face recognition from 2D and 3D images using 3D Gabor filters. *Image and Vision Computing*, 23(11):1018–1028, 2005.

[14] Nizar Bouguila and Djemel Ziou. A Dirichlet process mixture of generalized Dirichlet distributions for proportional data modeling. *IEEE Transactions on Neural Networks*, 21(1):107–122, 2010.


#### 36 New Advances in Image Fusion

**Chapter 3**


## **Multi-Frequency Image Fusion Based on MIMO UWB OFDM Synthetic Aperture Radar**

Md Anowar Hossain, Ibrahim Elshafiey and Majeed A. S. Alkanhal

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/56943

#### **1. Introduction**


The principal idea behind synthetic aperture radar (SAR) stems from the desire for high-resolution images. SAR transmits signals at spaced intervals called the pulse repetition interval (PRI). The responses at each PRI are collected and processed to reconstruct a radar image of the terrain [1]. In general, high-resolution SAR images in the range domain are generated using ultra-wideband (UWB) waveforms as the radar transmitted pulse [2]. UWB pulses (500 MHz bandwidth and above) can enhance the range resolution considerably. UWB technology has dual advantages: good penetration capacity and high-resolution target detection in the range domain for radar applications [3].

Orthogonal frequency division multiplexing (OFDM), a modulation scheme commonly utilized in commercial communications, shows great potential for use in forming radar waveforms. An OFDM signal comprises several orthogonal subcarriers, which are simultaneously emitted over a single transmission path. Each subcarrier occupies a small slice of the entire signal bandwidth [4]. Technology advances have increased sampling speed capabilities, allowing accurate generation of UWB-OFDM waveforms. This results in a diverse signal that is capable of high-resolution imaging. While OFDM has been elaborately studied and commercialized in the digital communication field, it has not yet been widely studied by the radar scientific community, apart from a few efforts [5-7]. The advantages of using OFDM in radar applications include: (a) the transceiver system is based on digital implementation using relatively inexpensive components; (b) ease of narrowband interference mitigation; (c) high resolution in the UWB scale and good multi-path potential; (d) the same architecture can be used to transmit large amounts of data in real time; and (e) flexibility in pulse shaping using different subcarrier compositions.

© 2013 Hossain et al.; licensee InTech. This is a paper distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Although SAR is a well-known remote sensing application which obtains high resolution in the range domain by transmitting a wide-band waveform, and high resolution in the azimuth domain by utilizing the relative motion between the target and the radar platform, current single-antenna SAR cannot provide certain remote sensing capabilities, such as simultaneous high-resolution and wide-swath imaging. Multiple-Input Multiple-Output (MIMO) SAR resolves these problems and provides the following advantages compared to traditional SAR: diversity in viewing angles on a particular target to improve identifiability, and increased azimuth resolution or decreased pulse repetition frequency (PRF), which results in a wider swath. Due to the larger number of degrees of freedom of a MIMO system, enhanced resolution can be achieved by coherent processing of multiple waveforms at multiple receivers simultaneously.


| | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | ⋯ | *N* |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| *Ψω*1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ | |
| *Ψω*2 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | ⋯ | |
| *Ψω*3 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | ⋯ | |
| *Ψω*4 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | ⋯ | |


**Table 1.** OFDM frequency-domain sample vector generation

**Figure 1.** OFDM signal spectrum


Several research works have been reported in recent years to overcome the trade-off between wide swath and azimuth resolution in conventional SAR systems [8]. However, MIMO SAR systems have so far been investigated mainly by generalizing the theoretical modeling of MIMO communication systems, and have only recently been discussed in the radar community [9, 10]. The configuration of the MIMO-SAR system proposed in this chapter consists of two co-located transmitters and two receivers, along with an image fusion technique.

In remote sensing applications, the increasing availability of spaceborne sensors motivates the use of image fusion algorithms. Several situations in image processing require high spatial and high spectral resolution in a single image. Image fusion is the process of combining relevant information from two or more images into a single image. The resulting image will be more informative than any of the input images [11].

The structure of the chapter is as follows. UWB-OFDM pulse shaping for MIMO SAR is described in section 2, while the comparison of auto-correlation and cross-correlation of different pulses from a radar perspective is presented in section 3. Detailed analysis of the MIMO wide-swath SAR system and its functionality is discussed in section 4. MIMO wide-swath SAR imaging results based on UWB-OFDM waveforms are presented in section 5. Section 6 presents the optimized SAR image based on the image fusion technique. Final conclusions are provided in section 7.

#### **2. MIMO UWB OFDM signal generation**

A widely studied approach in MIMO architecture involves the transmission of orthogonal signals from different antennas. This makes it possible to separate the reflected signals from the target arriving at the receiver. In particular, we develop a procedure to design the optimal waveform that ensures orthogonality by imposing the rules shown in Table 1. The key to our approach is to use a model for the received radar signals that explicitly includes the transmitted waveforms. To achieve lower cross-correlation between transmitted pulses with a common bandwidth for the same range resolution, the OFDM frequency-domain sample vector for *N* subcarriers is generated using the sequences shown in Table 1. These sequences generate the orthogonal signals through the placement of 1's and 0's: each column contains at most a single 1, and every second column is filled with 0's to prevent over-sampling. The spectrum of an OFDM signal is shown in Figure 1, where the width of the main lobe depends on the duration of the pulse. In the digital implementation of an OFDM signal, the pulse duration is related to the number of subcarriers: as the number of subcarriers increases, the duration of the pulse increases.
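The placement rule of Table 1 can be sketched as follows; the generator below reproduces the interleaved 1's of the four sequences (for the 16 columns shown in the table) and verifies their orthogonality:

```python
import numpy as np

def spreading_sequences(n_seq=4, N=16):
    """Frequency-domain spreading sequences per Table 1: sequence i places 1's
    at subcarriers 2*i, 2*i + 2*n_seq, ... (0-based); odd subcarriers stay 0."""
    psi = np.zeros((n_seq, N), dtype=int)
    for i in range(n_seq):
        psi[i, 2 * i::2 * n_seq] = 1
    return psi

psi = spreading_sequences()
print(psi)
# Disjoint supports => the Gram matrix psi @ psi.T is diagonal (orthogonal rows).
```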



As an example, an OFDM signal is generated according to the scheme shown in Figure 2 by spreading the digital frequency-domain vector shown in Table 1 with modulation symbols from a random integer generator. The order of modulation (*M*) is chosen as 4 for QPSK. The Inverse Fast Fourier Transform (IFFT) is then applied to obtain the discrete time-domain OFDM signal, and finally a Hanning window is imposed to minimize the side-lobes. The time-domain OFDM signal is given as

$$\Psi\_{\mathrm{tx}i}(t) = \mathcal{F}^{-1}\{\Psi\_{\omega i}\} \, w(n), \qquad i = 1, 2, \cdots, 4 \tag{1}$$


where the Hanning window is *w*(*n*) = 0.5{1 − cos(2*πn*/*N*)}, 0 ≤ *n* ≤ *N* − 1, and *N* is the number of subcarriers. The term *Ψωi* denotes the spreading sequence for the *i*-th sub-pulse. Each antenna transmits two sub-pulses simultaneously.

UWB-OFDM waveforms are generated using the following parameters: number of OFDM subcarriers *N* = 256 and sampling time *Δt<sup>s</sup>* = 1 ns, which results in a baseband bandwidth *B*0 = 1/(2*Δts*) = 500 MHz, with the division by two satisfying the Nyquist criterion. The UWB-OFDM waveform in the frequency domain and in the time domain is shown in Figure 3 and Figure 4, respectively. We can observe that the Hanning window reasonably minimizes the side-lobes, which in turn improves the auto-correlation function (ACF) and cross-correlation function (CCF) of the time-domain OFDM waveforms, as shown in Figure 5 and Figure 6, respectively.
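The generation chain (Table 1 spreading, IFFT, Hanning window per Equation 1) and its effect on the correlation peaks can be sketched as follows; modulation symbols are omitted and all amplitudes are illustrative, so this is a simplified stand-in for the chapter's full scheme:

```python
import numpy as np

N = 256                                        # number of OFDM subcarriers
n = np.arange(N)
w = 0.5 * (1 - np.cos(2 * np.pi * n / N))      # Hanning window w(n)

def spread(i, n_seq=4):
    """Frequency-domain vector Psi_omega_i per Table 1 (interleaved 1's on even bins)."""
    psi = np.zeros(N)
    psi[2 * i::2 * n_seq] = 1.0
    return psi

# Eq. (1): time-domain sub-pulse = IFFT of the spreading vector, then windowed.
pulses = [np.fft.ifft(spread(i)) * w for i in range(4)]

# Sub-pulses occupy (nearly) disjoint subcarriers, so the cross-correlation peak
# stays below the auto-correlation peak even after windowing.
acf_peak = np.abs(np.correlate(pulses[0], pulses[0], "full")).max()
ccf_peak = np.abs(np.correlate(pulses[0], pulses[1], "full")).max()
print(ccf_peak < acf_peak)                     # True
```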

**Figure 3.** UWB-OFDM waveform in frequency-domain (a) before windowing (b) after windowing


**Figure 4.** UWB-OFDM waveform in time-domain (a) before windowing (b) after windowing


**Figure 2.** OFDM signal generator


**Figure 5.** Auto-correlation function (ACF) of OFDM pulses in time domain (a) before windowing (b) after windowing

**Figure 6.** Cross-correlation function (CCF) of OFDM pulses in time domain (a) before windowing (b) after windowing
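Equation (1) and the windowing step can be sketched in a few lines of NumPy. This is an illustrative sketch, not the chapter's implementation: the random ±1 chips below stand in for the spreading sequences *Ψωi*, which are not reproduced in this section.

```python
import numpy as np

N = 256                      # number of OFDM subcarriers (from the text)
dt = 1e-9                    # sampling time, 1 ns -> 500 MHz baseband bandwidth

# Hypothetical spreading sequences: four rows of random +/-1 chips stand in
# for the chapter's orthogonal sequences Psi_omega_i.
rng = np.random.default_rng(0)
psi_omega = rng.choice([-1.0, 1.0], size=(4, N))

# Hanning window w(n) = 0.5*(1 - cos(2*pi*n/N)), 0 <= n <= N-1
n = np.arange(N)
w = 0.5 * (1.0 - np.cos(2.0 * np.pi * n / N))

# Eq. (1): each time-domain sub-pulse is the inverse FFT of the windowed
# spreading sequence.
pulses = np.fft.ifft(psi_omega * w, axis=1)

# Windowing lowers the side-lobes of the auto-correlation function (ACF).
acf = np.correlate(pulses[0], pulses[0], mode="full")
print(acf.shape)  # -> (511,)
```

Plotting `np.abs(acf)` with and without the window applied reproduces the side-lobe reduction illustrated in Figure 5.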

#### **3. Comparison of auto-correlation and cross-correlation**

Cross-correlation is a measure of the similarity between two different sequences and is given as

$$R_{xy}(m) = \begin{cases} \sum_{n=0}^{N-m-1} x_{n+m}\, y_n^* & m \ge 0 \\ R_{yx}^*(-m) & m < 0 \end{cases} \tag{2}$$


where *xn* and *yn* are the elements of two different sequences with period *N*. The auto-correlation, which measures the similarity between a sequence and its cyclically shifted copy, is obtained from equation (2) as the special case *x* = *y* [12].
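Equation (2) translates directly into code. The sketch below is a naive O(N²) implementation for illustration only; the length-4 test sequence is an arbitrary example, not one of the chapter's sequences.

```python
import numpy as np

def cross_corr(x, y):
    # Eq. (2): R_xy(m) = sum_{n=0}^{N-m-1} x_{n+m} * conj(y_n) for m >= 0,
    # and R_xy(m) = conj(R_yx(-m)) for m < 0.
    N = len(x)
    R = {}
    for m in range(N):
        R[m] = sum(x[n + m] * np.conj(y[n]) for n in range(N - m))
    for m in range(1, N):
        # negative lags via the symmetry relation in eq. (2)
        R[-m] = np.conj(sum(y[n + m] * np.conj(x[n]) for n in range(N - m)))
    return R

x = np.array([1.0, -1.0, 1.0, 1.0])
R = cross_corr(x, x)   # special case x = y gives the auto-correlation
print(R[0])            # zero-lag ACF equals the sequence energy -> 4.0
```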

**Figure 7.** Ideal Walsh-Hadamard sequences (a) ACF (b) CCF

The auto-correlation and cross-correlation properties of the sequences used in generating the transmitted waveform play an important role in high-resolution SAR imaging based on a MIMO architecture. In practice, low cross-correlation between waveforms avoids interference, which yields independent information gains from the target signature at various angles. Similarly, a low auto-correlation peak side-lobe ratio ensures high resolution in the range domain. Thus, waveforms with low cross-correlation and low auto-correlation peak side-lobes are desired for MIMO SAR systems. Sequences with good auto-correlation properties provide high-resolution target detection, and low cross-correlation mitigates the interference from nearby sensors.


Orthogonality is the most important property of Walsh-Hadamard sequences [12]. Because of this property, the cross-correlation between any two codes of the same set is zero, as shown in Figure 7. Unfortunately, Walsh sequences are orthogonal only under perfect synchronization, and have non-zero off-peak auto-correlations and cross-correlations in the asynchronous case. To compare the performance of OFDM signals using Walsh-Hadamard sequences and the proposed orthogonal sequences for radar applications, we can analyze the ACF and CCF by assuming a point target at the center of the target area.
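The synchronization sensitivity of Walsh codes is easy to verify numerically. The sketch below builds an 8 × 8 Walsh-Hadamard matrix (Sylvester construction) and compares the synchronized cross-correlation with an off-peak cyclic auto-correlation; the code length 8 is an arbitrary illustrative choice, not a parameter from the chapter.

```python
import numpy as np

def hadamard(n):
    # Sylvester construction of an n x n Walsh-Hadamard matrix (n a power of two).
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H

H = hadamard(8)
a, b = H[2], H[4]   # two different Walsh codes from the same set

# Perfect synchronization: codes of the same set are orthogonal.
print(int(np.dot(a, b)))                 # -> 0

# Asynchronous case: a one-chip cyclic shift gives a non-zero off-peak ACF.
print(int(np.dot(b, np.roll(b, 1))))     # -> 4
```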

**Figure 8.** Point target profile using Walsh-Hadamard sequences and proposed orthogonal sequences (a) ACF of Walsh-Hadamard (b) ACF of proposed orthogonal sequences (c) CCF of Walsh-Hadamard (d) CCF of proposed orthogonal sequences

Figure 8(a) and Figure 8(b) show the auto-correlation, while Figure 8(c) and Figure 8(d) show the cross-correlation for a point target using Walsh-Hadamard sequences and the proposed orthogonal sequences, respectively. The auto-correlation is measured as the correlation between the received signal and the transmitted signal at the same antenna, while the cross-correlation is measured as the correlation between the signal transmitted by one antenna and the signal received by another antenna. We can observe a significant improvement in ACF and CCF for the point target profile using the proposed orthogonal pulses in comparison with the Walsh-Hadamard sequences. In MIMO SAR, the ACF between the transmitted and received signals of the same antenna should have a narrow main-lobe width for high resolution, while low CCF between the signal transmitted by one antenna and the signal received by another antenna is needed to avoid interference from nearby sensors. The main-lobe width of the proposed sequence shown in Figure 8(b) is considerably narrower than that in Figure 8(a), which in turn improves the range resolution. In the case of the CCF, since all cross-correlation values, not just the peak values, affect system performance, the mean cross-correlation value is the appropriate measure. The mean CCF of the proposed sequence shown in Figure 8(d) is much lower than that of the Walsh-Hadamard sequence shown in Figure 8(c).

#### **4. MIMO wide-swath SAR imaging system**

In MIMO SAR, independent signals are transmitted through different antennas, and these signals are received by multiple antennas after propagating through the environment. Each antenna transmits a unique waveform, orthogonal to the waveforms transmitted by the other antennas, so the returns of each orthogonal signal carry independent information about the targets. In the receiver, a matched-filter bank is used to extract the orthogonal waveform components. Consider a MIMO SAR system with a transmit array of 2 co-located antennas and a receive array (possibly the same array) of 2 co-located antennas. Suppose both the transmit and receive arrays are close to each other in space but see different target areas in different directions. Figure 9 shows the MIMO wide-swath stripmap SAR imaging topology and Figure 10 shows the block diagram of the MIMO OFDM SAR imaging system.

Antenna beams A and B illuminate swaths *A* and *B*, respectively. At a specific PRI, *TxA* transmits pulse *ΨtxA*(*t*) via antenna beam *A*, while *TxB* transmits pulse *ΨtxB*(*t*) via antenna beam B at the same time. Echoes from swaths *A* and *B* will exist at both receivers. To separate the echoes from swaths *A* and *B*, a careful design of the transmit antenna pattern as well as the transmitted pulse is required; this can further reduce the disturbance echoes from the temporarily undesired swath.

The OFDM signal generator produces signals according to the scheme shown in Figure 2. Details of the block-diagram components in Figure 10, such as the D/A converter, mixer, and power amplifier, can be found in [5]. We consider four typical orthogonal sub-pulses based on the sample vectors shown in Table 1. Two different signals *ΨtxA*(*t*) and *ΨtxB*(*t*) are transmitted simultaneously from antennas *A* and *B*, respectively, at each PRI, where each signal is the combination of two sub-pulses given as

$$\Psi_{txA}(t) = \Psi_{tx1}(t) + \Psi_{tx2}(t) \tag{3}$$

$$\Psi_{txB}(t) = \Psi_{tx3}(t) + \Psi_{tx4}(t) \tag{4}$$

**Figure 9.** MIMO stripmap wide-swath SAR imaging topology

**Figure 10.** MIMO OFDM SAR imaging system


The received signal at antenna *A* is given by

$$\Psi_{rxA}(t,\ u) = \alpha\left[\sum_{n=1}^{N} \sigma_n \Psi_{txA}(t - t_{dnA})\right] + \beta\left[\sum_{n=1}^{N} \sigma_n \Psi_{txB}(t - t_{dnB})\right] + \eta_A(t) \tag{5}$$


Similarly, the received signal at antenna *B* is given as

$$\Psi_{rxB}(t,\ u) = \alpha\left[\sum_{n=1}^{N} \sigma_n \Psi_{txB}(t - t_{dnB})\right] + \beta\left[\sum_{n=1}^{N} \sigma_n \Psi_{txA}(t - t_{dnA})\right] + \eta_B(t) \tag{6}$$

where *α* and *β* are scale factors chosen as 1/2 and 1/10, respectively. The scale factor *α* distributes the total power between the two sub-pulses, and *β* models the out-of-beam signal. The term $t_{dnA} = \frac{2}{c}\sqrt{(X_{cA} + x_n)^2 + (y_n - u)^2}$ is the time delay associated with the target position (*xn*, *yn*) in swath *A*, and $t_{dnB} = \frac{2}{c}\sqrt{(X_{cB} + x_n)^2 + (y_n - u)^2}$ is the time delay associated with swath *B*. *XcA* and *XcB* denote the range distances to the centers of swaths *A* and *B*, respectively; *n* = 1, 2, 3, …, *N* indexes the targets within the antenna beam at a given synthetic aperture position *u* in the azimuth direction, while *σn* denotes the reflectivity of the *n*th target. The terms *ηA*(*t*) and *ηB*(*t*) denote additive white Gaussian noise.
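As a quick numerical check of the delay terms, the sketch below evaluates *tdnA* for a hypothetical geometry. The swath-center range *XcA* = 1000 m is an assumed value; the target coordinates (300, 100) are taken from the simulation in Section 5.

```python
import math

c = 3.0e8                     # speed of light (m/s)

def time_delay(Xc, x_n, y_n, u):
    # t_dn = (2/c) * sqrt((Xc + x_n)^2 + (y_n - u)^2): two-way travel time
    # from the aperture position u to a target at (x_n, y_n) in a swath
    # centered at range Xc.
    return (2.0 / c) * math.hypot(Xc + x_n, y_n - u)

# Hypothetical numbers: swath-center range 1 km, target (300, 100) from Section 5.
print(time_delay(1000.0, 300.0, 100.0, 0.0))   # about 8.69 microseconds
```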

Next, the received radar echoes must be separated by matched filtering. As the transmitted signal matrix is known to both the transmitter and the receiver, and the transmitted waveforms are designed to be orthogonal, the waveforms should satisfy the conditions

$$\int_0^{T_p} \Psi_{rxm}(t)\, \Psi_{txn}^*(t)\, dt = \begin{cases} \delta(t), & m = n \\ 0, & m \neq n \end{cases} \tag{7}$$

where *Tp* is the sub-pulse duration and (.)\* denotes the conjugate operator. At receiving antenna *A*, the two received orthogonal sub-pulses can be extracted by two matched filters, given by

$$\Psi_{MFn}(t) = \mathcal{F}^{-1}\left[\mathcal{F}\{\Psi_{rxA}(t)\} \cdot \mathcal{F}\{\Psi_{txn}^*(t)\}\right] \tag{8}$$

Similarly, at receiving antenna *B*, two sub-pulses can be separated as

$$\Psi_{MFn}(t) = \mathcal{F}^{-1}\left[\mathcal{F}\{\Psi_{rxB}(t)\} \cdot \mathcal{F}\{\Psi_{txn}^*(t)\}\right] \tag{9}$$

where *n* = 1, 2 for equation (8) and *n* = 3, 4 for equation (9), while *Ϝ* -1 and *Ϝ* denote the inverse Fourier transform and the Fourier transform, respectively. Therefore, echoes from different swaths can be considered well separated after matched filtering.
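The matched-filter bank of equations (8) and (9) can be sketched with FFTs. The random complex waveforms below are stand-ins for the actual orthogonal sub-pulses (defined by the spreading sequences in Table 1, not reproduced here); being nearly uncorrelated, they are enough to show the separation.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 256

# Stand-ins for two orthogonal transmit sub-pulses: random complex waveforms
# are nearly uncorrelated, which suffices to illustrate the filter bank.
psi_tx1 = rng.standard_normal(N) + 1j * rng.standard_normal(N)
psi_tx2 = rng.standard_normal(N) + 1j * rng.standard_normal(N)

# Received signal: both sub-pulses overlap in time (eq. (5) with a single
# target at zero delay and no noise).
psi_rx = psi_tx1 + psi_tx2

def matched_filter(rx, ref):
    # Eqs. (8)-(9): correlate in the frequency domain,
    # F^-1[ F{rx} . F{ref}* ].
    return np.fft.ifft(np.fft.fft(rx) * np.conj(np.fft.fft(ref)))

out1 = matched_filter(psi_rx, psi_tx1)
out2 = matched_filter(psi_rx, psi_tx2)

# Each filter peaks at the (zero) delay of its own sub-pulse; the other
# sub-pulse contributes only a low cross-correlation floor.
print(np.argmax(np.abs(out1)), np.argmax(np.abs(out2)))  # -> 0 0
```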

Finally, we will have a total of four extracted signals from the two receiving antennas. Compared to traditional phased-array SAR, where the same waveform is used at all transmitting antennas and a total of 2 coefficients are obtained from matched filtering, the MIMO OFDM SAR gives more coefficients and therefore provides more degrees of freedom [8]. Each matched-filter output is then processed separately using a SAR imaging algorithm such as the Range-Doppler algorithm, and an image fusion technique is then applied to obtain the final SAR reconstructed image, as described in the following sections.

#### **5. MIMO Wide-swath SAR Imaging**

The scenario involves wide-swath SAR imaging with four distinct orthogonal UWB-OFDM sub-pulses as the SAR transmitted signals using two antennas. The objective is to investigate the performance of the proposed orthogonal waveforms in a MIMO architecture. Let us consider two point targets residing in swath A at positions [(x1, y1), (x2, y2)] = [(300, 100), (900, -50)] and two point targets at positions [(x3, y3), (x4, y4)] = [(300, -50), (900, 100)] in swath B. The stripmap SAR imaging topology is considered for raw-data generation based on the proposed UWB-OFDM waveforms [5], while the Range-Doppler algorithm is used for SAR image reconstruction [13, 14]. Processing of SAR raw data from multiple antennas can be done in parallel; a Field Programmable Gate Array (FPGA) is a powerful tool for real-time implementation of SAR image reconstruction from raw data [15, 16]. Figure 11 and Figure 12 show the resolved images of swath A based on the outputs of matched filters 1 and 2, respectively, while Figure 13 and Figure 14 show the reconstructed images of swath B based on the outputs of matched filters 3 and 4.

#### **6. Image fusion**


Observing a given scene from two SAR antennas with distinct trajectories allows one to determine the position of the scattering points. Unfortunately, SAR interferometry fails when the scenes imaged by the two antennas are not really the same scene because the distance between the trajectories of the two SAR antennas is too large; in such cases, the two images may not be sufficiently correlated. SAR image fusion is presented here exploiting the data recorded by the same antennas about the same scene using two sub-pulses simultaneously. The usefulness of the fusion technique is evaluated by estimating the noise level of the non-fused and fused images in terms of entropy. In addition, the behavior of the back-scatterer as a function of frequency changes with the surface type. Therefore, if images acquired in many regions of the spectrum are fused, the output image will carry useful information about specific back-scatterers. Furthermore, the fusion of multi-frequency images allows us to combine the information acquired about the observed object in many spectral bands within the same spatial context. Complementary information about the same observed scene can be available in the following cases:

- data recorded by the same sensor scanning the same scene at different dates (multi-temporal image fusion);
- data recorded by the same sensor operating in different spectral bands (multi-frequency image fusion);
- data recorded by the same sensor at different polarizations (multi-polarization image fusion);
- data recorded by the same sensor located on platforms flying at different heights (multi-resolution image fusion).


Many methods exist to perform image fusion. The most basic one is based on the discrete wavelet transform (DWT), which has become a very useful tool for fusion. The DWT is a wavelet transform for which the wavelets are discretely sampled. As with other wavelet transforms, a key advantage it has over Fourier transforms is temporal resolution: it captures both frequency and location (in time) information. Figure 15 shows the block diagram of the wavelet-transform-based image fusion technique. The principle of image fusion using wavelets is to merge the wavelet decompositions of the two original images by applying fusion rules to the approximation and detail coefficients [11]. The DWT is a spatial-frequency decomposition that provides a flexible multi-resolution analysis of an image. The inverse discrete wavelet transform (IDWT) is applied to the combined coefficient map to produce the fused image from the two input images.

**Figure 13.** Reconstructed image of matched filter 3 (swath B)


**Figure 11.** Reconstructed image from matched filter 1 (swath A)

**Figure 12.** Reconstructed image from matched filter 2 (swath A)

In all wavelet-based image fusion schemes, the DWTs of the two registered input images *I1(x, y)* and *I2(x, y)* are computed, and these transforms are combined using some kind of fusion rule. Then the inverse discrete wavelet transform (IDWT) is computed and the fused image *I(x, y)* is reconstructed as

$$I(x,\ y) = W^{-1}\left[\phi\{W(I_1(x,\ y)),\ W(I_2(x,\ y))\}\right] \tag{10}$$

where *W* and *W −1* denote the DWT and IDWT, respectively. The term *ϕ* denotes the fusion rules imposed, such as the wavelet function, decomposition level, and treatment of approximation and detail coefficients. Figure 16 shows the single-level decomposition of the image shown in Figure 14 using the Haar wavelet function.

**Figure 14.** Reconstructed image of matched filter 4 (swath B)

**Figure 16.** Single level decomposition (a) Approximation (b) Horizontal detail (c) Vertical detail (d) Diagonal detail


**Figure 17.** Fused image of swath A

**Figure 15.** Wavelet transform based image fusion

Figure 17 shows the fused image obtained from the reconstructed images of matched filters 1 and 2, while Figure 18 shows the fused image obtained from the reconstructed images of matched filters 3 and 4. The fusion outputs shown in Figure 17 and Figure 18 are achieved by taking the 'maximum' for the 'approximations' and the 'minimum' for the 'details' using a level-5 decomposition with the Haar wavelet. The Haar wavelet is chosen because of its simplicity and good reconstruction capability. Since wavelet coefficients with large absolute values carry the information about the salient features of the images, such as edges and lines, a good fusion rule is to choose the 'maximum' for the 'approximation' values, while the 'minimum' is chosen for the 'details' to suppress noise. The final reconstructed wide-swath SAR image, shown in Figure 19 with all resolved point targets of swaths A and B, is the horizontal concatenation of the fused images of Figure 17 and Figure 18.
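A single-level version of this fusion rule can be sketched directly in NumPy. This is an illustrative sketch only: the chapter uses a level-5 decomposition of real SAR images, while one Haar level and toy 8 × 8 images are used here to keep the code short.

```python
import numpy as np

def haar_dwt2(img):
    # Single-level 2-D Haar transform: average/difference across column pairs,
    # then across row pairs.
    a = (img[:, 0::2] + img[:, 1::2]) / 2.0        # row-wise approximation
    d = (img[:, 0::2] - img[:, 1::2]) / 2.0        # row-wise detail
    return ((a[0::2] + a[1::2]) / 2.0,             # LL (approximation)
            (a[0::2] - a[1::2]) / 2.0,             # LH detail
            (d[0::2] + d[1::2]) / 2.0,             # HL detail
            (d[0::2] - d[1::2]) / 2.0)             # HH detail

def haar_idwt2(LL, LH, HL, HH):
    a = np.empty((LL.shape[0] * 2, LL.shape[1]))
    a[0::2], a[1::2] = LL + LH, LL - LH            # undo the row-pair stage
    d = np.empty_like(a)
    d[0::2], d[1::2] = HL + HH, HL - HH
    out = np.empty((a.shape[0], a.shape[1] * 2))
    out[:, 0::2], out[:, 1::2] = a + d, a - d      # undo the column-pair stage
    return out

def fuse(i1, i2):
    # Fusion rule from the text: 'maximum' for approximations,
    # 'minimum' for details.
    c1, c2 = haar_dwt2(i1), haar_dwt2(i2)
    LL = np.maximum(c1[0], c2[0])
    details = [np.minimum(x, y) for x, y in zip(c1[1:], c2[1:])]
    return haar_idwt2(LL, *details)

img1 = np.zeros((8, 8)); img1[2, 2] = 1.0   # toy stand-ins for the swath images
img2 = np.zeros((8, 8)); img2[5, 5] = 1.0
print(fuse(img1, img2).shape)               # -> (8, 8)
```

The transform pair is perfectly reconstructing, so fusing an image with itself returns the image unchanged.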


**Figure 18.** Fused Image of Swath B

**Figure 19.** Final SAR image

To assess the reduction in noise level due to the image fusion technique, we can analyze both the input images and the fused image in terms of entropy. Entropy is a good measure of the information content (uncertainty) present in the image space. Information content in SAR images after wavelet transform based image fusion is identified with the entropy value, which serves as a measure of the roughness present in the image space. Entropy is used as a metric of the noise level in non-fused and fused images. Table 2 summarizes the entropy of the input images of swath A and B as well as the fused images for different wavelet families. We observe that the Haar wavelet gives the best reduction in noise level.

The entropies of the two input images are 4.6072 and 4.6353 for swath A, and 4.5362 and 4.5606 for swath B. All fusions use the same parameters: level 5, 'maximum' for approximations, 'minimum' for details.

| Wavelet | Entropy of fused image (Swath A) | Entropy of fused image (Swath B) |
|---|---|---|
| Haar | 3.9572 | 3.8627 |
| Daubechies1 | 3.9574 | 3.8630 |
| Symlets2 | 3.9594 | 3.8696 |
| Coiflets2 | 3.9605 | 3.8723 |

**Table 2.** Entropy of fused SAR images using different wavelet families.
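The entropy values reported in Table 2 follow the standard Shannon definition over the gray-level histogram. A minimal sketch (the function name `image_entropy` is ours):

```python
import numpy as np

def image_entropy(img, bins=256):
    """Shannon entropy (in bits) of an image's gray-level histogram."""
    hist, _ = np.histogram(img, bins=bins)
    p = hist / hist.sum()
    p = p[p > 0]                      # ignore empty bins
    return float(-np.sum(p * np.log2(p)))
```

A lower entropy of the fused image relative to the inputs, as in Table 2, indicates a reduced noise level.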

#### **7. Conclusions**


An image fusion based MIMO UWB-OFDM SAR system has been presented which is able to provide wide-swath imaging. Pulse shaping is an important component in OFDM applications. As orthogonal transmission waveforms are required for the proposed MIMO OFDM SAR system, a new approach to generate OFDM waveforms is explored and investigated. It is shown that the proposed MIMO UWB-OFDM SAR indeed provides a potential solution to high-resolution remote sensing as well as wide-swath imaging. The usefulness of the developed approach has been demonstrated by fusing SAR images. Image fusion techniques provide a powerful tool to reduce clutter and certain types of noise such as AWGN, and thus can be used to enhance the quality of SAR images. The performance of the system is estimated by testing the proposed technique on SAR data acquired by multiple sensors. The results are evaluated by estimating the information flow from the input data to the output image, in terms of automatic recognition and detection of features present in the acquired images. Each SAR sensor acquires data about the inspected region using more than one frequency, and a processor that exploits the information carried by multiple frequencies is thus needed. Future work may include investigation of the proposed system by exploiting sensors that scan the inspected region from multiple heights using various platforms.

#### **Acknowledgements**

This work is funded by the National Plan for Science and Technology, Kingdom of Saudi Arabia, under project number: 08-ELE262-2.

#### **Author details**

Md Anowar Hossain\*, Ibrahim Elshafiey and Majeed A. S. Alkanhal

\*Address all correspondence to: ahossain@ksu.edu.sa

Electrical Engineering Department, King Saud University, Riyadh, Kingdom of Saudi Arabia





**Chapter 4**


## **High-Resolution and Hyperspectral Data Fusion for Classification**

Hina Pande and Poonam S. Tiwari

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/56944

### **1. Introduction**

Resolution can be defined as the fineness with which an instrument can distinguish between different values of some measured attribute. In the context of remotely sensed data, references are made to four types of resolution, i.e. spatial resolution, spectral resolution, radiometric resolution and temporal resolution. The spatial resolution refers to the area of the smallest resolvable element (e.g. pixel); spectral resolution refers to the smallest wavelength which can be detected in the spectral measurement (Lillesand and Kiefer, 2000). Technically these two types of resolution can be inter-related, so that one can be improved at the expense of the other. The information content of an image is based on the spatial and spectral resolution of an imaging system. To exploit the benefits of enhanced spatial and spectral capability, fusion techniques were developed to merge complementary information. Fusion of multispectral and panchromatic images has been carried out many times in the past by many researchers for different purposes, e.g. feature extraction and 3D modelling (building extraction, etc.). "*Image fusion is the combination of two or more images to form a new image by using a certain algorithm*" (Pohl and Genderen Van, 1998).

"The "hyper" in the hyperspectral means "over" as in "too many" and refers to the large number of measured wavelength bands" (Shippert, 2008). Hyperspectral imaging in remote sensing was a major breakthrough that opened the avenues of research in various fields like mineralogy mapping for oil exploration, environmental geology, vegetation sciences, hydrol‐ ogy, tsunami-aids, biomass estimation and many more due to its ample spectral information contained in hundreds of co-registered bands.

The fusion of hyperspectral with multispectral image results in a new image which has the spatial resolution of the high resolution image and preserves the spectral characteristics of the hyperspectral image. There are some algorithms used specifically to fuse and classify the

© 2013 Pande and Tiwari; licensee InTech. This is a paper distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

hyperspectral data with the multispectral data. Some of the algorithms are transformation based (e.g. Intensity, Hue, Saturation), wavelet decomposition, neural networks, knowledge-based image fusion, the Colour Normalised Transform (CNT), the Principal Component Transform (PCT) and the Gram-Schmidt Transform (Ali Darvishi et al., 2005). Combining hyperspectral and multispectral images can enhance the information content of the image, thus helping in geospatial data extraction. Fusion of multi-sensor image data is now a widely used procedure for complementing and enhancing information content. The present work primarily focuses on the qualitative assessment of the fused image in terms of the spatial and spectral improvement.


The main objective of the present work is the analysis of the high resolution and hyperspectral data fusion using three different approaches (Gram-Schmidt, Principal Component, and Colour Normalised Transform), analyzing the spectral variation due to fusion and its effect on classification and feature extraction.

#### **1.1. Theoretical concepts: Different fusion algorithms**

#### *1.1.1. IHS (Intensity Hue Saturation)*

According to Chen et al., 2003, in IHS transformation image fusion the Intensity (I), the spatial component, and the Hue (H) and Saturation (S), the spectral components, of an image are generated from the RGB image. The Intensity (I) component is then substituted by the high resolution panchromatic image to render a new image in RGB, which is referred to as the fused image.
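A common shortcut for the substitution step is the "fast IHS" form: with intensity taken as I = (R + G + B) / 3, replacing I with the panchromatic band reduces to adding the difference (pan - I) to every band. A sketch under that simplification (`fast_ihs_fuse` is our name, not necessarily the exact transform used by Chen et al., 2003):

```python
import numpy as np

def fast_ihs_fuse(rgb, pan):
    """Fast IHS substitution: swap the intensity component for the pan band.

    rgb: (H, W, 3) float array; pan: (H, W) float array, co-registered with rgb.
    Adding (pan - I) to each band replaces I with pan while leaving the
    band differences (hue/saturation information) unchanged.
    """
    intensity = rgb.mean(axis=2)          # I = (R + G + B) / 3
    delta = pan - intensity
    return rgb + delta[:, :, None]
```

After fusion, the mean of the three output bands at each pixel equals the panchromatic value, which is exactly the intensity substitution described above.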

#### *1.1.2. Colour Normalised Transform*

The Colour Normalised Transform is another fusion technique that uses a mathematical grouping of the colour image and a high resolution image. The Colour Normalised Transform is also known as the Energy Subdivision Transform, which employs a high resolution image to sharpen a low resolution image. This algorithm is also called the Brovey Transform. The Brovey Transform uses a formula that normalises the multispectral bands used for an RGB (Red Green Blue) display and multiplies the result by the high resolution data to add the intensity or brightness component of the image. The Brovey Transform is used to increase the contrast and intensity in the low and high ends of the histogram and to produce visually appealing images (Sanjeevi, 2006).

Brovey Transform works as:

DNf = A (w1 \* DNa + w2 \* DNb) + B

DNf = A \* DNa \* DNb + B

A and B are scaling and additive factors, respectively, and w1 and w2 are weighting parameters. DNf, DNa and DNb refer to the digital numbers of the final fused image and the input images a and b, respectively.
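With the scaling and additive factors A and B set to 1 and 0, the Brovey normalisation can be sketched as follows: each multispectral band is divided by the band sum and multiplied by the high-resolution image (`brovey_fuse` is an illustrative name):

```python
import numpy as np

def brovey_fuse(ms, pan, eps=1e-12):
    """Brovey (Colour Normalised) transform sketch.

    ms:  (H, W, B) multispectral array
    pan: (H, W) high-resolution (panchromatic) array, co-registered with ms
    Each band is normalised by the per-pixel band sum, then multiplied by pan,
    so the pan brightness is redistributed according to the band ratios.
    """
    total = ms.sum(axis=2) + eps          # eps guards against division by zero
    return ms * (pan / total)[:, :, None]
```

By construction, the fused bands sum to the panchromatic value at each pixel, which is why the method preserves band ratios (colour) while injecting the high-resolution brightness.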

#### *1.1.3. Wavelets-Transform image fusion*


According to Gomez et al., 2001, the wavelet concept is utilized to fuse two spectral levels of a hyperspectral image with one band of a multispectral image. Wavelets generally mean "waves". Image fusion by the wavelet-based method involves two processing steps. The first step consists of extracting the details or structures: the extracted structures are decomposed into three wavelet coefficients based upon direction, that is, the vertical, horizontal and diagonal. Thus, in combining the high resolution image with a low-resolution image, the high-resolution image is first reference stretched three times, each time to match one of the low-resolution band histograms. The second step necessitates the introduction of these structures/details into each low-resolution image band through the inverse wavelet transform. Thus, the spectral content from the low-resolution band image is preserved, because only the scale structures between the two different resolution images are added. (Sanjeevi, 2008)
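The two steps described above (extract the high-resolution structures, then add them back into each low-resolution band) can be sketched for one Haar level. Reconstructing an image with its Haar details zeroed is equivalent to replacing each 2x2 block with its mean, so the "structures" are simply the residual. The function names are ours, and the histogram-matching step is omitted for brevity:

```python
import numpy as np

def block_mean_upsampled(img, k=2):
    """One-level Haar approximation reconstructed with details zeroed:
    equivalent to replacing each k-by-k block with its mean."""
    h, w = img.shape
    means = img.reshape(h // k, k, w // k, k).mean(axis=(1, 3))
    return np.kron(means, np.ones((k, k)))

def wavelet_detail_inject(low, high):
    """Add the high-resolution image's fine structures (its Haar detail part)
    to the low-resolution band; the band's block means, i.e. its spectral
    content at the coarse scale, are preserved."""
    return low + (high - block_mean_upsampled(high))
```

Injecting from a flat (structure-free) image leaves the band untouched, and the coarse-scale content of the band is unchanged regardless of the high-resolution input, matching the claim that only scale structures are added.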

#### *1.1.4. Gram-Schmidt Transform*

Aiazzi et al., 2006 described the Gram-Schmidt Transform (GST), another fusion algorithm used to fuse a multispectral image with a panchromatic image. The Gram-Schmidt Transform was invented by Brower and Laben in 1998 and patented by Eastman Kodak. This algorithm works in two modes: "mode1" and "mode2". "Mode1" takes the pixel average of the multispectral (MS) bands. The spatial quality in "mode1" is better, but it suffers from spectral distortions due to the radiometric difference between the average of the MS bands and the panchromatic image. In "mode2" the spectral distortions are not present, but the result suffers from poor enhancement and low sharpness.

#### *1.1.5. Principal Component Transform*

The Principal Component Transform (PCT) is used to enhance a low resolution image using high resolution data. The PC band1 is replaced with a high resolution band, which is scaled to match the PC band1. Hence, there is almost no distortion of the spectral information in the fused output image.

The mathematical operation applies a linear transformation, based on an image-specific matrix, as follows:

PC = Wpc \* DN

where *Wpc* = transformation matrix, *PC* = transformed data (uncorrelated) and *DN* = original data.
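The PC-substitution step can be sketched as follows: derive Wpc from the band covariance, rotate (PC = Wpc \* DN), swap the first component for the pan band scaled to match PC1's statistics, and invert the (orthogonal) transform. A minimal numpy sketch, where the function name `pct_fuse` and the mean/std matching step are our assumptions:

```python
import numpy as np

def pct_fuse(ms, pan):
    """PCT fusion sketch: PC = Wpc * DN, replace PC1 with the scaled
    high-resolution band, then invert the transform."""
    h, w, b = ms.shape
    dn = ms.reshape(-1, b).T                   # DN: (bands, pixels)
    mean = dn.mean(axis=1, keepdims=True)
    vals, vecs = np.linalg.eigh(np.cov(dn))
    wpc = vecs[:, np.argsort(vals)[::-1]].T    # rows are PCs, PC1 first
    pc = wpc @ (dn - mean)                     # PC = Wpc * DN (mean-centred)
    p = pan.reshape(-1)
    # scale pan to match the mean and spread of PC1 before substituting it
    p = (p - p.mean()) / (p.std() + 1e-12) * pc[0].std() + pc[0].mean()
    pc[0] = p
    return (wpc.T @ pc + mean).T.reshape(h, w, b)   # Wpc orthogonal: inverse = transpose
```

Because only PC1 (the brightness-dominated component) is replaced and the remaining components are untouched, the band means are essentially preserved, which reflects the low spectral distortion claimed for PCT.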

#### **2. Literature review**

Pohl and Genderen Van, 1998 proposed that image fusion is a tool to combine multisource imagery using advanced image processing techniques. According to Pohl and Genderen Van, 1998, the main objectives of image fusion are to sharpen images, improve geometric corrections, enhance certain features that are not visible in either of the images, replace defective data, complement the data sets for improved classification, detect changes using multitemporal data and substitute the missing information in one of the images with signals from another source image.

assessment and mitigation of these hazards in the area. The results of the fusion of the AVIRIS and TOPSTAR data show better enhancement in the urban features. The spectral resolution of the AVIRIS data helped in better discriminating among various urban features like the buildings and the mining tailings. The MNF-transformed bands of the AVIRIS data also improved the discriminability among the various features. The combined use of the HIS fused data, the MNFtransformed bands and the DEM of the area provided for better understanding

High-Resolution and Hyperspectral Data Fusion for Classification

http://dx.doi.org/10.5772/56944

61

Ling et al., 2006 has analysed the results of fusing the high resolution data like the IKONOS and Quickbird using the Fast Fourier Transform - enhanced IHS method. The study aimed at evaluating the ability of the traditional methods like the HIS and the PCA (Principal Compo‐ nent Analysis) in fusing the high resolution data to preserve the colour and spectral informa‐ tion in the fused product. The study integrated the IHS transform with the FFT filtering of both the panchromatic and the intensity component of the multispectral image. The study has been done using the IKONOS and the Quickbird data. The analysis prove that the HIS transform using the FFT filtering improved the results in preserving the high spatial quality and the

Hyperion is an EO-1 (Earth Observation-1) sensor which was developed under NASA's new millennium program in November, 2000. The level 1 product used in the present study has 242 bands in the range of 355-2577 nm at 10 nm bandwidth (Table 1). Out of these 242 bands

IKONOS was the first commercial high resolution satellite to be positioned into the orbit. The IKONOS (MSS) image has 4 bands (red, green, blue, NIR) with 4m spatial resolution and

For the present study datasets of two areas were selected- Dehradun and Udaipur city area.

The city of Dehradun is situated in the south central part of Dehradun district. Dehradun city lies at 30°19' N and 78°20' E. The city is located at an altitude of 640 m above MSL. The lowest altitude is 600 m in the southern board is 38.04 sq. Km. The highest altitude is 1000 m in the northern part of the city. The site where the city is located slopes gently from north to south direction. The northern part of the region is heavily dissected by a number of seasonal streams

**1.** The western portion is dominated with varied vegetation of Sal, Teak, Bamboo, etc.

only 198 bands are calibrated. The bands which are not calibrated are set to zero.

IKONOS (Pan) has one band (.4-.9 µm) with 1m spatial resolution (Table 1).

(Fig 1a). The study strip can be divided into two distinct land cover classes:

of the urban features.

spectral characteristics.

**3.1. Hyperion**

**3.3. Study area**

**3. Data used and study area**

**3.2. IKONOS (MSS & Panchromatic)**

According to Kasetkasem, Arora and Varshney (2004), merging methods are often divided into two categories: first method simultaneously takes into account all bands in the merging process e.g. Hue- Saturation-Value transformation, Principle-Component transformation, Gram-Schmidt transformation technique; the second category deal separately with the spatial information and each spectral band e.g. Brovey transformation, High-Pass-Filter transforma‐ tion technique.

Ali Darvishi et al., 2005 analysed the capability of the two algorithms that is Gram-Schmidt and the Principal Component transform in the spectral domain. For this purpose two datasets have been taken (Hyperion/ Quickbird-MS and Hyperion/ Spot-Pan). The main objective of the study was the investigation of the two algorithms in the spectral domain and the statistical interpretation of the fused images with the raw Hyperion. The study area was Central Sulawesi in Indonesia. The results of the fusion show that the GST and PCT has almost similar ability in protecting the statistics as compared to the raw Hyperion. The correlation analysis show poor correlation between the raw Hyperion and the fused image bands. The results of the analysis show that the bands located in the high-frequency area of the spectrum better preserve the statistics as compared to the bands located in the low-frequency region. Different statistical parameters like the standard deviation, mean, median, and mode, maximum, minimum values of the raw Hyperion and the two fused images (GST & PCT) were compared for the analysis.

Gomez et al., 2001 has studied the fusion of the hyperspectral data with the multispectral data using the Wavelet-based image fusion. In the present study, two levels of hyperspectral data were used in fusion with one band of multispectral data. The fused image obtained had a RMSE (Root Mean Square Error) of 2.8 per pixel with a SNR (Signal to Noise Ration) of 36 dB. The results show that the fusion of hyperspectral data with the multispectral data produced a composite image of high spatial resolution of the multispectral data with all the spectral characteristics of the hyperspectral data with minimum artifacts. The study concluded that more than two datasets can be fused using the Wavelet transform image fusion technique.

Chen et al., 2003 carried out a study which took the hyperspectral data, AVIRIS ( Airborne Visible/ Infrared Imaging Spectrometer) to fuse with TOPSTAR ( Topographic Synthetic Aperture Radar) which provides the textural information to get a composite image to study the urban scene. The study has been conducted for the urban area of Park city, Utah. The composite image obtained has been superimposed on the DEM (Digital Elevation Model) generated from the TOPSTAR data to get a 3D perspective. The transformed image obtained was interpreted for the visual discrimination among various urban types. This was possible after fusion of AVIRIS and TOPSTAR data using IHS (Intensity Hue Saturation) transform, which resulted in an image having high spatial and spectral resolution. The objective of the study was to study the areas which are at a risk due to the geological hazards like the ava‐ lanches, mudflows etc. The fused image was interpreted for information extraction for assessment and mitigation of these hazards in the area. The results of the fusion of the AVIRIS and TOPSTAR data show better enhancement in the urban features. The spectral resolution of the AVIRIS data helped in better discriminating among various urban features like the buildings and the mining tailings. The MNF-transformed bands of the AVIRIS data also improved the discriminability among the various features. The combined use of the HIS fused data, the MNFtransformed bands and the DEM of the area provided for better understanding of the urban features.

Ling et al., 2006 has analysed the results of fusing the high resolution data like the IKONOS and Quickbird using the Fast Fourier Transform - enhanced IHS method. The study aimed at evaluating the ability of the traditional methods like the HIS and the PCA (Principal Compo‐ nent Analysis) in fusing the high resolution data to preserve the colour and spectral informa‐ tion in the fused product. The study integrated the IHS transform with the FFT filtering of both the panchromatic and the intensity component of the multispectral image. The study has been done using the IKONOS and the Quickbird data. The analysis prove that the HIS transform using the FFT filtering improved the results in preserving the high spatial quality and the spectral characteristics.

According to Van (1998), the main objectives of image fusion are to sharpen images, improve geometric corrections, enhance features that are not visible in either of the images, replace defective data, complement the data sets for improved classification, detect changes using multitemporal data, and substitute the missing information in one image with signals from another source image.

According to Kasetkasem, Arora and Varshney (2004), merging methods are often divided into two categories: the first simultaneously takes into account all bands in the merging process (e.g. Hue-Saturation-Value transformation, Principal Component transformation and the Gram-Schmidt transformation technique); the second deals separately with the spatial information and each spectral band (e.g. Brovey transformation and the High-Pass-Filter transformation technique).

Ali Darvishi et al. (2005) analysed the capability of two algorithms, the Gram-Schmidt transform (GST) and the Principal Component transform (PCT), in the spectral domain. Two datasets were used (Hyperion/QuickBird-MS and Hyperion/SPOT-Pan). The main objective of the study was the investigation of the two algorithms in the spectral domain and the statistical comparison of the fused images with the raw Hyperion. The study area was Central Sulawesi in Indonesia. The results of the fusion show that GST and PCT have an almost similar ability to preserve the statistics of the raw Hyperion, while the correlation analysis shows poor correlation between the raw Hyperion and the fused image bands. Bands located in the high-frequency area of the spectrum preserve the statistics better than bands located in the low-frequency region. Statistical parameters such as the standard deviation, mean, median, mode, maximum and minimum values of the raw Hyperion and the two fused images (GST and PCT) were compared for the analysis.

Gomez et al. (2001) studied the fusion of hyperspectral data with multispectral data using wavelet-based image fusion. In that study, two levels of hyperspectral data were fused with one band of multispectral data. The fused image had an RMSE (Root Mean Square Error) of 2.8 per pixel and an SNR (Signal to Noise Ratio) of 36 dB. The results show that the fusion produced a composite image with the high spatial resolution of the multispectral data and all the spectral characteristics of the hyperspectral data, with minimum artifacts. The study concluded that more than two datasets can be fused using the wavelet transform image fusion technique.

60 New Advances in Image Fusion

#### **3. Data used and study area**

#### **3.1. Hyperion**

Hyperion is a sensor on EO-1 (Earth Observing-1), launched under NASA's New Millennium Program in November 2000. The Level 1 product used in the present study has 242 bands in the range of 355-2577 nm at 10 nm bandwidth (Table 1). Of these 242 bands, only 198 are calibrated; the bands which are not calibrated are set to zero.

#### **3.2. IKONOS (MSS & Panchromatic)**

IKONOS was the first commercial high resolution satellite to be positioned in orbit. The IKONOS MSS image has 4 bands (red, green, blue, NIR) with 4 m spatial resolution, and IKONOS Pan has one band (0.4-0.9 µm) with 1 m spatial resolution (Table 1).

#### **3.3. Study area**

For the present study, datasets of two areas were selected: the Dehradun and Udaipur city areas.

The city of Dehradun is situated in the south central part of Dehradun district, at 30°19' N and 78°20' E, at an altitude of 640 m above MSL. The lowest altitude is 600 m in the southern part, and the city covers an area of 38.04 sq. km; the highest altitude is 1000 m in the northern part of the city. The site where the city is located slopes gently from north to south. The northern part of the region is heavily dissected by a number of seasonal streams (Fig 1a). The study strip can be divided into two distinct land cover classes:

**1.** The western portion is dominated by varied vegetation of Sal, Teak, Bamboo, etc.


**2.** The southern part with the urban area and some patches of vegetation.

High-Resolution and Hyperspectral Data Fusion for Classification

http://dx.doi.org/10.5772/56944

63


The urban pattern in Dehradun city is rather scattered and irregular. The northern part again consists of varied LULC classes like crop fields, fallow land, urban areas, grassland, shrubs and mixed vegetation. A seasonal river named Tons flows from north-east to south-west. Geomorphically, the northern part of the region is occupied by a piedmont fan of post-Siwalik Dun gravels called the Donga fan, a region in Dehradun city that consists of the varied LULC classes.

Udaipur, Rajasthan, India has been selected as the second study area. Rajasthan is one of the mineral-rich states of India; this north-western state occupies a place of pride in the production and marketing of metallic and non-metallic minerals in India. The Aravalli range, one of the oldest mountain ranges of the world, runs along the NE-SW direction for more than 720 km, covering nearly 40,000 km². The study area (longitude 73° 32′ 58″ to 73° 49′ 35″ E and latitude 24° 08′ 18″ to 24° 59′ 53″ N) covers about 750 km² of this main block of the Aravalli range, corresponding to path/row 146/40 of a full Hyperion scene (Fig 1b).

**Figure 1.** Study area (a) Dehradun (b) Udaipur

#### **4. Methodology**

The methodology was chosen to analyze the performance of hyperspectral and high-resolution data fusion for classification. The major objective of the study was to comparatively evaluate three algorithms, the GS (Gram-Schmidt), PC (Principal Component) and CN (Colour Normalised) transforms, for the fusion of Hyperion data with high spatial resolution IKONOS MSS data. The fused images were analysed for the pros and cons of the spectral domain image fusion models. For analyzing the spectral variation due to fusion, major land cover areas were identified, and the original Hyperion spectra over these land cover areas were compared visually with the fused spectra over the same areas. The analysis was carried out visually and statistically by comparing spectral profiles of different features with the original Hyperion profiles (Fig 2). Overall classification accuracy has been used to evaluate the Hyperion data, the multispectral IKONOS data and the fused data for the two study areas. The methodology is divided into three broad steps (Fig 2):
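Overall classification accuracy, the evaluation measure named above, is simply the fraction of reference pixels whose predicted class matches the ground truth. A minimal sketch with hypothetical labels:

```python
import numpy as np

def overall_accuracy(reference, classified):
    """Fraction of reference pixels assigned the correct class label."""
    reference = np.asarray(reference)
    classified = np.asarray(classified)
    return float(np.mean(reference == classified))

# six hypothetical pixels: four of them classified correctly
acc = overall_accuracy([1, 1, 2, 2, 3, 3], [1, 2, 2, 2, 3, 1])  # 4/6
```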

#### **4.1. Pre-processing stage**


**(a) Hyperion**

| Parameter | Value |
|---|---|
| Sensor altitude | 705 km |
| Spatial resolution | 30 m |
| Radiometric resolution | 16 bit |
| Swath | 7.2 km |
| IFOV | 0.043 mrad |
| No. of rows | 256 |
| No. of columns | 3128 |
| VNIR spectral range | 0.45-1.35 µm |
| SWIR spectral range | 1.40-2.48 µm |

**(b) IKONOS (XS and PAN)**

| Parameter | Value |
|---|---|
| Sensor | Optical Sensor Assembly |
| Altitude | 681 km |
| Inclination | 98.2° |
| Repeat cycle | 14 days |
| Revisit time | 1-3 days |
| Swath width | 11 km |
| Off-nadir viewing | ± omnidirectional |
| Spatial resolution | 1 m (PAN), 4 m (MSS) |
| Spectral bands (µm) | 0.45-0.52, 0.52-0.60, 0.63-0.69, 0.76-0.90, 0.45-0.90 (PAN) |

**Table 1.** Technical Specification of the (a) Hyperion and (b) IKONOS (XS and PAN) sensors

The Hyperion Level 1R product used had many bad lines and columns in different bands, so radiometric correction for removal of bad columns was performed by calculating the average of the DN values of the adjacent columns. Atmospheric correction techniques have been developed to allow the retrieval of pure ground radiances from the target materials; haziness in the atmosphere reduces the radiation from the Sun reaching the Earth's surface, causing blurriness in the image. For this reason, atmospheric correction of the Hyperion image was considered important in the present study. In the present work, the FLAASH model in ENVI 4.5, a first-principles atmospheric correction modelling tool, was chosen for retrieving spectral reflectance from Hyperion. The spectral subsets for the Hyperion data have been created in the same wavelength range as that of IKONOS, i.e. 400-900 nm, so band numbers 12 to 55 are used. In total, the Hyperion image file has thus been reduced from the resized 117 bands to 36 bands. Co-registration of the Hyperion image has been done with IKONOS MSS; the RMS error of the registration process was about 0.823 pixels.
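The bad-column repair step described above can be sketched as follows. This is a generic illustration of averaging the adjacent columns, assuming isolated bad columns; it is not the exact routine used in the study.

```python
import numpy as np

def repair_bad_columns(band, bad_cols):
    """Replace each flagged column with the mean DN of its immediate neighbours.

    band     : 2-D array (rows x columns) of digital numbers for one band
    bad_cols : column indices flagged as bad (assumed isolated from each other)
    """
    fixed = band.astype(float).copy()
    last = band.shape[1] - 1
    for c in bad_cols:
        left = fixed[:, max(c - 1, 0)]
        right = fixed[:, min(c + 1, last)]
        fixed[:, c] = (left + right) / 2.0   # average of the adjacent columns
    return fixed

# toy band with one dead column (index 1)
band = np.array([[10, 0, 30],
                 [40, 0, 60]])
repaired = repair_bad_columns(band, [1])   # column 1 becomes [20, 50]
```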

**Figure 2.** Methodology for comparative evaluation of fusion algorithms on classification accuracy


#### **4.2. Image fusion**

Image fusion is the combination of two or more images using an algorithm to acquire a composite image with better and enhanced spatial and spectral information. After developing the spectral subsets for the Hyperion image, its individual (R, G, B and NIR) bands were fused with the R, G, B and NIR bands of the IKONOS data using the Principal Component Transformation, Gram-Schmidt Transformation (GST) and Colour Normalized Transformation. The merging of the spectral subsets of the Hyperion image file (R, G, B and NIR bands) with the IKONOS bands (R, G, B and NIR) produced four separate fused images. These were then stacked to get one single 36-band image which achieved the spatial resolution of IKONOS and the spectral characteristics of the Hyperion image (Fig 3 & 4).
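Of the three transforms compared, the principal-component substitution scheme is the most compact to sketch: rotate the multispectral cube into principal components, replace the first component with the statistics-matched high-resolution band, and invert the rotation. The code below is a generic illustration with synthetic data, not the ENVI implementation used in the study.

```python
import numpy as np

def pc_fuse(ms, pan):
    """Principal-component substitution pan-sharpening (generic sketch).

    ms  : (rows, cols, n_bands) multispectral cube resampled to the pan grid
    pan : (rows, cols) high-resolution band
    """
    rows, cols, n = ms.shape
    X = ms.reshape(-1, n).astype(float)
    mean = X.mean(axis=0)
    Xc = X - mean
    # eigen-decomposition of the band covariance matrix, largest eigenvalue first
    vals, vecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    vecs = vecs[:, np.argsort(vals)[::-1]]
    pcs = Xc @ vecs                              # forward PC transform
    p = pan.reshape(-1).astype(float)
    # match the pan band's statistics to PC1 before substituting it
    p = (p - p.mean()) / p.std() * pcs[:, 0].std() + pcs[:, 0].mean()
    pcs[:, 0] = p
    return (pcs @ vecs.T + mean).reshape(rows, cols, n)  # inverse transform

rng = np.random.default_rng(1)
ms = rng.random((8, 8, 4))
pan = rng.random((8, 8))
fused = pc_fuse(ms, pan)
```

Because the substituted component is matched to PC1's mean and variance, the per-band means of the fused cube equal those of the input, which is why PC substitution tends to preserve overall radiometry.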



**(a) CN Fused Image (b) PC Fused Image (c) GST Fused Image**

**Figure 3.** Hyperion and IKONOS MSS fused images for part of Dehradun area (a: fusion using CN Transform, b: fusion using PC Transform, c: fusion using GS Transform)

**(a) CN Fused Image (b) PC Fused Image (c) GST Fused Image**

**Figure 4.** Hyperion and IKONOS MSS fused images for part of Udaipur area (a: fusion using CN Transform, b: fusion using PC Transform, c: fusion using GS Transform)

#### **4.3. Spectra comparison**

The spectral profiles of the various land cover classes present in the scene, such as vegetation, bare soil, crop land and fallow land, in the three fused products have been compared with the Hyperion profiles.

#### *4.3.1. Vegetation*

In the Hyperion image, we observe a slow initial rise in the curve starting from a wavelength of 500 nm up to a value of more than 250, and a short peak in the blue region. At a wavelength of 700 nm there is a sharp rise in the curve, which reaches a value of 2000, followed by some small peaks in the NIR region, which establishes that vegetation is best discriminated in this region. In the CN fused image, the spectral profile is almost similar, with one remarkable difference in the reflectance value: the vegetation here shows a rise in reflectance value only up to 450-750. In the PC fused image, the results differ from both the Hyperion and the CN fused image. The spectral profile of the GS fused image is almost similar to that of the PC fused image (Fig 5).

**Figure 5.** Spectral profiles for Sal Vegetation
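The profile comparisons in this section are visual; a simple numerical check of spectral fidelity between an original Hyperion pixel spectrum and its fused counterpart can be sketched with the Pearson correlation and the spectral angle. Both are standard fidelity measures; the spectra below are synthetic.

```python
import numpy as np

def spectral_similarity(ref, test):
    """Pearson correlation and spectral angle (radians) between two pixel spectra."""
    ref = np.asarray(ref, dtype=float)
    test = np.asarray(test, dtype=float)
    corr = float(np.corrcoef(ref, test)[0, 1])
    cos = ref @ test / (np.linalg.norm(ref) * np.linalg.norm(test))
    angle = float(np.arccos(np.clip(cos, -1.0, 1.0)))
    return corr, angle

# synthetic spectra: the fused spectrum is a scaled copy of the original,
# so the spectral shape is preserved although the absolute values differ
hyperion_pixel = np.array([0.10, 0.12, 0.35, 0.60, 0.58])
fused_pixel = 0.5 * hyperion_pixel
corr, angle = spectral_similarity(hyperion_pixel, fused_pixel)
```

A correlation near 1 and an angle near 0 indicate that fusion preserved the spectral shape, which matches the behaviour described for the CN product above (similar profile, different reflectance range).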



**Figure 6.** Spectral profiles for Bulding with terracotta roof

The spectral profile of the bare soil shows variations in the Hyperion image. The curve rises with a high slope and one can observe short peaks at wavelengths of 580 nm, 680 nm and highest rise at 820 nm. The rise in the curve is not uniform as there are some dumpy peaks and dips in the curve. Only, between blue and the green region, the rise is somewhat linear with one peak at 580 nm. In between the green and the red region, one enhanced peak can be observed at 680 nm but near to the red region at a value of about 3360 there is a flat dip. In the NIR region, some undulations are present in the curve with a peak at a value of about 3680 at 820 nm. In the CN fused image, one can observe a number of dips and peaks in the curve. The curve initially rises and meets a pointed peak with a value near to 500 at 515 nm. Then, in between the blue and the green region, the curve rises with a steep slope with two peaks at a value of 612.5 and 662.5 at 580 nm and 650 nm. There is a small dip after the green end. In between the green and the red region, there is a peak at a value of about 637.5 at 680 nm. After the rise, there is a flat dip at the red region but after this dip the curve rises again in the NIR region with some flat peaks and a sharp dip at a value of 637.5 at 875 nm wavelength. In the PC and the GS fused image, the range of the reflectance value is from 2000-4500. The spectral profile of the bare soil in the PC fused image shows some remarkable outcomes. The curve first rises slowly till the green end and then there is a sudden and sharp fall in the curve till it reaches a value below 2000 at 630 nm. The dip in the curve remains constant till it reaches 2000

High-Resolution and Hyperspectral Data Fusion for Classification

http://dx.doi.org/10.5772/56944

67

*4.3.3. Bare soil*

**Figure 5.** Spectral profiles for Sal Vegetation

#### *4.3.2. Building*

In the Hyperion image, the curve starts rising slowly from the blue region up to a value of 1000 and then suddenly at the green edges the slope of the curve increases and there is a linear rise in the curve up to a value of 2500. In the NIR end small peaks are observed. Similar sort of consequences are observed in the CN fused image but the value is limited only up to 800. The results of the PC fused image are somewhat different. The building feature seen above is enhanced with values ranging more than 4000. The rise starts from the blue region that reaches to a peak at 1625. In between green and the red region of the spectrum, the rise is continuous. The curve is flattened at a value of 1500 but suddenly the curve rises at 680 nm with a steep slope that reaches a maximum value of 3900. Then after the red region, small peaks are observed. The spectrum observed for the GS fused image is same as the PC fused image (Fig 6).

**Figure 6.** Spectral profiles for Bulding with terracotta roof

#### *4.3.3. Bare soil*

blue region. At a wavelength of 700 nm there is a sharp rise in the curve that reaches to a value of 2000 and then there exists some small peaks in the NIR region which establishes that vegetation is best discriminated in this region. In the CN fused image, the outcomes of the spectral profile is almost similar with only one remarkable difference that is in the reflectance value. The vegetation here shows a rise in reflectance value only up to 450-750. In the PC fused image, the results are different with respect to Hyperion and CN fused image. The spectral

In the Hyperion image, the curve starts rising slowly from the blue region up to a value of 1000 and then suddenly at the green edges the slope of the curve increases and there is a linear rise in the curve up to a value of 2500. In the NIR end small peaks are observed. Similar sort of consequences are observed in the CN fused image but the value is limited only up to 800. The results of the PC fused image are somewhat different. The building feature seen above is enhanced with values ranging more than 4000. The rise starts from the blue region that reaches to a peak at 1625. In between green and the red region of the spectrum, the rise is continuous. The curve is flattened at a value of 1500 but suddenly the curve rises at 680 nm with a steep slope that reaches a maximum value of 3900. Then after the red region, small peaks are observed. The spectrum observed for the GS fused image is same as the PC fused image (Fig 6).

profile of the GS fused image is almost similar to the PC fused image (Fig 5).

**Figure 5.** Spectral profiles for Sal Vegetation

*4.3.2. Building*

66 New Advances in Image Fusion

*4.3.3. Bare soil*

The spectral profile of the bare soil shows variations in the Hyperion image. The curve rises with a high slope, and one can observe short peaks at wavelengths of 580 nm and 680 nm, with the highest rise at 820 nm. The rise in the curve is not uniform, as there are some bumpy peaks and dips. Only between the blue and the green region is the rise somewhat linear, with one peak at 580 nm. Between the green and the red region, one enhanced peak can be observed at 680 nm, but near the red region, at a value of about 3360, there is a flat dip. In the NIR region, some undulations are present in the curve, with a peak at a value of about 3680 at 820 nm. In the CN fused image, one can observe a number of dips and peaks in the curve. The curve initially rises and meets a pointed peak with a value near 500 at 515 nm. Then, between the blue and the green region, the curve rises with a steep slope, with two peaks at values of 612.5 and 662.5 at 580 nm and 650 nm. There is a small dip after the green end. Between the green and the red region, there is a peak at a value of about 637.5 at 680 nm. After the rise, there is a flat dip at the red region, but after this dip the curve rises again in the NIR region, with some flat peaks and a sharp dip at a value of 637.5 at 875 nm. In the PC and the GS fused images, the range of reflectance values is 2000-4500. The spectral profile of the bare soil in the PC fused image shows some remarkable outcomes. The curve first rises slowly till the green end, and then there is a sudden and sharp fall in the curve till it reaches a value below 2000 at 630 nm. The dip in the curve remains constant till it reaches 2000 at 690 nm, but after that the curve rises sharply with a steep slope till a value of about 4750 at the red end. After the red end, in the NIR region, the curve runs almost flat with a little dip at 4500 at 875 nm.
The result of the spectral profile of the bare soil in the GS fused image is almost similar to the profile in the PC fused image, with minor differences at certain points. Initially, the curve rises and a small peak is encountered at a value of about 2250 at a wavelength of 520 nm. The curve rises again in the same way as in the PC fused image. The curve runs almost flat below a value of about 2000 between 630-695 nm. After this, the curve rises sharply with a steep slope till the red region. In the NIR region, the outcomes are almost comparable to those of the PC fused image (Fig 7).
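The profile comparisons in this section come from reading the same pixel location across all bands of each image. A minimal numpy sketch of how such a spectral profile could be extracted is shown below; the cube layout (bands first) and the toy values are assumptions for illustration, not the chapter's data.

```python
import numpy as np

def spectral_profile(cube, row, col):
    """Return the reflectance values across all bands at one pixel.

    cube: hyperspectral image as a (bands, rows, cols) numpy array.
    """
    return cube[:, row, col]

# Toy 5-band, 4x4 cube standing in for a Hyperion subset (illustrative only).
rng = np.random.default_rng(0)
cube = rng.integers(1500, 4500, size=(5, 4, 4))

profile = spectral_profile(cube, 2, 1)
print(profile.shape)  # one value per band -> (5,)
```

Plotting such a profile against the band-centre wavelengths gives curves like those in Figures 5-10.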


High-Resolution and Hyperspectral Data Fusion for Classification

http://dx.doi.org/10.5772/56944



**Figure 7.** Spectral profiles for Bare soil

#### *4.3.4. Fallow land*

In the Hyperion image, we observe a continuous rise in the curve up to a value of 3625. For fallow land, the rise starts from a wavelength of 500 nm, and the slope of the curve is not very steep. The rise in the curve is continuous, with no noteworthy dips. In the CN fused image, the behaviour is similar to that in the Hyperion image, but there is a difference in the range of reflectance values up to which the curve rises. The range of values in the CN fused image is limited to 400. In the PC fused image, the curve starts rising from a wavelength of 500 nm up to 750 nm, and then there is a small dip. The dip is not remarkable, and then the curve rises again. The curve rises linearly with a steep slope between the green and the red region, up to a value of 2500, and then in the NIR region (beyond 750 nm) small pronounced peaks are present. The spectral profile of the fallow land in the GS fused image is almost comparable to that in the PC fused image (Fig 8).

**Figure 8.** Spectral profiles for Fallow Land

#### *4.3.5. Dry river*


The spectral profile of the river in the Hyperion image shows many undulations, i.e. a number of peaks and dips are observed. The curve rises sharply until it reaches a value of about 1820 at 520 nm. At 520 nm there is a pointed peak, and then there is a small dip at about 530 nm. Again, the curve rises till it reaches a value of about 2050 at 590 nm. Then the curve suddenly falls until it reaches a value of about 1900 at 665 nm. After the fall, the curve starts rising again until it reaches the red end. In the NIR region, the curve has two pointed peaks, at 775 nm and 600 nm. After this the curve again drops to a value of about 1965 at 890 nm. In the CN fused image, the curve rises gently up to low values. There is a flattened peak in the blue region, and after the blue region the curve drops down and runs almost flat till 690 nm. After 690 nm the curve rises sharply till the red region is encountered at 660 nm. In the NIR region, the curve runs almost flat with wide contiguous bands. In the PC fused image, the curve starts at a value of about 1500 in the blue region. The curve runs parallel to the ground till a value of about 1750 at 620 nm. After this the curve suddenly drops, and the dip encountered in the region between the blue and the red end is almost flat. At 690 nm, there is suddenly a steep rise in the curve till it reaches a value of about 3375 at 775 nm. After this, in the NIR region, the curve again runs flat with small flattened peaks. The spectral profile of the river in the GS fused image is almost similar to the profile in the PC fused image (Fig 9).


**Figure 9.** Spectral profiles for Dry River Bed

#### *4.3.6. Land with grass*

In the Hyperion image, the curve rises slowly with an almost flat slope till it reaches a value of 1500 at a wavelength of 700 nm, but after that the curve rises linearly with a steep slope. This slope lies between the green and the red region. At the red region, this sharp rise slows down, and then in the NIR region the curve rises slowly with one enhanced peak at 4500 at a wavelength of 875 nm. In the CN fused image, the curve rises slowly with a shallow slope till it reaches 475 at a wavelength of 775 nm, up to the red region of the spectrum. In the NIR region, we observe some peaks in the curve. In the PC fused image, the curve shows some dips. Initially, the curve rises slowly till 1350 at a wavelength of approximately 610 nm; then, after the green end, the curve shows some variations. After the green end, the curve rises suddenly with a high slope till it reaches a value of 2750 at a wavelength of 680 nm. After 680 nm, the curve shows a decrease in values till it reaches 2375 at the red region. In the NIR region, the curve again rises with some small peaks. The spectral profile of the ground with grass in the GS fused image shows almost the same outcomes as the PC fused image (Fig 10).

**Figure 10.** Spectral profiles for Ground with grass

#### **4.4. Classification**


In the context of the present work, the original and the three fused datasets were classified by the Spectral Angle Mapper (SAM) method of supervised classification. SAM is an automated method for directly comparing image spectra to a reference or an endmember. It treats both spectra as vectors and calculates the spectral angle between them. The method is insensitive to illumination, since the SAM algorithm uses only the vector direction and not the vector length. The result of the SAM classification is an image showing the best match at each pixel. The selection of the classification algorithm was also based on the characteristics of the image and the training data. The SAM decision rule classified the image into 9 classes, i.e. vegetation type 1, vegetation type 2, river, shrubs, urban features, grassland, fallow land, bare soil, and crops.
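The spectral-angle computation that SAM relies on can be sketched as follows. The 0.1 radian rejection threshold mirrors the one mentioned later for unclassified pixels; the endmember spectra and class names are made-up illustrations, not the chapter's training data.

```python
import numpy as np

def spectral_angle(pixel, reference):
    """Angle in radians between a pixel spectrum and a reference (endmember) spectrum.

    Using only the vector direction makes the measure insensitive to
    illumination, which scales vector length but not direction.
    """
    cos = np.dot(pixel, reference) / (np.linalg.norm(pixel) * np.linalg.norm(reference))
    return np.arccos(np.clip(cos, -1.0, 1.0))

def sam_classify(pixel, endmembers, max_angle=0.1):
    """Assign the class whose endmember makes the smallest angle with the pixel.

    Pixels whose best angle exceeds max_angle (radians) stay unclassified
    (None), matching the rejection behaviour described in the text.
    """
    angles = {name: spectral_angle(pixel, em) for name, em in endmembers.items()}
    best = min(angles, key=angles.get)
    return best if angles[best] <= max_angle else None

# Hypothetical 5-band endmember spectra (illustrative values only).
endmembers = {
    "vegetation": np.array([0.05, 0.08, 0.06, 0.45, 0.50]),
    "bare soil":  np.array([0.20, 0.25, 0.30, 0.35, 0.38]),
}

# A brighter version of the vegetation spectrum: same direction, longer vector.
pixel = 2.0 * np.array([0.05, 0.08, 0.06, 0.45, 0.50])
print(sam_classify(pixel, endmembers))  # -> vegetation (angle is 0 despite scaling)
```

Because the angle depends only on direction, scaling the pixel spectrum by any positive factor leaves the classification unchanged, which is what the text means by insensitivity to illumination.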

After the classification was performed, the classification accuracy was computed for the IKONOS, Hyperion, and the three merged images (Fig 11 & 12). Samples of each class from different locations in the Dehradun and Udaipur cities were collected for accuracy assessment.

**Figure 11.** Classified product for Hyperion and IKONOS MSS fused images for part of the Dehradun area (a: fusion using CNT, b: fusion using PCT, c: fusion using GST)

**5. Results and discussions**


| Data Product | Overall Accuracy Achieved (Dehradun) | Overall Accuracy Achieved (Udaipur) |
|---|---|---|
| IKONOS | 75.86% | 79.72% |
| HYPERION | 68.15% | 63.14% |
| PCT FUSED IMAGE | 80.23% | 83.34% |
| GST FUSED IMAGE | 81.12% | 80.23% |
| CNT FUSED IMAGE | 65.14% | 68.57% |


**Table 2.** Classification Accuracy
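Overall accuracies such as those reported in Table 2 are conventionally computed as the proportion of reference samples on the diagonal of the confusion matrix. A small sketch (the matrix counts are invented for illustration, not taken from the chapter's accuracy assessment):

```python
import numpy as np

def overall_accuracy(confusion):
    """Fraction of reference samples on the diagonal of a confusion matrix
    (rows: reference class, columns: mapped class)."""
    confusion = np.asarray(confusion, dtype=float)
    return np.trace(confusion) / confusion.sum()

# Hypothetical 3-class confusion matrix (illustrative counts only).
cm = [[50, 3, 2],
      [4, 45, 1],
      [2, 2, 41]]
print(f"{overall_accuracy(cm):.2%}")
```

The same matrix also yields per-class producer's and user's accuracies (row and column ratios), which is how accuracy assessments of this kind are usually broken down.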

**Figure 12.** Classified images (Udaipur)

Although many studies focus on the development of fusion techniques, fewer studies concentrate on the development of image assessment methods. This study concentrates on statistical measures and classification accuracy for fusion performance. Statistical evaluation procedures have the advantage that they are objective, quantitative, and repeatable. The correlation coefficients between the original Hyperion bands and the equivalent fused bands, together with three other parameters, i.e. the mean, standard deviation, and median, were calculated.

The statistical parameters for the various fused products were plotted along with those of the raw Hyperion image. The graphs show that there is no noticeable change in the statistics between the original Hyperion image and the fused products. The PCT and GST fused images demonstrate comparable values for the mean, maximum, minimum, and standard deviation, and roughly the same ability to preserve the statistics. The CNT fused image shows much lower values than the raw Hyperion image (Fig 13 a & b).
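The per-band comparison described here (mean, standard deviation, and median of each band, plus the correlation coefficient between an original Hyperion band and its fused counterpart) can be sketched as below; random arrays stand in for the real bands.

```python
import numpy as np

def band_statistics(original, fused):
    """Compare one original band with its fused counterpart.

    Returns the mean, standard deviation, and median of each band, plus
    the Pearson correlation coefficient between the two.
    """
    return {
        "original": (original.mean(), original.std(), np.median(original)),
        "fused": (fused.mean(), fused.std(), np.median(fused)),
        "correlation": np.corrcoef(original.ravel(), fused.ravel())[0, 1],
    }

# Stand-in bands: a synthetic "Hyperion" band and a lightly perturbed "fused" band.
rng = np.random.default_rng(1)
band = rng.normal(3000, 400, size=(64, 64))
fused = band + rng.normal(0, 50, size=band.shape)

s = band_statistics(band, fused)
print(round(s["correlation"], 3))  # close to 1 when the fusion preserves the band
```

A correlation near 1 and near-unchanged mean/standard deviation/median are the behaviour the chapter reports for the PCT and GST products; a systematic shift in these statistics is what distinguishes the CNT product.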

After evaluating the spectral profiles, we observe that although the range of values in the CN fused image is not comparable to that of the Hyperion image, in most cases the shape of the profile closely matches the profile of the corresponding feature in the Hyperion image. We can therefore infer that, spectrally, the CN (Colour Normalised) approach better preserves the spectral characteristics in the fused image. In terms of the visual discreteness or spatial characteristics of the various LULC classes in the fused images, the GS (Gram-Schmidt) and PC (Principal Component) transforms are best suited when compared to Hyperion, while compared to IKONOS there is almost no gain in spatial quality.

For performance analysis of the fusion, the classified images were analyzed using reference data from the ground. The classification results of the PCT and GST fused images are almost similar, though for the CN fused image the results deteriorate because of artificial pixels that hinder the classification process (Fig 11 & 12). The overall classification accuracy was calculated for IKONOS, Hyperion, and the three merged products. It was observed that the accuracy improves in the PCT and GST fused images while it deteriorates in the CNT fused image (Table 2).

A comparison of the separability analysis performed on the original data sets and the three fused products shows that the separability of some of the classes increases after fusion, and hence the classification accuracy achieved is higher (Fig 14). The classified images show some black pixels not belonging to any of the specified classes. Such pixels are left unclassified because they did not match the pixel spectrum of any of the specified land cover classes, or because they exhibit a large angular difference (greater than 0.1 radians) between the known and the unknown pixel spectrum.






**Figure 13.** (a & b): Statistical comparison of fused images (a: Dehradun, b: Udaipur)

**Figure 14.** Class separability analysis for original and fused images

#### **Author details**

Hina Pande\* and Poonam S. Tiwari

\*Address all correspondence to: hina@iirs.gov.in

Indian Institute of Remote Sensing (ISRO), Dehradun, India

#### **References**


[4] Ali Darvishi, B., Kuppas, M. and Erasmi, S., 2005. Hyper-spectral/High resolution Data fusion: Assessing the quality of EO1 - Hyperion/spot-Pan and Quickbird-MS fused Images in spectral Domain. (http://www.ipi.uni-hannover.de/fileadmin/insti‐ tut/pdf/073-darvishi.pdf)

[16] Sanjeevi, S., 2008. Chapter on Multisensor Image Fusion. Lecture notes on Advanced Image Processing . Photogrammetry and remote sensing Division Indian institute of

High-Resolution and Hyperspectral Data Fusion for Classification

http://dx.doi.org/10.5772/56944

77

[17] Shippert, P., 2008. Introduction to Hyperspectral Image Analysis. http://satjour‐

nal.tcom.ohiou.edu/pdf/shippert.pdf (Last Accessed on Nov., 2007)

remote sensing, Dehradun


[16] Sanjeevi, S., 2008. Chapter on Multisensor Image Fusion. Lecture notes on Advanced Image Processing . Photogrammetry and remote sensing Division Indian institute of remote sensing, Dehradun

[4] Ali Darvishi, B., Kuppas, M. and Erasmi, S., 2005. Hyper-spectral/High resolution Data fusion: Assessing the quality of EO1 - Hyperion/spot-Pan and Quickbird-MS fused Images in spectral Domain. (http://www.ipi.uni-hannover.de/fileadmin/insti‐

[5] Alparone L., Baronti S., Garzelli A., Nencini F. (2004), Landsat ETM+ and SAR Image Fusion Based on Generalized Intensity Modulation, IEEE Transactions on Geoscience

[6] Chavez, P.S., Sides, S.C. and Anderson, J.A. (1991): Comparison of three different methods to merge multi-resolution and multi-spectral data: Landsat TM and SPOT Panchromatic, Photogrammetric Engineering and Remote Sensing, 57 (3): 295-303. [7] Chen, C.-M., Hepner, G.F. and Forster, R.R., 2003. Fusion of Hyperspectral and radar data using the HIS transformation to enhance urban surface features. ISPRS Journal

[8] Gomez, B.R., Jazaeri, A. and Kafatos, M., 2001. Wavelet-based hyperspectral and multispectral image fusion. www.scs.gmu.edu/~rgomez/fall01/fusionpaper.pdf (Last

[9] Kasetkasem, T., Arora, M. K., and Varshney, P. K., "An MRF Model Based Approach for Sub-pixel Mapping from Hyperspectral Data, " Advanced Image Processing Techniques for Remotely Sensed Hyperspectral Data, ed. P. K. Varshney and M. K.

[10] Li J., Luo J., Ming D., Shen Z. (2005), A New Method for Merging IKONOS Panchro‐ matic and Multispectral Image Data, Geoscience and Remote Sensing Symposium

[11] Lillesand, M.T. and Kiefer, W.R., 2000. Remote sensing and image interpretation.

[12] Ling, Y., Ehlers, M., Usery, l.E. and Marguerite, M., 2006. FFT-enhanced IHS trans‐ form method for fusing high-resolution satellite images. ISPRS Journal of photo‐

[13] Photogrammetric Engineering and Remote Sensing, Vol. 57, No.3, pp. 295–303. Pho‐

[14] Pohl C., Van Genderen J. L. (1998), Multisensor image fusion in remote sensing: Con‐ cepts, methods and applications (Review article), International Journal of Remote

[15] Sanjeevi, S., 2006.Chapter on Multisensor Image Fusion. Lecture notes on Advanced Image Processing . Photogrammetry and remote sensing Division Indian institute of

tut/pdf/073-darvishi.pdf)

76 New Advances in Image Fusion

Accessed on Nov., 2007)

IGARSS, Vol. 6, pp. 3916-3919

John Wiley and sons, New york.

Sensing, Vol. 19, No. 5, pp. 823-854

remote sensing, Dehradun

and Remote Sensing, Vol. 42, No. 12, pp. 2832-2839

of Photogrammetry and Remote Sensing, 58 (2003): 19-30.

Arora, Chapter 11, pp. 279-307, Springer Verlag, 2004.

grammetry and Remote Sensing, 61 (2007): 381-392.

togrammetry & Remote Sensing, Vol. 58, pp. 19-30

[17] Shippert, P., 2008. Introduction to Hyperspectral Image Analysis. http://satjour‐ nal.tcom.ohiou.edu/pdf/shippert.pdf (Last Accessed on Nov., 2007)

**Chapter 5**


## **The Objective Evaluation Index (OEI) for Evaluation of Night Vision Colorization Techniques**

Yufeng Zheng, Wenjie Dong, Genshe Chen and Erik P. Blasch

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/56948

#### **1. Introduction**

A *night vision colorization* technique can produce colorized imagery with a naturalistic and stable color appearance by processing multispectral night vision (NV) imagery. The multispectral images typically include visual-band (e.g., red, green, and blue (RGB), or intensified) imagery and infrared imagery (e.g., near infrared (NIR) and long wave infrared (LWIR)). Although appropriately false-colored imagery is often helpful for human observers in improving their performance on scene classification and reaction time tasks (Waxman *et al.*, 1996; Essock *et al.*, 1999), inappropriate color mappings can also be detrimental to human performance (Toet *et al.*, 2001; Varga, 1999). A possible reason is a lack of physical color constancy. Another drawback of false coloring is that observers need specific training with each of the false color schemes so that they can correctly and quickly recognize objects; whereas with colorized nighttime imagery rendered with natural colors, users should be able to readily recognize and identify objects without any training.

Several night vision (NV) colorization techniques have been developed in past decades. Toet (2003) proposed an NV colorization method that transfers the color characteristics of daylight imagery into multispectral NV images. Essentially, this color-mapping method matches the statistical properties (i.e., mean and standard deviation) of the NV imagery to those of a natural daylight color image (manually selected as the "target" color distribution). Zheng and Essock (2008) presented a "local coloring" method that can colorize the NV images more like daylight imagery by using histogram matching. The local-coloring method renders the multispectral images with natural colors segment by segment (i.e., "segmentation-based"), and also provides automatic association between the source and target images. Zheng (2011) recently introduced a *channel-based color fusion* method, which is fast enough for real-time applications. Note that the term "color fusion" in this chapter refers to combining multispectral images into a color-version image with the purpose of resembling natural scenes. Hogervorst and Toet (2008 & 2012) recently proposed a new color mapping method using a lookup table (LUT). The LUT is created between a false-colored image (formed with multispectral NV images) and its color reference image (aiming at the same scene but taken at daytime). The colors in the resulting colored NV image resemble the colors in the daytime color image. This LUT-mapping method runs fast for real-time implementations. The LUT-mapping method and the statistic-matching method are also summarized in their recent paper (Toet & Hogervorst, 2012). Most recently, Zheng (2012) developed a joint-histogram matching method for NV colorization.

© 2013 Zheng et al.; licensee InTech. This is a paper distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
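As a rough sketch of the LUT idea (not the authors' exact procedure — the quantization level and the averaging rule here are our assumptions), the table can be built by recording, for each quantized false-color value, the mean daytime reference color observed at the same pixel locations:

```python
import numpy as np

def build_color_lut(false_color, reference, bins=16):
    """Build a LUT mapping quantized false-color RGB values to the mean
    reference (daytime) color seen at the same pixels.
    false_color, reference: co-registered (H, W, 3) float arrays in [0, 1]."""
    q = np.minimum((false_color * bins).astype(int), bins - 1)  # quantize channels
    keys = q[..., 0] * bins * bins + q[..., 1] * bins + q[..., 2]  # one index per pixel
    lut_sum = np.zeros((bins ** 3, 3))
    lut_cnt = np.zeros(bins ** 3)
    np.add.at(lut_sum, keys.ravel(), reference.reshape(-1, 3))  # accumulate colors
    np.add.at(lut_cnt, keys.ravel(), 1)                         # count occurrences
    lut = np.where(lut_cnt[:, None] > 0,
                   lut_sum / np.maximum(lut_cnt[:, None], 1), 0)
    return lut, keys

def apply_color_lut(lut, keys):
    # At run time only this indexing step is needed, which is why LUT
    # mapping is fast enough for real-time use.
    return lut[keys]  # (H, W, 3) colorized image
```

The LUT is computed once offline; colorizing a new frame reduces to a single table lookup.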


The quality of colorized images can be assessed by subjective and/or objective measures. However, subjective evaluation normally costs time and resources. Moreover, the subjective evaluation methods cannot be readily and routinely used for real-time and automated systems. On the other hand, objective evaluation metrics can automatically and quantitatively measure the image qualities (Liu *et al.*, 2012 & Blasch *et al.*, 2008). Over the past decade, many objective metrics for grayscale image evaluations have been proposed (Alparone *et al.*, 2004; Wald *et al.*, 1997; Tsagaris & Anastassopoulos, 2006). However, the metrics for grayscale images cannot be directly extended to the evaluations of colorized images. Recently, some objective evaluations of color images have been reported in the literature. To objectively assess a color fusion method, Tsagaris (2009) proposed a color image fusion measure (CIFM) by using the amount of common information between the source images and the colorized image, and also the distribution of color information. Yuan *et al.* (2011) presented an objective evaluation method for visible and infrared color fusion utilizing four metrics: image sharpness metric, image contrast metric, color colorfulness metric, and color naturalness metric. In this chapter, we introduce an *objective evaluation index* (OEI) to quantitatively evaluate the colorized images. Given a reference (daylight color) image and several versions of the colorized NV images from different coloring techniques, all color images are first converted into International Commission on Illumination (CIE) LAB space, with dimension *L* for lightness and *a* and *b* for the color-opponent dimensions (Malacara, 2002). Then the OEI metric is computed with the four established metrics: phase congruency metric (PCM), gradient magnitude metric (GMM), image contrast metric (ICM), and color natural metric (CNM).

Certainly, a color presentation of multispectral night vision images can provide a better visual result for human users. We would prefer color images resembling the natural daylight pictures that we are used to; meanwhile, the coloring process should ideally be efficient enough for real-time applications. In this chapter, we discuss and explore how to objectively evaluate the image qualities of colorized images. The remainder of this chapter is organized as follows. Six NV colorization techniques are briefly reviewed in Section 2. Next, four image quality metrics are described in Section 3. A new colorization metric, the *objective evaluation index* (OEI), is introduced in Section 4. The experiments and discussions are presented in Section 5. Conclusions are finally drawn in Section 6.

### **2. Overview of night vision colorization techniques**

All color mapping methods described in Subsections 2.2-2.6 are performed in *lαβ* color space. Thus the color space conversion from RGB to *lαβ* must be done prior to color mapping, and the inverse transformation back to RGB space is necessary after the mapping. The details of the *lαβ* color space transformation are given elsewhere (Toet, 2003; Zheng & Essock, 2008). Certainly, two images, a *source* image and a *target* image, are involved in a color mapping process. The source image is usually a color fusion image (in Subsections 2.2-2.5) or a false-colored image (in Subsection 2.6); the target image is normally a daylight picture containing a similar scene. The target image may have a different resolution, as depicted in Subsections 2.2-2.5; however, the LUT described in Subsection 2.6 is established using the registered target (reference) image.

#### **2.1. Channel-based color fusion (CBCF)**


A fast color fusion method, termed *channel-based color fusion* (CBCF), was introduced to facilitate real-time applications (Zheng, 2011). Notice that the term "color fusion" means combining multispectral images into a color-version image with the purpose of resembling natural scenes. Relative to the "segmentation-based colorization" (Zheng & Essock, 2008), color fusion trades color realism for speed.

The general framework of channel-based color fusion is as follows: (i) prepare for color fusion with preprocessing (denoising, normalization and enhancement) and image registration; (ii) form a color fusion image by properly assigning the multispectral images to the red, green, and blue channels; (iii) fuse the multispectral images (gray fusion) using the *a*DWT algorithm (Zheng *et al.*, 2005); and (iv) replace the *value* component of the color fusion in HSV color space with the gray-fusion image, and finally transform back to RGB space.

In NV imaging, several bands of images may be available, for example, visible (RGB), image intensified (II), near infrared (NIR), medium wave infrared (MWIR), and long wave infrared (LWIR). Given the available images and the context, we only discuss two two-band color fusions: (II ⊕ LWIR) and (NIR ⊕ LWIR). The symbol '⊕' denotes the fusion of multiband images.

A color fusion of NIR and LWIR is formulated by,

$$\begin{aligned} \mathbf{F}_\mathrm{R} &= \mathbf{S}_{[0,1.0]}^{[0.2,0.9]}\{\mathbf{I}_\mathrm{LWIR}\}, && \text{(a)}\\ \mathbf{F}_\mathrm{G} &= \mathbf{S}_{[0.1,I_\mathrm{Gmax}]}^{[0.2,1]}\{\mathbf{I}_\mathrm{NIR}\}, && \text{(b)}\\ \mathbf{F}_\mathrm{B} &= \mathbf{S}_{[0,1.0]}^{[0.1,0.7]}\{[1.0-\mathbf{I}_\mathrm{LWIR}]\bullet\mathbf{I}_\mathrm{NIR}\}; && \text{(c)}\\ \mathbf{V}_\mathrm{F} &= Fus\{\mathbf{I}_\mathrm{NIR},\ \mathbf{I}_\mathrm{LWIR}\}; && \text{(d)} \end{aligned} \tag{1}$$

where $\mathbf{S}_{[0.1,I_\mathrm{Gmax}]}^{[0.2,1]}$ denotes the *piecewise contrast stretching* defined in Eq. (2), and $I_\mathrm{Gmax} = \min([\mu_\mathrm{NIR} + 2\sigma_\mathrm{NIR}],\, 0.8)$, where *min*() is an operation to get the minimal number; [1.0 − **I**LWIR] inverts the LWIR image; the symbol '•' means element-by-element multiplication; *V*F is the *value* component of FC in HSV space; and *Fus*() means the image fusion operation using the *a*DWT algorithm (Zheng *et al.*, 2005). Although the limits given in the contrast stretching were obtained empirically from the night vision images that we had, it is viable to formulate the expressions and automate the fusion based upon a set of conditions (imaging devices, imaging time, and application location). Notice that the transform parameters in Eq. (1) were applied to all color fusions in our experiments (see Fig. 3d).

$$\mathbf{I}_S = \mathbf{S}_{[I_\mathrm{Min},\,I_\mathrm{Max}]}^{[L_\mathrm{Min},\,L_\mathrm{Max}]}(\mathbf{I}_0) = (\mathbf{I}_0 - I_\mathrm{Min})\,\frac{L_\mathrm{Max} - L_\mathrm{Min}}{I_\mathrm{Max} - I_\mathrm{Min}} + L_\mathrm{Min} \tag{2}$$
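The stretch of Eq. (2) can be sketched in a few lines of NumPy; clipping inputs that fall outside [*I*Min, *I*Max] is our assumption about what makes the operation "piecewise":

```python
import numpy as np

def contrast_stretch(img, i_min, i_max, l_min, l_max):
    """Linear contrast stretch per Eq. (2): map the source interval
    [i_min, i_max] onto the output interval [l_min, l_max].
    Values outside the source interval are clipped first (assumed)."""
    img = np.clip(img, i_min, i_max)
    return (img - i_min) * (l_max - l_min) / (i_max - i_min) + l_min
```

For example, the red channel of Eq. (1a) would be `contrast_stretch(lwir, 0.0, 1.0, 0.2, 0.9)`.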


where **I**S is the scaled image and **I**0 is the original image; *I*Min and *I*Max are the minimum and maximum pixel values in **I**0, respectively; *L*Min and *L*Max are the expected minimum and maximum pixel values in **I**S, respectively. After the image contrast stretching, **I**S ∈ [*L*Min, *L*Max].

#### **2.2. Statistic matching**

A *statistic matching* (stat-match) is used to transfer the color characteristics from natural daylight imagery to false color night-vision imagery, which is formulated as:

$$\mathbf{I}_C^{\,k} = (\mathbf{I}_S^{\,k} - \mu_S^{\,k})\cdot\frac{\sigma_T^{\,k}}{\sigma_S^{\,k}} + \mu_T^{\,k}, \quad \text{for } k = \{l,\ \alpha,\ \beta\}. \tag{3}$$

where *IC* is the colored image, *IS* is the source (false-color) image in *lαβ* space; *μ* denotes the mean and *σ* denotes the standard deviation; the subscripts 'S' and 'T' refer to the source and target images, respectively; and the superscript '*k*' is one of the color components: {*l, α, β*}.

After this transformation, the pixels comprising the multispectral source image have means and standard deviations that conform to the target daylight color picture in *lαβ* space. The colored image is transformed back to the RGB space through the inverse transforms (Zheng & Essock, 2008; see Fig. 3e).
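A minimal NumPy sketch of the statistic matching in Eq. (3), assuming the images are already in an *lαβ*-like space (the forward and inverse color space transforms are omitted, and the small epsilon guarding against a zero standard deviation is our addition):

```python
import numpy as np

def statistic_match(source, target):
    """Eq. (3): shift and scale each component of `source` so that its mean
    and standard deviation match those of `target`.
    source, target: (H, W, 3) arrays of l, alpha, beta planes (sizes may differ)."""
    out = np.empty_like(source, dtype=float)
    for k in range(3):  # one color component at a time: l, alpha, beta
        mu_s, sd_s = source[..., k].mean(), source[..., k].std()
        mu_t, sd_t = target[..., k].mean(), target[..., k].std()
        out[..., k] = (source[..., k] - mu_s) * (sd_t / (sd_s + 1e-12)) + mu_t
    return out
```

After the call, each component of the result has (to numerical precision) the target's mean and standard deviation, as the text describes.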

#### **2.3. Histogram matching (HM)**

*Histogram matching* (i.e., histogram specification) is usually used to enhance an image when histogram equalization fails (Gonzalez & Woods, 2002). Given the shape of the histogram that we want the enhanced image to have, histogram matching can generate a processed (i.e., matched) image that has the specified histogram. In particular, by specifying the histogram of a target image (with daylight natural colors), a source image (with false colors) resembles the target image in terms of histogram distribution after histogram matching.

Histogram matching (hist-match) can be implemented as follows. First, the *normalized cumulative histograms* of the source image and target image (*h*S and *h*T) are calculated, respectively.

$$h_S = S(u_k) = (L-1)\cdot\sum_{j=0}^{k}\frac{n_j}{N} \tag{4}$$

where *N* is the total number of pixels in the image, *nk* is the number of pixels that have gray level *uk*, and *L* is the number of gray (bin) levels in the image. Typically, *L* = 256 for a digital image. But we can round the image down to *m* (*m* < *L*, e.g., *m* = 64) levels, and thus its histogram is called an *m*-bin histogram. Clearly, *S*(*uk*) is a non-decreasing function. Similarly, *h*T = *T*(*vk*) can be computed (see the "Target" curve in Fig. 1c).

Second, considering *h*S *= h*T (i.e., *S*(*uk*) = *T*(*vk*)) for histogram matching, the matched image is accordingly computed as

$$v_k = T^{-1}[S(u_k)],\quad k = 0, 1, 2, \ldots, L-1. \tag{5}$$

It is straightforward to find a discrete solution of the inverse transform, *T*-1[*S*()] as both *T*() and *S*() can be implemented with look up tables.

Similar to the statistic matching (described in Subsection 2.2), histogram matching also serves for color mapping (see Fig. 3f) and is performed component-by-component in *lαβ* space. Specifically, with each color component (say the *α* component, treated as a grayscale image) of a false-colored image, we can compute *S*(*uk*). With a selected target image, *T*(*vk*) can be calculated with regard to the same color component (say *α*). Using Eq. (5) the histogram matching can be completed regarding the color component (*α*). Histogram matching and statistic matching can be applied separately or jointly. When applied together, for instance, it is referred to as "statistic matching then histogram matching" (Zheng & Essock, 2008).
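A compact NumPy sketch of Eqs. (4)-(5) for one color component, treating the plane as a grayscale image with values in [0, 1] (the bin count and value range are our assumptions):

```python
import numpy as np

def histogram_match(source, target, bins=256):
    """Match the histogram of `source` to that of `target` (Eqs. (4)-(5)):
    compute both normalized cumulative histograms, then map each source
    level through the discrete inverse T^{-1}[S(u_k)] via a lookup table."""
    s_hist, _ = np.histogram(source, bins=bins, range=(0.0, 1.0))
    t_hist, _ = np.histogram(target, bins=bins, range=(0.0, 1.0))
    s_cdf = np.cumsum(s_hist) / source.size   # h_S = S(u_k), non-decreasing
    t_cdf = np.cumsum(t_hist) / target.size   # h_T = T(v_k)
    # Discrete inverse: for each S(u_k), the smallest v_k with T(v_k) >= S(u_k).
    lut = np.searchsorted(t_cdf, s_cdf)
    levels = np.clip((source * bins).astype(int), 0, bins - 1)
    return lut[levels] / (bins - 1)
```

Both *T*() and *S*() are realized as lookup tables, exactly as the text notes; for the color mapping described above, the function would be applied to the *l*, *α*, and *β* planes separately.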

#### **2.4. Joint histogram matching (JHM)**


As described in Subsection 2.3, histogram matching is applied to each color component (plane) separately, which can easily distort the color distributions of the mapped image (see Fig. 3f). To avoid color distortion, we introduce a new color mapping method, joint histogram matching (joint-HM).

In *lαβ* space, *α* and *β* represent the color distributions; while *l* is the intensity component. A *joint histogram* (also called 2D histogram) of two color planes (*α* versus *β*) is calculated and then matched from source to target. The intensity component (*l*) is matched individually. The joint histogram is actually the joint (2D) intensity distribution of the two images, which is often used to compute the joint entropy (Hill & Batchelor, 2001) for image registration.

How to calculate the normalized cumulative histogram (denoted as *h*) from a 2D joint histogram (denoted as *H*<sup>J</sup>) needs further discussion. For histogram matching, *h* must be a non-decreasing function. We propose to form a one-dimensional (1D) histogram by stacking *H*<sup>J</sup> column-by-column and then perform histogram matching as defined in Eq. (10). Of course, to correctly index the 1D inverse transform (*T*<sup>-1</sup>()), *um* (with *m* bins) must be properly calculated from the two gray (bin) levels. If *H*<sup>J</sup> is computed as (*β* vs. *α*), its matching process is denoted as joint-HM(*βα*). Eventually, the histogram of the mapped image is a tradeoff between the "Source" and "Target" histograms. This is expected, since we want no color distortion (i.e., preserving its own colors to some extent) during color mapping (see Fig. 3g).
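The column-by-column stacking idea can be illustrated with a small sketch (Python with NumPy; the function name, the bin count, and the assumption that both channels are pre-normalized to [0, 1) are ours, not the chapter's):

```python
import numpy as np

def joint_hist_match(src_a, src_b, tgt_a, tgt_b, bins=64):
    """Illustrative joint histogram matching: the 2D joint histogram H_J of
    the (alpha, beta) planes is stacked column-by-column into a 1D histogram
    so that ordinary cumulative histogram matching can be applied."""
    def joint_index(a, b):
        # Quantize each channel, then stack H_J column-by-column:
        # 1D bin index = column * bins + row.
        ia = np.clip((a * bins).astype(int), 0, bins - 1)
        ib = np.clip((b * bins).astype(int), 0, bins - 1)
        return ia * bins + ib

    src_idx = joint_index(src_a, src_b)
    tgt_idx = joint_index(tgt_a, tgt_b)

    # Normalized cumulative histograms h_S and h_T over the stacked 1D bins.
    n = bins * bins
    h_s = np.cumsum(np.bincount(src_idx.ravel(), minlength=n) / src_idx.size)
    h_t = np.cumsum(np.bincount(tgt_idx.ravel(), minlength=n) / tgt_idx.size)

    # Standard histogram specification: each source bin u maps to the target
    # bin whose cumulative count first reaches h_S(u) (the inverse mapping).
    mapping = np.clip(np.searchsorted(h_t, h_s, side="left"), 0, n - 1)
    matched = mapping[src_idx]

    # Unstack the matched 1D index back into (alpha, beta) bin centers.
    out_a = (matched // bins + 0.5) / bins
    out_b = (matched % bins + 0.5) / bins
    return out_a, out_b
```

Here `np.searchsorted` plays the role of the inverse cumulative transform; since the stacked histogram is cumulative, the mapping is non-decreasing by construction.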

The Objective Evaluation Index (OEI) for Evaluation of Night Vision Colorization Techniques

http://dx.doi.org/10.5772/56948

#### **2.5. Statistic matching then joint-histogram matching (SM-JHM)**

The joint-HM can be applied together with statistic matching, e.g., "stat-match then joint-HM", which usually results in a better NV colorization. The statistic matching globally "paints" the image, while the joint-HM makes the colors more like those of the daylight picture in detail (see Fig. 3h).

#### **2.6. Lookup table (LUT)**

Hogervorst and Toet (2008) proposed a color mapping method using a lookup table (LUT). The LUT is created using a false-colored image (formed with two-band NV images) and the reference (i.e., target) daylight image. This method yields a colored NV image similar to the daytime image in colors. The implementation of this LUT method is described as follows.

**1.** Create a false-colored image (of 3 color planes) by assigning the LWIR image to the R plane, the NIR image to the G plane, and zeros to B, respectively;

**2.** Build an RG colormap (i.e., a 256×256 LUT) and convert the false-colored image to an indexed image (0 to 65535) associated with the RG colormap;

**3.** For all pixels in the indexed false-colored image whose index value equals 0:

- **a.** Locate all corresponding pixels in the reference (i.e., target) color image (which must be strictly aligned with the false-colored image);
- **b.** Calculate the averaged *lαβ* values of those corresponding pixels and then convert them back to RGB values;
- **c.** Assign the RGB values to *index 0* in the lookup table;

**4.** Vary the index value from 2 to 65535 and repeat the processes described in Step 3. At the end, the LUT will be established.

Once the LUT is created, the LUT-based mapping procedure is simple and fast (see Fig. 3i), and thus can be deployed in real time. However, the LUT creation relies entirely on an aligned reference image aiming at the same scene. Any misalignment, use of a different reference color image, or coloring of a different NV image (i.e., aiming at a different direction) will usually result in a poor colorization (see Fig. 5i).
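Assuming 8-bit LWIR/NIR bands and averaging directly in RGB for brevity (the chapter averages in *lαβ* space before converting back to RGB), the LUT construction and mapping can be sketched as:

```python
import numpy as np

def build_lut(lwir, nir, target_rgb):
    """Sketch of the LUT colorization steps above. `lwir` and `nir` are
    8-bit single-band night-vision images; `target_rgb` is the strictly
    aligned daylight reference image of shape (H, W, 3)."""
    # Steps 1-2: false color (R = LWIR, G = NIR, B = 0), indexed on a
    # 256 x 256 RG colormap, giving indices 0..65535.
    idx = lwir.astype(np.int64) * 256 + nir.astype(np.int64)

    # Steps 3-4: for every index, average the colors of the corresponding
    # reference pixels; indices never observed stay black.
    lut = np.zeros((65536, 3))
    counts = np.bincount(idx.ravel(), minlength=65536)
    for c in range(3):
        sums = np.bincount(idx.ravel(),
                           weights=target_rgb[..., c].ravel(),
                           minlength=65536)
        lut[:, c] = np.where(counts > 0, sums / np.maximum(counts, 1), 0.0)
    return idx, lut

def apply_lut(idx, lut):
    """Fast LUT-based mapping: each indexed pixel fetches its RGB entry."""
    return lut[idx]
```

The mapping step is a single table lookup per pixel, which is why the method is fast enough for real-time deployment once the LUT exists.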

#### **3. Four image quality metrics**

Three image quality metrics for grayscale images and one metric for color images are reviewed in this section. The color-related metrics are defined in the CIELAB space (Malacara, 2002) specified by the International Commission on Illumination. The perceptually uniform CIELAB space consists of an achromatic luminosity component *L*\* (black-white) and two chromatic values *a*\* (green-magenta) and *b*\* (blue-yellow). The coordinates *L*\* *a*\* *b*\* (CIE 1976) can be calculated using the CIE XYZ tristimulus values (Malacara, 2002).
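A minimal sketch of the standard sRGB → CIE XYZ → CIELAB route (D65 white point; the constants follow the sRGB and CIE 1976 definitions, not code from the chapter):

```python
import numpy as np

def srgb_to_lab(rgb):
    """Convert sRGB values in [0, 1] to CIELAB via the CIE XYZ tristimulus
    values, using the D65 reference white."""
    rgb = np.asarray(rgb, dtype=float)
    # Undo the sRGB gamma to get linear light.
    lin = np.where(rgb > 0.04045, ((rgb + 0.055) / 1.055) ** 2.4, rgb / 12.92)
    # Linear RGB -> XYZ (sRGB primaries, D65).
    m = np.array([[0.4124, 0.3576, 0.1805],
                  [0.2126, 0.7152, 0.0722],
                  [0.0193, 0.1192, 0.9505]])
    xyz = lin @ m.T
    # Normalize by the D65 reference white.
    xyz /= np.array([0.95047, 1.0, 1.08883])
    # CIE 1976 nonlinearity.
    d = (6 / 29) ** 3
    f = np.where(xyz > d, np.cbrt(xyz), xyz / (3 * (6 / 29) ** 2) + 4 / 29)
    L = 116 * f[..., 1] - 16            # L*: black-white
    a = 500 * (f[..., 0] - f[..., 1])   # a*: green-magenta
    b = 200 * (f[..., 1] - f[..., 2])   # b*: blue-yellow
    return np.stack([L, a, b], axis=-1)
```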

#### **3.1. Phase Congruency Metric (PCM)**

The *phase congruency* (PC) model is also called the "local energy model" developed by Morrone *et al*. (1986). This model postulates that the features in an image are perceived at the points where the Fourier components are maximal in phase. Based on physiological and psychophysical evidence, the PC theory provides a simple but biologically plausible model of how mammalian visual systems detect and identify features in an image. PC can be considered as a significance measure of local structures in an image.

According to the definition of PC (Morrone *et al*., 1986), many different implementations of the PC map have been developed. A widely used method developed by Kovesi (1999) is adopted in this chapter. Given a 1D image *f*(*x*), let *M<sub>n</sub>*<sup>e</sup> and *M<sub>n</sub>*<sup>o</sup> denote the even-symmetric and odd-symmetric filters at scale *n*, respectively. *M<sub>n</sub>*<sup>e</sup> and *M<sub>n</sub>*<sup>o</sup> form a quadrature pair whose responses to *f*(*x*) are *e*<sub>n</sub>(*x*) and *o*<sub>n</sub>(*x*). Responses of the quadrature pair form a response vector:

$$
\begin{bmatrix} e_n(\mathbf{x}) \\ o_n(\mathbf{x}) \end{bmatrix} = \begin{bmatrix} f(\mathbf{x}) \ast M_n^{e} \\ f(\mathbf{x}) \ast M_n^{o} \end{bmatrix}, \tag{6}
$$

and the local amplitude at scale *n* is

$$A\_n(\mathbf{x}) = \sqrt{e\_n^2(\mathbf{x}) + o\_n^2(\mathbf{x})}.\tag{7}$$

Let

using two gray (bin) levels is expected. If *H*<sup>J</sup> is computed as (*β* vs. *α*), its matching process is denoted as joint-HM(*βα*). Eventually, the histogram of the mapped image is sort of tradeoff between two histograms, "Source" and "Target". This is expected since we want no color distortion (i.e., preserving its own colors to some extent) during color mapping (see Fig. 3g).

The joint-HM can be applied together with statistic matching such as "stat-match then joint-HM", which usually result a better NV colorization. The statistic matching globally "paints" the image, while the joint-HM colors is more like the daylight picture in details (see Fig. 3h).

Hogervorst and Toet (2008) proposed a color mapping method using a lookup table (LUT). The LUT is created using a false-colored image (formed with two-band NV images) and the reference (i.e., target) daylight image. This method yields a colored NV image similar to the daytime image in colors. The implementation of this LUT method is described as follows.

**1.** Create a false-colored image (of 3 color planes) by assigning LWIR image to R, NIR image

**2.** Build RG colormap (i.e., a 256×256 LUT) and convert the false-colored image to an indexed

**a.** Locate all corresponding pixels in the reference (i.e., target) color image (that must

**b.** Calculate the averaged *lαβ* values of those corresponding pixels and then convert

**4.** Vary the index value from 2 to 65535 and repeat the processes described in Step 3. At the

Once the LUT is created, the LUT-based mapping procedure is simple and fast (see Fig. 3i), and thus can be deployed in realtime. However, the LUT creation thoroughly relies on the aligned reference image aiming at the same scene. Any misalignment, using a different reference color image, or coloring a different NV imagery (i.e., aiming at different direction),

Three image quality metrics for grayscale images and one metric for color images are reviewed in this section. The color-related metrics are defined in the CIELAB space (Malacara, 2002)

**3.** For all pixels in the indexed false-colored image whose index value equals 0:

**2.5. Statistic matching then joint-histogram matching (SM-JHM)**

**2.6. Lookup table (LUT)**

84 New Advances in Image Fusion

to G plane, and zeros to B, respectively;

them back to RGB values;

end, the LUT will be established.

**3. Four image quality metrics**

will usually result a poor colorization (see Fig. 5i).

image (0 to 65535) associated with the RG colormap;

be strictly aligned with the false-colored image);

**c.** Assign the RGB values to *index 0* in the lookup table;

$$F(\mathbf{x}) = \sum_{n} e_n(\mathbf{x}), \quad H(\mathbf{x}) = \sum_{n} o_n(\mathbf{x}). \tag{8}$$

The one-dimensional (1D) *phase congruency metric* (PCM) can be computed as

$$PC(\mathbf{x}) = \frac{\sqrt{F^2(\mathbf{x}) + H^2(\mathbf{x})}}{\sum_{n} A_n(\mathbf{x}) + \varepsilon}, \tag{9}$$

where *ε* is a small positive constant.
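Given precomputed quadrature responses, Eqs. (7)-(9) reduce to a few array operations. A sketch (array shapes and names are our own; the filters themselves are assumed to have been applied already):

```python
import numpy as np

def phase_congruency_1d(e, o, eps=1e-4):
    """Eqs. (7)-(9): 1D phase congruency from quadrature filter responses.
    `e` and `o` are (n_scales, n_points) arrays holding e_n(x) and o_n(x)."""
    amp = np.sqrt(e ** 2 + o ** 2)        # A_n(x), Eq. (7)
    F = e.sum(axis=0)                     # Eq. (8)
    H = o.sum(axis=0)                     # Eq. (8)
    return np.sqrt(F ** 2 + H ** 2) / (amp.sum(axis=0) + eps)  # Eq. (9)
```

When the responses are in phase across scales, the vector sum equals the sum of amplitudes and PC approaches 1; responses with opposing phases cancel in F and H, driving PC toward 0.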

In order to calculate the quadrature pair of filters *M<sub>n</sub>*<sup>e</sup> and *M<sub>n</sub>*<sup>o</sup>, Gabor filters (Gabor, 1946) or log-Gabor filters (Mancas-Thillou & Gosselin, 2006) can be applied. In this chapter, we use log-Gabor filters (e.g., wavelets at scale *n* = 4) due to the following two features: (i) log-Gabor filters, by definition, have no direct current (DC) component; and (ii) the transfer function of the log-Gabor filter has an extended tail at the high-frequency end, which makes it more capable of encoding natural images than ordinary Gabor filters (Zhang *et al.*, 2011). The transfer function of a log-Gabor filter in the frequency domain is

$$G(\omega) = e^{-\frac{[\log(\omega / \omega_0)]^2}{2\sigma_r^2}}, \tag{10}$$

where *ω*0 is the filter's center frequency and *σr* controls the filter's bandwidth.
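Eq. (10) can be sketched directly; treating the DC bin as zero is an implementation choice (the log is undefined at *ω* = 0, and the log-Gabor response has no DC component by construction):

```python
import numpy as np

def log_gabor(omega, omega0, sigma_r):
    """Eq. (10): radial log-Gabor transfer function G(omega) with center
    frequency omega0 and bandwidth parameter sigma_r."""
    omega = np.asarray(omega, dtype=float)
    g = np.zeros_like(omega)
    nz = omega > 0                      # G(0) = 0: no DC component
    g[nz] = np.exp(-(np.log(omega[nz] / omega0) ** 2) / (2 * sigma_r ** 2))
    return g
```

The response peaks at *ω*<sub>0</sub> and is symmetric on a logarithmic frequency axis, which is exactly the "extended tail at the high frequency end" property noted above.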

To compute the PCM of two-dimensional (2D) grayscale images, we can apply the 1D analysis over several orientations and then combine the results according to some rules. The 1D log-Gabor filters described above can be extended to 2D ones by applying Gaussian function across the filter perpendicular to its orientation (Kovesi, 1999; Fischer *et al.*, 2007; Wang *et al.*, 2008). The 2D log-Gabor function has the following transfer function

$$G_2(\omega, \theta_j) = e^{-\frac{[\log(\omega / \omega_0)]^2}{2\sigma_r^2}} \cdot e^{-\frac{(\theta - \theta_j)^2}{2\sigma_\theta^2}}, \tag{11}$$

where *θ<sub>j</sub>* = *jπ*/(2*J*) and *j* = 0, 1, 2, ..., *J*−1. *J* is the number of orientations and *σ<sub>θ</sub>* determines the filter's angular bandwidth. By modulating *ω*<sub>0</sub> and *θ<sub>j</sub>* and convolving *G*<sub>2</sub> with the 2D image, we get a set of responses at each point (*x*, *y*): [*e*<sub>n,θj</sub>(*x*, *y*), *o*<sub>n,θj</sub>(*x*, *y*)]. The local amplitude at scale *n* and orientation *θ<sub>j</sub>* is

$$A_{n,\theta_j}(\mathbf{x}, y) = \sqrt{e_{n,\theta_j}^2(\mathbf{x}, y) + o_{n,\theta_j}^2(\mathbf{x}, y)}, \tag{12}$$

and the local energy along orientation *θ<sup>j</sup>* is

$$E_{\theta_j}(\mathbf{x}, y) = \sqrt{F_{\theta_j}^2(\mathbf{x}, y) + H_{\theta_j}^2(\mathbf{x}, y)}, \tag{13}$$

where

$$F_{\theta_j}(\mathbf{x}, y) = \sum_{n} e_{n,\theta_j}(\mathbf{x}, y), \quad H_{\theta_j}(\mathbf{x}, y) = \sum_{n} o_{n,\theta_j}(\mathbf{x}, y). \tag{14}$$

The two-dimensional *PCM* at (*x*, *y*) is defined as

$$PC_{2D}(\mathbf{x}, \mathbf{y}) = \frac{\sum_{j} E_{\theta_j}(\mathbf{x}, \mathbf{y})}{\sum_{n} \sum_{j} A_{n,\theta_j}(\mathbf{x}, \mathbf{y}) + \varepsilon}, \tag{15}$$

where *ε* is a small positive constant. It should be noted that *PC*2*D*(*x*,*y*) is a real number within [0,1]. The phase congruency metric (PCM) of an image is defined as

$$PCM = \frac{1}{MN} \sum_{(\mathbf{x}, \mathbf{y})} PC_{2D}(\mathbf{x}, \mathbf{y}) = \frac{1}{MN} \sum_{(\mathbf{x}, \mathbf{y})} \frac{\sum_{j} E_{\theta_j}(\mathbf{x}, \mathbf{y})}{\sum_{n} \sum_{j} A_{n,\theta_j}(\mathbf{x}, \mathbf{y}) + \varepsilon}, \tag{16}$$

where *M*×*N* is the size of the image. The range of *PCM* is [0,1].
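A sketch of Eqs. (12)-(16), assuming the oriented quadrature responses have already been computed and stored in arrays of shape (scales, orientations, *M*, *N*):

```python
import numpy as np

def pcm_2d(e, o, eps=1e-4):
    """Eqs. (12)-(16): PCM of an image from oriented quadrature responses.
    `e` and `o` hold e_{n,theta_j}(x, y) and o_{n,theta_j}(x, y) with shape
    (n_scales, n_orients, M, N)."""
    amp = np.sqrt(e ** 2 + o ** 2)               # A_{n,theta_j}, Eq. (12)
    F = e.sum(axis=0)                            # Eq. (14), per orientation
    H = o.sum(axis=0)                            # Eq. (14), per orientation
    energy = np.sqrt(F ** 2 + H ** 2)            # E_{theta_j}, Eq. (13)
    pc2d = energy.sum(axis=0) / (amp.sum(axis=(0, 1)) + eps)  # Eq. (15)
    return pc2d.mean(), pc2d                     # PCM, Eq. (16)
```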

#### **3.2. Gradient Magnitude Metric (GMM)**

The image *gradient magnitude* (GM) is computed to encode contrast information. PC and GM are complementary: they reflect different aspects of the HVS (human visual system) in assessing local image quality. The GM measures the sharpness of an image, and the perception of sharpness is related to the clarity of detail of an image. Image gradient computation is a traditional topic in image processing. Gradient operators can be expressed by convolution masks, and one of the most commonly used gradient operators is the Sobel operator. The partial derivatives of image *f*(*x*, *y*), *G<sub>x</sub>* and *G<sub>y</sub>*, along the horizontal and vertical directions using the Sobel operators are

$$\mathbf{G}\_x = \frac{1}{4} \begin{bmatrix} 1 & 0 & -1 \\ 2 & 0 & -2 \\ 1 & 0 & -1 \end{bmatrix} \ast f(\mathbf{x}, y), \quad \mathbf{G}\_y = \frac{1}{4} \begin{bmatrix} 1 & 2 & 1 \\ 0 & 0 & 0 \\ -1 & -2 & -1 \end{bmatrix} \ast f(\mathbf{x}, y) \tag{17}$$

The GM of *f*(*x*, *y*) at pixel (*x*, *y*) is defined as

$$G(x, y) = \sqrt{G\_x^2 + G\_y^2}.\tag{18}$$

The averaged GM over all pixels is called image *gradient magnitude metric* (GMM),

$$GMM = \frac{1}{MN} \sum_{(\mathbf{x}, \mathbf{y})} G(\mathbf{x}, \mathbf{y}) = \frac{1}{MN} \sum_{(\mathbf{x}, \mathbf{y})} \sqrt{G_x^2 + G_y^2}, \tag{19}$$

where *M*×*N* is the size of the image.
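A plain-NumPy sketch of Eqs. (17)-(19); edge replication at the image border is our choice, since the chapter does not specify boundary handling:

```python
import numpy as np

# Sobel masks with the 1/4 normalization of Eq. (17).
SOBEL_X = np.array([[1, 0, -1], [2, 0, -2], [1, 0, -1]]) / 4.0
SOBEL_Y = np.array([[1, 2, 1], [0, 0, 0], [-1, -2, -1]]) / 4.0

def _conv3(img, kernel):
    """3x3 true convolution (kernel flipped) with edge replication."""
    p = np.pad(img.astype(float), 1, mode="edge")
    out = np.zeros(img.shape, dtype=float)
    k = kernel[::-1, ::-1]  # flip for convolution rather than correlation
    for i in range(3):
        for j in range(3):
            out += k[i, j] * p[i:i + img.shape[0], j:j + img.shape[1]]
    return out

def gmm(img):
    """Eqs. (17)-(19): the gradient magnitude metric, i.e., the mean of the
    Sobel gradient magnitude over all pixels."""
    gx = _conv3(img, SOBEL_X)
    gy = _conv3(img, SOBEL_Y)
    return np.sqrt(gx ** 2 + gy ** 2).mean()
```

A constant image has zero GMM, while any intensity ramp or edge raises it, matching the interpretation of GMM as a sharpness measure.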

#### **3.3. Image Contrast Metric (ICM)**

An image with excellent contrast has a wide dynamic range of intensity levels and appropriate overall intensity. Both the dynamic range of intensity levels and the overall intensity distribution of an image can be obtained from its histogram, so a global contrast metric is proposed using these histogram characteristics. The histogram of an image with levels in the range [0, *N*−1] is a frequency-distribution function defined as the overall intensity distribution of an image

$$h(X_k) = n_k, \tag{20}$$

where *X<sub>k</sub>* is the *k*-th intensity level and *n<sub>k</sub>* is the number of pixels in the image having level *X<sub>k</sub>*. The probability density function (PDF) is computed by

$$P(X_k) = n_k / n, \tag{21}$$

where *n* is the total number of the pixels of the image. The dynamic range value *β* is defined as

$$\beta = \sum_{k=0}^{N-1} S(X_k), \tag{22}$$

where

$$S(X\_k) = \begin{cases} 1, & \text{if } P(X\_k) > 0 \\ 0, & \text{otherwise} \end{cases}.\tag{23}$$

The dynamic range measure *α* of the histogram is defined as

$$\alpha = \frac{\beta}{2N - \beta}, \tag{24}$$

where *α* ∈ [0,1] and a larger value of *α* means a wider dynamic range in the histogram, which leads to better contrast. The image contrast metric is defined as

$$\mathbf{C} = \alpha \sum\_{k=0}^{N-1} \frac{\mathbf{X}\_k}{N} P(\mathbf{X}\_k). \tag{25}$$

For color images, the image contrast metric is determined by both gray contrast and color contrast. Because human perception is more sensitive to luminance in contrast evaluation, we employ the *L*\* channel of the CIELAB space to evaluate the color contrast. Thus, image contrast is determined by the histogram of the gray intensity and the histogram of the color luminance *L*\* (see Fig. 1). For the gray intensity *I*, the gray contrast metric is defined as

$$C_g = \alpha_I \sum_{k=0}^{N_I - 1} \frac{I_k}{N_I} P(I_k), \tag{26}$$

where *αI* and *P*(*Ik* ) can be calculated as above for gray intensity. For *L*\* channel, the color contrast metric is

$$C_c = \alpha_c \sum_{k=0}^{N_L - 1} \frac{L_k^*}{N_L} P(L_k^*), \tag{27}$$

where *α<sub>c</sub>* and *P*(*L<sub>k</sub>*\*) can be calculated as above for the *L*\* channel. The global *image contrast metric* (ICM) is defined as

$$ICM = \sqrt{\omega_1 C_g^2 + \omega_2 C_c^2}, \tag{28}$$

where *ω*<sub>1</sub> and *ω*<sub>2</sub> are the weights of *C<sub>g</sub>* and *C<sub>c</sub>*. For simplicity, we choose *ω*<sub>1</sub> = *ω*<sub>2</sub> = 0.5. *ICM* varies within [0,1]. The calculation of the image contrast metric of a color fusion image is illustrated in Fig. 1.

**Figure 1.** Diagram of calculation of the contrast metric.
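The histogram-based contrast computation of Eqs. (20)-(28) can be sketched as follows (assuming integer-valued channels in [0, 255]; rescaling *L*\* to that range is our simplification):

```python
import numpy as np

def contrast_term(channel, levels=256):
    """Eqs. (20)-(25): histogram-based contrast of one channel whose values
    lie in [0, levels-1]."""
    hist = np.bincount(channel.ravel().astype(int), minlength=levels)  # Eq. (20)
    p = hist / channel.size                                  # Eq. (21)
    beta = np.count_nonzero(p)                               # Eqs. (22)-(23)
    alpha = beta / (2 * levels - beta)                       # Eq. (24)
    k = np.arange(levels)
    return alpha * np.sum(k / levels * p)                    # Eq. (25)

def icm(gray, lightness, w1=0.5, w2=0.5):
    """Eq. (28): combine the gray and L* contrast terms into the global ICM.
    `lightness` is assumed pre-scaled to the same integer levels as `gray`."""
    cg = contrast_term(gray)        # C_g, Eq. (26)
    cc = contrast_term(lightness)   # C_c, Eq. (27)
    return np.sqrt(w1 * cg ** 2 + w2 * cc ** 2)
```

A constant image occupies a single histogram bin, so both *β* and the weighted intensity sum are small; a full-range image pushes *α* toward 1 and yields a much larger contrast value.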

#### **3.4. Color Natural Metric (CNM)**

Given a daylight image *f*1(*x*, *y*) and a colorized image *f*2(*x*, *y*), if the colorized image is similar to the daylight image, then the colorized image is considered to be of good quality. Since humans are sensitive to hue in addition to luminance, we compare the *a*\* and *b*\* channels of the reference image with those of the colorized image using the gray relational analysis (GRA) theory (Ma *et al.*, 2005).

We first convert the two images, *f*1 and *f*2, to *L*\**a*\**b*\* space. *L<sub>i</sub>*\*(*x*,*y*), *a<sub>i</sub>*\*(*x*,*y*), and *b<sub>i</sub>*\*(*x*,*y*) denote the *L*\**a*\**b*\* values of *f<sub>i</sub>* at pixel (*x*, *y*). The gray relation coefficient between *a*<sub>1</sub>\* and *a*<sub>2</sub>\* at pixel (*x*, *y*) is defined as

$$\xi_a(x, y) = \frac{\min_i \min_j |a_1^*(i,j) - a_2^*(i,j)| + 0.5 \max_i \max_j |a_1^*(i,j) - a_2^*(i,j)|}{|a_1^*(x,y) - a_2^*(x,y)| + 0.5 \max_i \max_j |a_1^*(i,j) - a_2^*(i,j)| + \varepsilon}, \tag{29}$$

**4. Objective Evaluation Index (OEI)**

The two images are first converted into *L*\*

GM values is defined as

defined as follows

between *PC*1 and *PC*2 at pixel (*x*, *y*) is defined as

into one similarity measure, *SL* (*x*), as follows

With the four metrics defined in Section 3, a new *objective evaluation index* (OEI) is proposed to quantitatively evaluate the qualities of colorized images. Given the reference image *f*1 and the colorized image *f*2, the OEI is calcualted in two steps. First the local similarity maps of the two images are computed, and then the similarity maps are integrated into a single similarity score.

> *a*\* *b*\*

2 2

where *K*<sup>1</sup> is a positive constant. In practice, the determination of *K*<sup>1</sup> depends on the dynamic range of *PC* values. *SPC* varies within [0,1]. Similarly, the similarity measure based on the two

2 (,) (,) ( , )= , (,) (,) *PC PC x y PC x y K S xy PC x y PC x y K*

2 2

1 2 ( , ) = [ ( , )] [ ( , )] , *<sup>L</sup> PC <sup>G</sup> S xy S xy S xy* l

<sup>1</sup> max

(,) (,)

*L*

*OEI S CNM*

æ ö ç ÷ ´ ´

max

è ø

*PC x y*

*PC x y S x y*

(,)

*x y*

*x y*

å

where *λ*1 and *λ*2 are parameters to adjust the relative importance of PC and GM features.

With the aid of the similarity *SL* (*x*,*y*) at each pixel (*x*, *y*), the overall similarity between *f*1 and *f*2 can be calculated with the averaged *SL* (*x*,*y*) over all pixels. However, the image saliency (i.e., local significance) usually varies with the pixel location. For example, edges convey more crucial information than smooth areas. Specifically, a human is sensitive to phase congruent structures (Henriksson *et al.*, 2009), and thus a larger *PC*(*x*, *y*) value between *f*1 and *f*2 implies a higher impact on evaluating the similarity between *f*1 and *f*2 at location (*x*, *y*). Therefore, we use *PC*max(*x*,*y*)=*max PC*1(*x*,*y* ),*PC*2(*x*,*y*) to weigh the importance of *SL* (*x*,*y*) in formulating the overall similarity. Accordingly, the objective evaluation index (OEI) between *f*1 and *f*<sup>2</sup> is

(,) <sup>2</sup> <sup>3</sup>

g

*ICM*

g

å (38)

g

<sup>=</sup> ( ) ( ), (,)

2 (,) (,) ( , )= , (,) (,) *<sup>G</sup> G xyG xy K S xy G xy G xy K*

calculated and denoted as *PC*1 and *PC*2 for *f*1 and *f*2 images, respectively. The similarity measure

12 1

1 21

12 2

+

 l

122

where *K*2 is a positive constant. *SG* varies within [0,1]. Then, *SPC*(*x*,*y*) and *SG*(*x*,*y*) are combined

space. For *L*\*

The Objective Evaluation Index (OEI) for Evaluation of Night Vision Colorization Techniques

+

information, the *PC* maps are

http://dx.doi.org/10.5772/56948

91

+ + (35)

+ + (36)

(37)

where *ε* is a small positive constant.

The gray relation coefficient between *b*1 and *b*2 at pixel (*x*, *y*) is defined as

$$\varphi\_b^\*(\mathbf{x}, \mathbf{y}) = \frac{\underset{i}{\text{minmin}} \, |\, b\_1^\*(i, j) - b\_2^\*(i, j)| \, +0.5 \max\_{i} \max\_{j} |\, b\_1^\*(i, j) - b\_2^\*(i, j)|}{|\, b\_1^\*(\mathbf{x}, y) - b\_2^\*(\mathbf{x}, y)| \, +0.5 \max\_{i} \max\_{j} |\, b\_1^\*(i, j) - b\_2^\*(i, j)| \, +\infty}. \tag{30}$$

In the definitions of *ξa*(*x*,*y*) and *ξb*(*x*,*y*), min() and max() are operated over whole image. However, it is possible that min() and max() are operated over a small neighborhood of (*x*, *y*).

The gray rational degrees of *a*\* and *b*\* information for two images are defined as

$$R\_a = \sum\_{(\mathbf{x}, \mathbf{y})} o(\mathbf{x}, \mathbf{y}) \xi\_a(\mathbf{x}, \mathbf{y}),\tag{31}$$

$$R\_b = \sum\_{(\mathbf{x}, \mathbf{y})} o(\mathbf{x}, \mathbf{y}) \xi\_b(\mathbf{x}, \mathbf{y}), \tag{32}$$

where *ω*(*x*,*y*) is the weight of the gray rational coefficient, which satisfies

$$\sum\_{(\mathbf{x},\mathbf{y})} o(\mathbf{x},\mathbf{y}) = 1.\tag{33}$$

For simplicity, we choose *ω*(*x*,*y*)= 1 *<sup>M</sup>* <sup>×</sup> *<sup>N</sup>* where *M* and *N* are the length of vectors *x* and *<sup>y</sup>* respectively.

The *color natural metric* (CNM) is defined as

$$\text{CNN} = \sqrt{R\_a R\_b} \,\text{.}\tag{34}$$

*CNM* varies within [0,1]; the larger the CNM, the more similar the two images.

#### **4. Objective Evaluation Index (OEI)**


With the four metrics defined in Section 3, a new *objective evaluation index* (OEI) is proposed to quantitatively evaluate the quality of colorized images. Given the reference image *f*1 and the colorized image *f*2, the OEI is calculated in two steps. First, the local similarity maps of the two images are computed; then the similarity maps are integrated into a single similarity score.

The two images are first converted into *L*\* *a*\* *b*\* space. For the *L*\* information, the *PC* maps are calculated and denoted as *PC*1 and *PC*2 for *f*1 and *f*2, respectively. The similarity measure between *PC*1 and *PC*2 at pixel (*x*, *y*) is defined as

$$S\_{PC}(x,y) = \frac{2 PC\_1(x,y) PC\_2(x,y) + K\_1}{PC\_1^2(x,y) + PC\_2^2(x,y) + K\_1}, \tag{35}$$

where *K*<sup>1</sup> is a positive constant. In practice, the determination of *K*<sup>1</sup> depends on the dynamic range of *PC* values. *SPC* varies within [0,1]. Similarly, the similarity measure based on the two GM values is defined as

$$S\_G(x,y) = \frac{2 G\_1(x,y) G\_2(x,y) + K\_2}{G\_1^2(x,y) + G\_2^2(x,y) + K\_2}, \tag{36}$$

where *K*2 is a positive constant. *SG* varies within [0,1]. Then, *SPC*(*x*,*y*) and *SG*(*x*,*y*) are combined into one similarity measure, *SL*(*x*,*y*), as follows

$$S\_L(x,y) = \left[ S\_{PC}(x,y) \right]^{\lambda\_1} \left[ S\_G(x,y) \right]^{\lambda\_2}, \tag{37}$$

where *λ*1 and *λ*2 are parameters to adjust the relative importance of PC and GM features.
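Assuming the *PC* and gradient-magnitude (GM) maps have already been computed by some routine not shown here, Eqs. (35)-(37) reduce to an SSIM-style ratio applied per pixel. The sketch below is ours, not the authors' code; the default constants follow the values chosen later in this section (*K*1 = 0.85, *K*2 = 160, *λ*1 = *λ*2 = 1).

```python
import numpy as np

def feature_similarity(m1, m2, k):
    """SSIM-style per-pixel similarity between two feature maps
    (the common form of Eqs. (35) and (36))."""
    return (2.0 * m1 * m2 + k) / (m1 ** 2 + m2 ** 2 + k)

def combined_similarity(pc1, pc2, g1, g2,
                        k1=0.85, k2=160.0, lam1=1.0, lam2=1.0):
    """S_L map (Eq. (37)): product of the PC similarity (Eq. (35))
    and the gradient-magnitude similarity (Eq. (36))."""
    s_pc = feature_similarity(pc1, pc2, k1)
    s_g = feature_similarity(g1, g2, k2)
    return (s_pc ** lam1) * (s_g ** lam2)
```

Since 2*m*1*m*2 ≤ *m*1² + *m*2², each per-pixel similarity lies in (0, 1] and equals 1 exactly where the two feature maps agree.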

With the aid of the similarity *SL*(*x*,*y*) at each pixel (*x*, *y*), the overall similarity between *f*1 and *f*2 can be calculated by averaging *SL*(*x*,*y*) over all pixels. However, the image saliency (i.e., local significance) usually varies with the pixel location. For example, edges convey more crucial information than smooth areas. Specifically, humans are sensitive to phase-congruent structures (Henriksson *et al.*, 2009), and thus a larger *PC*(*x*, *y*) value between *f*1 and *f*2 implies a higher impact on evaluating the similarity between *f*1 and *f*2 at location (*x*, *y*). Therefore, we use *PC*max(*x*,*y*) = max[*PC*1(*x*,*y*), *PC*2(*x*,*y*)] to weigh the importance of *SL*(*x*,*y*) in formulating the overall similarity. Accordingly, the objective evaluation index (OEI) between *f*1 and *f*2 is defined as follows

$$\text{OEI} = \left( \frac{\sum\_{(x,y)} PC\_{\max}(x,y) \, S\_L(x,y)}{\sum\_{(x,y)} PC\_{\max}(x,y)} \right)^{\gamma\_1} \times \left( S\_{ICM} \right)^{\gamma\_2} \times \left( \text{CNM} \right)^{\gamma\_3}, \tag{38}$$


where

$$PC\_{\max}(x,y) = \max[PC\_1(x,y), PC\_2(x,y)], \tag{39}$$


$$S\_{ICM} = \frac{2\,\text{ICM}(f\_1) \times \text{ICM}(f\_2) + K\_3}{\text{ICM}(f\_1)^2 + \text{ICM}(f\_2)^2 + K\_3},\tag{40}$$

where *CNM* is previously defined, and *K*3 and *γi* (*i* = 1,2,3) are positive constants. The diagram for calculating the OEI is shown in Fig. 2. The range of *OEI* is [0,1]: the larger the *OEI* value of a colorized image, the more similar (i.e., the better) the colorized image is to the reference image. *Error pooling* refers to integrating the three components, with the tradeoff among them controlled by *γ*1, *γ*2, and *γ*3.

**Figure 2.** Diagram of calculating OEI in *L*\* *a*\* *b*\* space.

*γ*1, *γ*2, and *γ*3 are the weights of the three components in the OEI metric. Selection of *γi* is critical for the OEI calculation. The values of *γi* are decided empirically: typical values of *γ*1 and *γ*2 lie in 0.8~1.1, and *γ*3 in 0.05~0.2. *Ki* (*i* = 1,2,3) are constants that increase the metric's stability. In our experiments presented in Section 5, we chose *γ*1 = *γ*2 = 1, *γ*3 = 0.2; *K*1 = 0.85, *K*2 = 160, *K*3 = 0.001; and *λ*1 = *λ*2 = 1.
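Putting Eqs. (35)-(39) together, the OEI pooling can be sketched as below. This is an illustrative implementation of the formulas rather than the authors' code: the *PC* and GM maps are assumed to be precomputed, `s_icm` is the scalar ICM similarity of Eq. (40) computed elsewhere, `cnm` is the CNM score of Eq. (34), and the default constants are the values quoted above.

```python
import numpy as np

def oei(pc1, pc2, g1, g2, s_icm, cnm,
        gammas=(1.0, 1.0, 0.2), k1=0.85, k2=160.0):
    """OEI (Eq. (38)): PC_max-weighted pooling of the S_L map,
    combined with the ICM similarity and the CNM score."""
    s_pc = (2.0 * pc1 * pc2 + k1) / (pc1 ** 2 + pc2 ** 2 + k1)  # Eq. (35)
    s_g = (2.0 * g1 * g2 + k2) / (g1 ** 2 + g2 ** 2 + k2)       # Eq. (36)
    s_l = s_pc * s_g                    # Eq. (37) with lambda1 = lambda2 = 1
    pc_max = np.maximum(pc1, pc2)       # Eq. (39): saliency weight
    pooled = np.sum(pc_max * s_l) / np.sum(pc_max)
    g1_, g2_, g3_ = gammas
    return float((pooled ** g1_) * (s_icm ** g2_) * (cnm ** g3_))
```

With identical feature maps and unit ICM/CNM scores the index evaluates to 1; any disagreement in structure, intensity statistics, or chroma pulls it below 1.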

### **5. Experimental results and discussions**


In our experiments, five triplets of multispectral NV images (shown in Figs. 3-7; collected at Alcorn State University), color RGB, near infrared (NIR), and long wave infrared (LWIR), were colorized using the six coloring methods described in Section 2. The three-band input images are shown in Figs. 3-7a, b and c, respectively. The image resolutions and capture times are given in the figure captions. The RGB and LWIR images were taken by a FLIR SC620 two-in-one camera, which has an LWIR camera (640×480 pixel original resolution and 7.5~13 µm spectral range) and an integrated visible-band digital camera (2048×1536 pixel original resolution). The NIR images were taken by a FLIR SC6000 camera (640×512 pixel original resolution and 0.9~1.7 µm spectral range). The two cameras (SC620 and SC6000) were placed on the same fixture and turned to aim at the same location. The images were typically captured at sunset and dusk during a fall season. One exception is shown in Fig. 7, which was taken at noon.

**Figure 3.** Night-vision coloring comparison (Case# AT008 – taken at sunset time; 640×480 pixels): (a-c) Color RGB, NIR, and LWIR images, respectively; (d-f) The colorized images using channel-based color fusion of (NIR⊕LWIR), statistic-matching, and histogram-matching, respectively; (g-i) The colorized images using joint-HM, stat-match then joint-HM, and LUT-mapping, respectively. The settings in the color-mappings of (e-i) are source = (d) and target = (a). Notice that the contrasts of all color images were increased by 10%, and the brightness of (a) and (i) were increased by 10%.


**Figure 4.** Night-vision coloring comparison (Case# AT009 – taken after sunset time; 640×480 pixels): (a-c) Color RGB, NIR, and LWIR images, respectively; (d-f) The colorized images using channel-based color fusion of (NIR⊕LWIR), statistic-matching, and histogram-matching, respectively; (g-i) The colorized images using joint-HM, stat-match then joint-HM, and LUT-mapping, respectively. The settings in the color-mappings of (e-i) are source = (d) and target = (a). Notice that the contrasts of all color images were increased by 10%, and the brightness of (a) was increased by 10%.


Of course, image registration and fusion (Hil & Batchelor, 2001) were applied to the three-band images shown in Figs. 3-7, where manual alignment was employed for the RGB images shown in Figs. 5-6a since they are so dark and noisy. To better present the color images (including the daylight RGB images and the colorized NV images), contrast and brightness adjustments (as described in the figure captions) were applied. Notice that piecewise contrast stretching (Eq. (2)) was used for NIR enhancement. As referred in Eq. (1d), the fused images (shown elsewhere (Zheng & Essock, 2008)) were obtained using the *a*DWT algorithm (Zheng *et al.*, 2005). The channel-based color fusion (CBCF, defined in Eqs. (1)) was applied to the NIR and LWIR images (shown in Figs. 3-7b & c), and the results are illustrated in Figs. 3-7d. The resulting images from two-band color fusion (Figs. 3-7d) resemble natural colors, which makes scene classification easier. The paved ground appears reddish since it has strong heat radiation (at dusk time) and thus causes strong responses in the LWIR images. In the color-fusion images, the trees, buildings and grasses can be easily distinguished from the ground (parking lots) and sky. For example, the car is clearly identified in Fig. 5d, where the water area (between the ground and trees, shown in cyan color) is certainly noticeable. However, it is hard to realize any water area in the original images (Figs. 5a-c).

**Figure 5.** Night-vision coloring comparison (Case# AT012 – taken at dusk time; 640×480 pixels): (a-c) Color RGB, NIR, and LWIR images, respectively; (d-f) The colorized images using channel-based color fusion of (NIR⊕LWIR), statistic-matching, and histogram-matching, respectively; (g-i) The colorized images using joint-HM, stat-match then joint-HM, and LUT-mapping, respectively. The settings in the color-mappings of (e-i) are source = (d) and target = Fig. 8(a) due to the dark RGB image in (a). Notice that the contrasts of all color images were increased by 10%, and the brightness of (a) and (i) were increased by 20% and 10%, respectively.


All color mapping methods were applied to the five triplets and their results are presented in Figs. 3-7. The source images are the color-fusion images (Figs. 3-7d), while the target images are the color RGB images (Figs. 3-4a & Fig. 8a-b). Figs. 5-6a cannot be used as the target images since they are too dark and noisy. Figs. 3-7e show the colored images with the statistic matching (SM) method, which are more similar to the daylight pictures in contrast with the color-fusion images. The five results (Figs. 3-7e) are equivalently good, which means that the statistic matching is reliable. The histogram matching (HM) results shown in Figs. 3-7f are oversaturated, which may be more suitable for segmentation-based colorization (Zheng & Essock, 2008). The joint histogram matching (JHM) results are illustrated in Figs. 3-7g, where the mapped images are better than the color fusions but preserve much of the reddish colors (existing in the source images). The "stat-match then joint-HM" (SM-JHM) means that a joint-HM is performed with inputs of (source = the SM-colored image in Fig. 3e; target = the RGB image in Fig. 3a). The SM-JHM results are presented in Figs. 3-7h, which are sometimes better than the results from either stat-match or joint-HM (e.g., Fig. 3h). Examples of LUT-mapping colorization are given in Figs. 3-7i. Figs. 3-4i and Fig. 7i (an ideal case of LUT mapping) show impressive colors, whereas Figs. 5-6i appear noisy and distorted since the reference images (shown in Figs. 8a-b) are misaligned with the NV images (shown in Figs. 5-6). When using a LUT established in a different case at daytime (aiming at a different direction at nighttime), the more misalignment there is, the worse the LUT-colored results appear. The LUT-based colorization described in Subsection 2.6 is perhaps suitable for a surveillance application where a camera is aiming at a fixed direction.


**Figure 6.** Night-vision coloring comparison (Case# ST029 – taken at dusk time; 640×480 pixels): (a-c) Color RGB, NIR, and LWIR images, respectively; (d-f) The colorized images using channel-based color fusion of (NIR⊕LWIR), statistic-matching, and histogram-matching, respectively; (g-i) The colorized images using joint-HM, stat-match then joint-HM, and LUT-mapping, respectively. The settings in the color-mappings of (e-i) are source = (d) and target = Fig. 8(b) due to the dark RGB image in (a). Notice that the contrasts of (d-i) were increased by 10%, and (a) was increased by 20%. The brightness of (a) and (i) were increased by 20% and 10%, respectively.


**Figure 7.** Night-vision coloring comparison (Case# ST102 – taken at noon time; 640×480 pixels): (a-c) Color RGB, NIR, and LWIR images, respectively; (d-f) The colorized images using channel-based color fusion of (NIR⊕LWIR), statistic-matching, and histogram-matching, respectively; (g-i) The colorized images using joint-HM, stat-match then joint-HM, and LUT-mapping, respectively. The settings in the color-mappings of (e-i) are source = (d) and target = (a).


**Figure 8.** Color RGB images for night-vision colorization (taken before sunset time; 640×480 pixels): (a) from Case# AT002 (target of Fig. 5, AT012); (b) from Case# ST014 (target of Fig. 6, ST029). Notice that their contrasts were increased by 10%.


Visual inspection of colorized images can generally tell which one is better or the best when there are large enough differences between several versions of colorized images. For example, casual inspection may easily confirm that the top three methods are SM, SM-JHM, and LUT; that HM and JHM are poor; and that CBCF is medium. However, subjective evaluation becomes more and more difficult with a larger number of color images, and is also hard when the differences are small or diverse. In other words, it is hard for subjective evaluation to give an exact order of the six colorization methods. Let us examine the objective evaluations.

**6. Conclusions**

**Acknowledgements**

W911NF-08-1-0404.

**Author details**

**References**

313-317.

Yufeng Zheng1\*, Wenjie Dong2

1 Alcorn State University, USA

3 I-Fusion Technologies, Inc, USA

2 The University of Texas-Pan American, USA

4 US Air Force Research Laboratory, USA

In this chapter, we review six night-vision colorization techniques, a channel-based color fusion (CBCF) procedure; statistic matching (SM), histogram matching (HM), joint histogram matching (JHM), and stat-match then joint-HM (SM-JHM) method, and LUT-based ap‐ proaches. An objective evaluation metric for NV colorization, objective evaluation index (OEI), is introduced. The experimental results with five case analyses showed the order of coloriza‐ tion methods from the best to the worst: SM, SM-JHM, LUT, CBCF, HM, JHM. The order of

The Objective Evaluation Index (OEI) for Evaluation of Night Vision Colorization Techniques

http://dx.doi.org/10.5772/56948



The objective evaluations using the OEI metric defined in Eq. (14) (refer to Section 4) are presented in Table 1 (corresponding to Figs. 3-7, respectively), where the orders of metric values (1 for the smallest OEI) are given within round parentheses. Keep in mind that the larger the OEI value of a colorized image is, the better quality (i.e., the higher order number) the colorized image has. According to the OEI values in Table 1, the quality order of colorized images varies with figures (cases). To form an overall impression, the sums of the order numbers over the five cases (i.e., Figs. 3-7) are calculated and shown in the rightmost column of Table 1. The quality order of each colorization method (6 for the best) is given within curly brackets. The order of colorization methods from the best to the worst is: SM (stat-match), SM-JHM (stat-match then joint-HM), LUT, CBCF (channel-based color fusion), HM (histogram matching), JHM (joint-HM). This order sorted by OEI values is quite consistent with the order of subjective evaluations.


| Method (Plot) | Fig. 3 (AT008) | Fig. 4 (AT009) | Fig. 5 (AT012) | Fig. 6 (ST029) | Fig. 7 (ST102) | Sum {Order} |
|---|---|---|---|---|---|---|
| CBCF (d) | 0.4753 (3) | 0.5497 (3) | 0.5178 (2) | 0.5132 (4) | 0.5872 (3) | 15 {3} |
| **SM (e)** | 0.5470 (6) | 0.6022 (5) | 0.6058 (6) | 0.5529 (5) | 0.6337 (6) | **28 {6}** |
| HM (f) | 0.4519 (2) | 0.4890 (1) | 0.3587 (1) | 0.5099 (3) | 0.5736 (2) | 9 {2} |
| JHM (g) | 0.4372 (1) | 0.5250 (2) | 0.5189 (3) | 0.4674 (1) | 0.5503 (1) | 8 {1} |
| *SM-JHM (h)* | 0.5428 (5) | 0.5954 (4) | 0.5978 (5) | 0.5678 (6) | 0.6154 (4) | *24 {5}* |
| LUT (i) | 0.5148 (4) | 0.6025 (6) | 0.5238 (4) | 0.4882 (2) | 0.6322 (5) | 21 {4} |

**Table 1.** The OEI (Order) values of six color-mapping methods over five cases shown in Figs. 3-7 (The Sum & {Order} at the last column is calculated with the orders of the five cases).
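The rank-sum aggregation used in Table 1 is easy to reproduce. The sketch below (not the authors' code; the OEI values are transcribed from the table) ranks the six methods per case and sums the ranks:

```python
# OEI values per method over the five cases (Figs. 3-7), transcribed from Table 1.
oei = {
    "CBCF":   [0.4753, 0.5497, 0.5178, 0.5132, 0.5872],
    "SM":     [0.5470, 0.6022, 0.6058, 0.5529, 0.6337],
    "HM":     [0.4519, 0.4890, 0.3587, 0.5099, 0.5736],
    "JHM":    [0.4372, 0.5250, 0.5189, 0.4674, 0.5503],
    "SM-JHM": [0.5428, 0.5954, 0.5978, 0.5678, 0.6154],
    "LUT":    [0.5148, 0.6025, 0.5238, 0.4882, 0.6322],
}

def rank_sums(scores):
    """Per case, rank methods 1..6 by OEI (1 = smallest), then sum the ranks."""
    methods = list(scores)
    sums = dict.fromkeys(methods, 0)
    for case in range(len(next(iter(scores.values())))):
        for rank, m in enumerate(sorted(methods, key=lambda x: scores[x][case]), 1):
            sums[m] += rank
    return sums

sums = rank_sums(oei)
best_to_worst = sorted(sums, key=sums.get, reverse=True)
print(sums)           # {'CBCF': 15, 'SM': 28, 'HM': 9, 'JHM': 8, 'SM-JHM': 24, 'LUT': 21}
print(best_to_worst)  # ['SM', 'SM-JHM', 'LUT', 'CBCF', 'HM', 'JHM']
```

The rank sums match the rightmost column of Table 1, and sorting by them reproduces the reported ordering.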

The subjective evaluations of night vision colorization are based on casual visual inspections. More qualitative measurements, subjective evaluations (by a group of subjects), and statistical analyses will be introduced in the future. The quantitative (objective) evaluations using the objective evaluation index (OEI) require a reference (daylight) image. Thus we will continuously improve the OEI metric by relaxing the requirement of a reference image. We will further conduct more comprehensive comparisons.

#### **6. Conclusions**


In this chapter, we review six night-vision colorization techniques: a channel-based color fusion (CBCF) procedure; statistic matching (SM), histogram matching (HM), joint histogram matching (JHM), and stat-match then joint-HM (SM-JHM) methods; and LUT-based approaches. An objective evaluation metric for NV colorization, the objective evaluation index (OEI), is introduced. The experimental results with five case analyses showed the order of colorization methods from the best to the worst: SM, SM-JHM, LUT, CBCF, HM, JHM. The order of objective evaluations complies with the order of subjective evaluations.
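As a concrete illustration of the top-ranked technique, statistic matching linearly maps each channel's mean and standard deviation onto those of a daylight reference. A minimal per-channel sketch (an illustrative simplification, not the authors' implementation; the small epsilon is an added guard against a flat channel):

```python
import numpy as np

def statistic_matching(src, ref):
    """Statistic matching (SM) sketch: per channel, shift and scale the
    source image so its mean and standard deviation match the reference."""
    out = np.empty(src.shape, dtype=float)
    for c in range(src.shape[-1]):
        s = src[..., c].astype(float)
        r = ref[..., c].astype(float)
        # epsilon (an assumption) avoids division by zero on a constant channel
        out[..., c] = (s - s.mean()) / (s.std() + 1e-12) * r.std() + r.mean()
    return out
```

After the mapping, each channel of the output has (to numerical precision) the reference channel's first- and second-order statistics.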

An accurate objective metric such as the OEI will help develop, select, and/or tune a better NV colorization technique. Ideally colorized NV imagery can significantly enhance night-vision targeting by human users and will eventually lead to improved performance in remote sensing, nighttime perception, and situational awareness.

### **Acknowledgements**

This research is supported by the U. S. Army Research Office under grant number W911NF-08-1-0404.

#### **Author details**

Yufeng Zheng1\*, Wenjie Dong2, Genshe Chen3 and Erik P. Blasch4

\*Address all correspondence to: Yufeng.Zheng@R2Image.com

1 Alcorn State University, USA

2 The University of Texas-Pan American, USA

3 I-Fusion Technologies, Inc, USA

4 US Air Force Research Laboratory, USA


#### **References**

[1] Alparone, L.; Baronti, S.; Garzelli, A.; & Nencini, F. (2004). A global quality measurement of Pan-sharpened multispectral imagery, *IEEE Geosci. Remote Sens. Lett.*, 1(4), 313-317.

[2] Blasch, E.; Li, X.; Chen, G. & Li, W. (2008). Image Quality Assessment for Performance Evaluation of Image Fusion, *Proc. of 11th International Conference on Information Fusion*, Germany.

[3] Essock, E. A.; Sinai, M. J. et al. (1999). Perceptual ability with real-world nighttime scenes: image-intensified, infrared, and fused-color imagery, *Hum. Factors*, 41(3), 438-452.

[4] Fischer, S.; Sroubek, F.; Perrinet, L.; Redondo, R. & Cristóbal, G. (2007). Self-invertible 2D log-Gabor wavelets, *Int. J. Computer Vision*, 75(2), 231-246.

[5] Gabor, D. (1946). Theory of communication, *J. Inst. Elec. Eng.*, 93(3), 429-457.

[6] Gonzalez, R. C. & Woods, R. E. (2002). *Digital Image Processing* (Second Edition), Prentice Hall, ISBN: 0201180758, Upper Saddle River, NJ.

[7] Henriksson, L.; Hyvarinen, A. & Vanni, S. (2009). Representation of cross-frequency spatial phase relationships in human visual cortex, *J. Neuroscience*, 29(45), 14342-14351.

[8] Hill, D. L. G. & Batchelor, P. (2001). Registration methodology: concepts and algorithms, in *Medical Image Registration*, Hajnal, J. V.; Hill, D. L. G.; & Hawkes, D. J., Eds., Boca Raton, FL.

[9] Hogervorst, M. A. & Toet, A. (2008). Method for applying daytime colors to nighttime imagery in realtime, *Proc. SPIE 6974*, 697403.

[10] Kovesi, P. (1999). Image features from phase congruency, *Videre: J. Comp. Vis. Res.*, 1(3), 1-26.

[11] Liu, Z.; Blasch, E.; Xue, Z.; Langaniere, R.; & Wu, R. (2012). Objective Assessment of Multiresolution Image Fusion Algorithms for Context Enhancement in Night Vision: A Comparative Survey, *IEEE Trans. Pattern Analysis and Machine Intelligence*, 34(1), 94-109.

[12] Ma, M.; Tian, H. P.; & Hao, C. Y. (2005). New method to quality evaluation for image fusion using gray relational analysis, *Opt. Eng.*, 44, 087010.

[13] Malacara, D. (2002). *Color Vision and Colorimetry: Theory and Applications*, SPIE Press, Bellingham, WA.

[14] Mancas-Thillou, C. & Gosselin, B. (2006). Character segmentation-by-recognition using log-Gabor filters, *Proc. Int. Conf. Pattern Recognition*, 901-904.

[15] Morrone, M. C.; Ross, J.; Burr, D. C.; & Owens, R. (1986). Mach bands are phase dependent, *Nature*, 324(6049), 250-253.

[16] Toet, A. (2003). Natural colour mapping for multiband nightvision imagery, *Information Fusion*, 4, 155-166.

[17] Toet, A. & Hogervorst, M. A. (2012). Progress in color night vision, *Opt. Eng.*, 51(1), 010901.

[18] Toet, A. & IJspeert, J. K. (2001). Perceptual evaluation of different image fusion schemes, in: I. Kadar (Ed.), *Signal Processing, Sensor Fusion, and Target Recognition X*, The International Society for Optical Engineering, Bellingham, WA, pp. 436-441.

[19] Tsagaris, V. (2009). Objective evaluation of color image fusion methods, *Opt. Eng.*, 48, 066201.

[20] Tsagaris, V. & Anastassopoulos, V. (2006). Global measure for assessing image fusion methods, *Opt. Eng.*, 45, 026201.

[21] Varga, J. T. (1999). Evaluation of operator performance using true color and artificial color in natural scene perception (Report ADA363036), Naval Postgraduate School, Monterey, CA.

[22] Wald, L.; Ranchin, T.; & Mangolini, M. (1997). Fusion of satellite images of different spatial resolutions: assessing the quality of resulting images, *Photogramm. Eng. Remote Sens.*, 63(6), 691-699.

[23] Wang, W.; Li, J.; Huang, F.; & Feng, H. (2008). Design and implementation of log-Gabor filter in fingerprint image enhancement, *Pattern Recognit. Letters*, 29(3), 301-308.

[24] Waxman, A. M.; Gove, A. N. et al. (1996). Progress on color night vision: visible/IR fusion, perception and search, and low-light CCD imaging, *Proc. SPIE Vol. 2736*, pp. 96-107, Enhanced and Synthetic Vision 1996, Jacques G. Verly, Ed.

[25] Yuan, Y.; Zhang, J.; Chang, B.; & Han, Y. (2011). Objective quality evaluation of visible and infrared color fusion image, *Opt. Eng.*, 50(3), 033202.

[26] Zhang, L.; Zhang, L.; Mou, X. & Zhang, D. (2011). FSIM: A Feature Similarity Index for Image Quality Assessment, *IEEE Trans. on Image Processing*, 20(8), 2378-2386.

[27] Zheng, Y. (2011). A channel-based color fusion technique using multispectral images for night vision enhancement, *Proc. SPIE 8135*, 813511.

[28] Zheng, Y. (2012). An Overview of Night Vision Colorization Techniques using Multispectral Images: from Color Fusion to Color Mapping, *2012 International Conference on Audio, Language and Image Processing (ICALIP 2012)*, Shanghai, China.

[29] Zheng, Y. & Essock, E. A. (2008). A local-coloring method for night-vision colorization utilizing image analysis and image fusion, *Information Fusion*, 9, 186-199.

[30] Zheng, Y.; Essock, E. A. & Hansen, B. C. (2005). An advanced DWT fusion algorithm and its optimization by using the metric of image quality index, *Optical Engineering*, 44(3), 037003-1-12.

[31] Zheng, Y.; Dong, W.; & Blasch, E. (2012). Qualitative and quantitative comparisons of multispectral night vision colorization techniques, *Optical Engineering*, 51(8), 087004.

[32] Zheng, Y.; Reese, K.; Blasch, E.; & McManamon, P. (2013). Qualitative evaluations and comparisons of six night-vision colorization methods, *Proc. SPIE 8745*.

**Chapter 6**



## **A Trous Wavelet and Image Fusion**

Shaohui Chen


Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/56947

#### **1. Introduction**

#### **1.1. Introduction to the à trous wavelet**

In 1992, Mallat and Zhong designed a fast algorithm for the orthogonal wavelet transform (OWT) of a discrete signal *f*0(*x*) having finite energy, by filtering at each level with a pair of a low-pass filter *h*(*n*) and a high-pass filter *g*(*n*). For the original image *A*0, the OWT can be achieved as:

$$\begin{cases} A_r(i,j) = \sum_{m,n \in \mathbb{Z}} h(m)h(n)A_{r-1}(2i-m, 2j-n) \\ D_r^1(i,j) = \sum_{m,n \in \mathbb{Z}} h(m)g(n)A_{r-1}(2i-m, 2j-n) \\ D_r^2(i,j) = \sum_{m,n \in \mathbb{Z}} g(m)h(n)A_{r-1}(2i-m, 2j-n) \\ D_r^3(i,j) = \sum_{m,n \in \mathbb{Z}} g(m)g(n)A_{r-1}(2i-m, 2j-n) \end{cases} \tag{1}$$

The reconstruction can be achieved by the inverse OWT (IOWT) as:

$$\begin{aligned} A_{r-1}(i,j) &= 4 \sum_{m,n \in \mathbb{Z}} \tilde{h}(m)\tilde{h}(n) A_r\!\left[\frac{i-m}{2}, \frac{j-n}{2}\right] + 4 \sum_{m,n \in \mathbb{Z}} \tilde{h}(m)\tilde{g}(n) D_r^1\!\left[\frac{i-m}{2}, \frac{j-n}{2}\right] \\ &+ 4 \sum_{m,n \in \mathbb{Z}} \tilde{g}(m)\tilde{h}(n) D_r^2\!\left[\frac{i-m}{2}, \frac{j-n}{2}\right] + 4 \sum_{m,n \in \mathbb{Z}} \tilde{g}(m)\tilde{g}(n) D_r^3\!\left[\frac{i-m}{2}, \frac{j-n}{2}\right] \end{aligned} \tag{2}$$

© 2013 Chen; licensee InTech. This is a paper distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

In (1) and (2), *r* = 1, 2, …, *N* denotes the decomposition level, and *h̃*(*m*) and *g̃*(*n*) are the conjugate filters of *h*(*n*) and *g*(*n*). *Ar* denotes the low-frequency component of *A*0 in the horizontal and vertical directions. Similarly, *Dr*¹, *Dr*², and *Dr*³ respectively denote the horizontal low-frequency and vertical high-frequency component, the horizontal high-frequency and vertical low-frequency component, and the horizontal high-frequency and vertical high-frequency component at resolution level *r*. The high-frequency components represent the detail and edge information, while the low-frequency component represents the coarse information.


As a simple example, the pair of low-pass filter *h*(*n*) and high-pass filter *g*(*n*) is given by *h*(*n*) = [0.7071, 0.7071], *g*(*n*) = [-0.7071, 0.7071] (the Haar filters).
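For this Haar pair, one analysis/synthesis level of (1)-(2) reduces to sums and differences of 2×2 pixel blocks. The following is a minimal sketch (not the chapter's code; boundary handling and the sign/indexing conventions are simplified), illustrating both the halved resolution of each subband and perfect reconstruction:

```python
import numpy as np

def haar_owt(A0):
    """One analysis level of Eq. (1) with the Haar pair h=[0.7071, 0.7071],
    g=[-0.7071, 0.7071], written as 2x2 block sums and differences.
    Each output subband has half the resolution in each direction."""
    p, q = A0[0::2, 0::2], A0[0::2, 1::2]
    r, s = A0[1::2, 0::2], A0[1::2, 1::2]
    A  = (p + q + r + s) / 2   # low-frequency approximation
    D1 = (p - q + r - s) / 2   # detail subband
    D2 = (p + q - r - s) / 2   # detail subband
    D3 = (p - q - r + s) / 2   # detail subband
    return A, D1, D2, D3

def haar_iowt(A, D1, D2, D3):
    """Synthesis step in the spirit of Eq. (2): re-interleave the subbands."""
    m, n = A.shape
    A0 = np.empty((2 * m, 2 * n))
    A0[0::2, 0::2] = (A + D1 + D2 + D3) / 2
    A0[0::2, 1::2] = (A - D1 + D2 - D3) / 2
    A0[1::2, 0::2] = (A + D1 - D2 - D3) / 2
    A0[1::2, 1::2] = (A - D1 - D2 + D3) / 2
    return A0

img = np.arange(16.0).reshape(4, 4)
subbands = haar_owt(img)
assert np.allclose(haar_iowt(*subbands), img)     # perfect reconstruction
assert sum(b.size for b in subbands) == img.size  # same number of pixels overall
```

The last assertion checks the property stated below: the complete decomposition produces the same number of pixels as the original image.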

The OWT is a popular method used for fusing multisensor images. The OWT decomposes an image with a wavelet basis according to a pyramid scheme. The resolution is reduced by one-half at each level by subsampling the data by two. One low-frequency component and horizontal, vertical, and diagonal detail components are produced at each level. The complete decomposition produces the same number of pixels as the original image.

The OWT can be used to improve the quality of the fused image. However, some limitations exist: 1) the OWT applies only to discrete images whose sizes are powers of two, because the resolution is halved at each level, so it is not possible to fuse images of arbitrary size; 2) pixel-by-pixel analysis is not possible since the data are decimated at each resolution, so the evolution of a dominant feature cannot be followed through the levels; 3) no satisfactory rule guaranteeing good fusion quality with the OWT exists (Chibani and Houacine, 2003).

For the OWT, the down-sampled multiresolution analysis does not preserve translation invariance, *i.e.* a translation of the original signal does not necessarily imply a translation of the corresponding wavelet coefficients. Therefore, wavelet coefficients generated by an image discontinuity could disappear arbitrarily. This nonstationarity in the representation is a direct consequence of the downsampling operation. In order to preserve this property, the stationary wavelet transform was introduced (Garzelli 2002). The redundant wavelet transform (RWT) overcomes the limits of the OWT and allows great flexibility in defining fusion rules. The RWT can be computed with the *à trous* (holes) algorithm as:

$$f_j(\mathbf{x}) = \sum_n h(n) f_{j-1}(\mathbf{x} + n2^{j-1}); \quad j = 1, \dots, J \tag{3}$$

$$\omega_j(\mathbf{x}) = f_{j-1}(\mathbf{x}) - f_j(\mathbf{x}) = \sum_n g(n) f_{j-1}(\mathbf{x} + n2^{j-1}); \quad j = 1, \dots, J \tag{4}$$

The original signal can be reconstructed by adding the set of wavelet coefficients for all scales with the last approximation scale *fJ* (*x*) as

$$f_0(\mathbf{x}) = f_J(\mathbf{x}) + \sum_{j=1}^{J} \omega_j(\mathbf{x}) \tag{5}$$

The RWT of an image is accomplished by separate filtering along rows and columns, respectively. Specifically, a single wavelet plane is produced at each scale by subtraction of two successive approximations, without decimation. Hence, the wavelet and approximation planes have the same dimensions as the original image.

The scaling function commonly has a B3 cubic spline profile, and its use leads to a convolution with a 5×5 mask:

$$
\frac{1}{256} \begin{bmatrix} 1 & 4 & 6 & 4 & 1 \\ 4 & 16 & 24 & 16 & 4 \\ 6 & 24 & 36 & 24 & 6 \\ 4 & 16 & 24 & 16 & 4 \\ 1 & 4 & 6 & 4 & 1 \end{bmatrix}
$$
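Equations (3)-(5) can be sketched directly: smooth with the separable B3-spline kernel whose taps are spread apart by 2^(j-1) holes, take each wavelet plane as the difference of two successive approximations, and reconstruct by summation. A minimal sketch assuming periodic (wrap-around) boundary handling, not the authors' implementation:

```python
import numpy as np

def atrous_planes(f0, J=3):
    """A trous RWT of Eqs. (3)-(5): undecimated smoothing with the separable
    B3-spline kernel; wavelet planes are differences of successive
    approximations, so every plane keeps the input's size."""
    # 1D B3 kernel; its outer product gives the 5x5 mask shown above
    h = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0
    def smooth(x, step):
        for axis in (0, 1):  # separable row/column filtering
            acc = np.zeros_like(x)
            for k, c in zip((-2, -1, 0, 1, 2), h):
                acc += c * np.roll(x, k * step, axis=axis)  # periodic boundary (an assumption)
            x = acc
        return x
    planes, f = [], f0.astype(float)
    for j in range(1, J + 1):
        fj = smooth(f, 2 ** (j - 1))   # Eq. (3): kernel with 2^(j-1) holes
        planes.append(f - fj)          # Eq. (4): w_j = f_{j-1} - f_j
        f = fj
    return planes, f                   # wavelet planes and the approximation f_J

img = np.arange(64.0).reshape(8, 8)
planes, fJ = atrous_planes(img, J=3)
assert np.allclose(fJ + sum(planes), img)          # Eq. (5): exact reconstruction
assert all(p.shape == img.shape for p in planes)   # planes same size as the input
```

Because each plane is defined as a difference of approximations, the reconstruction in Eq. (5) is exact by a telescoping sum, whatever smoothing kernel is used.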


The RWT method is based on the fact that, in the RWT decomposition, the images are successive versions of the original image at increasing scales. Thus, the first RWT planes of the high-resolution panchromatic image contain spatial information that is not present in the multispectral image. RWT-based image fusion can be carried out using a substitution method or an additive method. In the wavelet substitution method, some of the RWT planes of the multispectral image are substituted by the corresponding RWT planes of the panchromatic image. In the additive method, the RWT planes of the panchromatic image are added to the multispectral image or to the intensity component of the multispectral images.

In the substitution method, the RWT planes of the multispectral image are discarded and substituted by the corresponding planes of the panchromatic image. In the additive method, by contrast, all the spatial information in the multispectral image is preserved, and the detail information from both sensors is used. The main difference between adding the panchromatic RWT planes to the multispectral images and adding them to the intensity component is that in the first case high-frequency information is added to each multispectral band, while in the latter the high-frequency information modifies only the intensity. Thus, from a theoretical point of view, adding to the intensity component is a better choice than adding to each multispectral band (Núñez et al., 1999).
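The two fusion rules can be sketched as follows; `wavelet_planes` is a minimal stand-in à trous decomposition with periodic boundaries (an assumption), not the authors' implementation:

```python
import numpy as np

def wavelet_planes(f, J=2):
    """Minimal stand-in a trous decomposition (periodic boundaries):
    returns J wavelet planes and the final approximation."""
    h = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0  # separable B3 kernel
    def smooth(x, step):
        for axis in (0, 1):
            acc = np.zeros_like(x)
            for k, c in zip((-2, -1, 0, 1, 2), h):
                acc += c * np.roll(x, k * step, axis=axis)
            x = acc
        return x
    planes, cur = [], f.astype(float)
    for j in range(1, J + 1):
        nxt = smooth(cur, 2 ** (j - 1))
        planes.append(cur - nxt)
        cur = nxt
    return planes, cur

def fuse_additive(ms_band, pan, J=2):
    """Additive rule: keep the full MS band, inject the pan detail planes."""
    pan_planes, _ = wavelet_planes(pan, J)
    return ms_band + sum(pan_planes)

def fuse_substitution(ms_band, pan, J=2):
    """Substitution rule: discard the MS detail planes, use pan's instead."""
    _, ms_approx = wavelet_planes(ms_band, J)
    pan_planes, _ = wavelet_planes(pan, J)
    return ms_approx + sum(pan_planes)
```

`fuse_additive` applied to the intensity component rather than to each band implements the variant the text recommends; only the fused intensity then carries the injected high-frequency detail.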

#### **2. Multivalued wavelet transform**

#### **2.1. Feature space**

A remote sensing image carries information by sampling a real-valued function of space-time describing the observed earth's surface. The digital number values of a remote sensing image have multifarious meanings, which include fractal geometry (Liu and Li 1997), raggedness of the ground surface (Liu 2000), inner specialties (Eskicioglu and Fisher 1995), definition and contrast (Lu and Healy, Jr. 1994), and edge- and boundary-dependent shape segmentation (Nikolov *et al.* 2000). They are conveyed by the grey-values, the abstracted spectral reflectance, statistical elements (*e.g.*, mean and variance), the mutual relationship between neighborhood pixels, and the grey-values of the same object, respectively. In the following text, these statistical attributes of the original image (*I*) are dissected into seven representative features with pseudo-formulae.

1. Setover: Setover (*S*) is an important connection between the specific observation of grey-value fluctuation and the usual intensity stability. It balances the total oscillation around the center by the absolute bias between each grey-value and the mean *μI*. Simultaneously, it improves the confidence and sensitivity to locate abnormality by removing *μI*.

$$S = \left| I - \mu\_I \right| \tag{6}$$


2. Visibility: Visibility (*V*) is defined, inspired by the human visual system (Li *et al.* 2002), with *μI* and the standard deviation *σI*. Each of its elements is a contribution rate scaling the local variation. It is equivalent to the deep projection of the corresponding setover onto *σI*.

$$V = \left(\frac{I - \mu_I}{\sigma_I}\right)^2 \tag{7}$$

3. Flat: The grey-values of a remote sensing image indirectly record the reflectance of the scanned groundcover measured by the surveying device. In order to eliminate the possible influence of sunshine, namely the average intensity, flat (*F*) is defined by dividing each grey-value by *μI*.

$$F = \frac{I}{\mu_I} \tag{8}$$

4. Gradient: Gradient (*G*) is pictured by the spatial frequency (Eskicioglu and Fisher 1995), following from the fact that the relationship between contiguous grey-values usually implies change. It is the manner in which grey-values switch to their neighbors and weighs the overall activity level of the image.

$$G = \left|\frac{\partial I}{\partial m} + \frac{\partial I}{\partial n}\right| \tag{9}$$

*m* and *n* denote the row and column of the image *I*.

5. Contrast: Contrast (*C*) is another ratio of the difference between the grey-values of the current pixel and the background to *μI*, for magnifying the maximum likelihood of variation-dependent identification. A high correlation exists between the contrast and the visibility of an image (Li *et al.* 2002).

$$C = \left|\frac{I - \mu_I}{\mu_I}\right| \tag{10}$$

6. Definition: In order to find out where and how the image changes, definition (*D*) is defined with the minimum *mi* of all grey-values, the current grey-value, and the total deflection *δI*. Definition predicates that the more abrupt the change is, the clearer the feature of the image becomes.

$$D = \left|\frac{I - m_i}{\delta_I}\right| \tag{11}$$

$$\delta_I^2 = \frac{1}{MN} \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} \left(I(x,y) - m_i\right)^2 \tag{12}$$

7. Curvature: Curvature (*U*) is a ruler of the deflection extent, and it is rewarded by increasing the accuracy of smoothness or roughness recognition; on the other hand, it is an indicator of salient information that will actually guide the variation finder (Chakraborty *et al.* 1995).

$$U = \left|\frac{I - m_i}{m_a - m_i}\right| \tag{13}$$

*ma* is the maximum grey-value.

Apparently, all features are cognate with each other; in other words, when one is high or goes down, so the others appear. Subsequently, a feature vector formed orderly from the above seven features can be considered as a paradigm in a mathematical structure called feature space. It is evident that this representation space is beneficial to image processing and analysis technologies in heightening the precision of significance verdicts, by replacing the original image with the feature vector as follows:

$$I_0 = \begin{bmatrix} S & V & F & G & C & D & U \end{bmatrix} \tag{14}$$

raggedness of ground surface (Liu 2000), inner specialties (Eskicioglu and Fisher 1995), definition and contrast (Lu and Healy, Jr. 1994), and edge and boundary-dependent shape segmentation (Nikolov *et al.* 2000). They are displayed by the grey-values, the abstracted spectral reflectance, statistical elements, *e.g.*, mean and variance, the mutual relationship between neighborhood pixels, and grey-values of the same object, respectively. In the follow‐ ing text, these statistical attributes of the original image (*I*) are dissected into seven represen‐

1. Setover: Setover (*S*) is an important connection between the specific observation of greyvalue fluctuation and the usual intensity stability. It balances the total oscillation around the

2. Visibility: Visibility (*V*) is defined inspired from the human visual system (Li *et al.* 2002) with

2 *I I*

m

3. Flat: The grey-values of a remote sensing image indirectly memorize the reflectance of the scanned groundcover by surveying device. In order to eliminate the possible influence of sunshine, namely the average intensity, flat (*F*) is defined according as each grey-value is

> *I <sup>I</sup> <sup>F</sup>* m

4. Gradient: Gradient (*G*) is pictured by the spatial frequency (Eskicioglu and Fisher 1995) following from the fact that the relationship between contiguous grey-values usually implies change. It is the manner that grey-values switch to their neighbors and weighs the overall

> *I I <sup>G</sup> m n* ¶ ¶ = +

s

æ ö - <sup>=</sup> ç ÷ è ø . Simultaneously it

(7)

.

.

(6)

<sup>=</sup> (8)

¶ ¶ (9)

. Its each element is the contributive rate scaling local variety.

center by the absolute bias between each grey-value and the mean *μI*

improves the confidence and sensitivity to locate abnormity by removing *μI*

It is equivalent to the deep projection of the corresponding setover onto *σ<sup>I</sup>*

*<sup>I</sup> <sup>V</sup>*

*<sup>I</sup> S I* = m

tative features with pseudo-formulae.

106 New Advances in Image Fusion

*μI* and the standard deviation *σ<sup>I</sup>*

divided by *μI*

.

activity level of image.

*m* and *n* denote the row and column of the image *I*.
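The seven features above can be computed directly as per-pixel maps. A minimal NumPy sketch, assuming global image statistics for *μI*, *σI*, *mi* and *ma* (the text does not fix the support over which they are taken) and using `np.gradient` as a stand-in for the partial differences of Eq. (9):

```python
import numpy as np

def feature_vector(I, eps=1e-12):
    """Compute the seven statistical features of Eqs. (6)-(13) as
    per-pixel maps over a grey-level image I (2-D float array).
    Global statistics are an assumption made here for simplicity."""
    I = I.astype(float)
    mu = I.mean()                                   # mu_I
    sigma = I.std() + eps                           # sigma_I
    mi, ma = I.min(), I.max()                       # m_i, m_a

    S = np.abs(I - mu)                              # (6)  setover
    V = ((I - mu) / sigma) ** 2                     # (7)  visibility
    F = I / (mu + eps)                              # (8)  flat
    gm = np.gradient(I, axis=0)                     # dI/dm (rows)
    gn = np.gradient(I, axis=1)                     # dI/dn (columns)
    G = np.abs(gm + gn)                             # (9)  gradient
    C = np.abs(I - mu) / (mu + eps)                 # (10) contrast
    delta = np.sqrt(np.mean((I - mi) ** 2)) + eps   # (12) total deflection
    D = np.abs(I - mi) / delta                      # (11) definition
    U = np.abs(I - mi) / (ma - mi + eps)            # (13) curvature
    return np.stack([S, V, F, G, C, D, U])          # (14) feature vector I_0
```

Stacking the seven maps in order yields exactly the feature vector of Eq. (14).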

Apparently, all features are cognate with one another; in other words, when one rises or falls, the others follow. Subsequently, a feature vector formed orderly from the above seven features can be considered as a paradigm in a mathematical structure called the feature space. It is evident that this representation space benefits image processing and analysis technologies by heightening the precision of significance verdicts when the original image is replaced with the feature vector as follows:

$$I\_0 = \left[ \begin{array}{c} \text{S } V \text{ } F \text{ } G \text{ } \text{C } D \text{ } U \end{array} \right] \tag{14}$$

#### **2.2. Multivalued wavelet transform**

The multivalued wavelet transform (MWT) employed can be performed by applying the RWT to each feature of *I*0 as

$$\begin{aligned} I\_{j}^{'}(x, y) &= \sum\_{n} h(n)\, I\_{j-1}(x + n 2^{j-1}, y) \\ I\_{j}(x, y) &= \sum\_{n} h(n)\, I\_{j}^{'}(x, y + n 2^{j-1}) \\ \omega\_{j}(x, y) &= I\_{j-1}(x, y) - I\_{j}(x, y); \quad j = 1, \cdots, J \end{aligned} \tag{15}$$


The original feature vector *I*0 can be rebuilt perfectly as

$$I\_0(x, y) = I\_J(x, y) + \sum\_{j=1}^{J} \omega\_j(x, y) \tag{16}$$
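The decomposition of Eq. (15) and the reconstruction of Eq. (16) can be sketched as follows. This is a minimal version assuming the B3-spline kernel h = (1/16, 1/4, 3/8, 1/4, 1/16) used later in Section 3.1 and periodic boundary handling via `np.roll`, which the text does not specify; perfect reconstruction holds regardless of the boundary rule:

```python
import numpy as np

H = np.array([1, 4, 6, 4, 1], dtype=float) / 16.0  # B3-spline kernel h(n)

def atrous_decompose(I, J):
    """A trous (undecimated) wavelet decomposition of Eq. (15):
    separable row/column filtering with holes 2^(j-1); the detail
    planes are w_j = I_{j-1} - I_j.  Returns (I_J, [w_1, ..., w_J])."""
    planes, cur = [], I.astype(float)
    for j in range(1, J + 1):
        step = 2 ** (j - 1)
        nxt = cur
        for axis in (0, 1):                        # rows, then columns
            acc = np.zeros_like(nxt)
            for n, h in zip(range(-2, 3), H):
                # acc[x] += h(n) * nxt[x + n*step] (periodic borders)
                acc += h * np.roll(nxt, -n * step, axis=axis)
            nxt = acc
        planes.append(cur - nxt)                   # w_j
        cur = nxt
    return cur, planes

def atrous_reconstruct(I_J, planes):
    """Perfect reconstruction of Eq. (16): I_0 = I_J + sum_j w_j."""
    return I_J + sum(planes)
```

Because the detail planes telescope, summing them with the residual plane recovers the input exactly.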

For fusing one multispectral (*T*) image and one panchromatic (*P*) image, the *T* image is first resampled to the pixel size of the *P* image. The fuser that produces the fused image (*F*) is summarized as follows:

$$\omega\_{F}^{j}(x, y) = \begin{cases} \omega\_{P}^{j}(x, y) & E(x, y) > 0 \\ \omega\_{T}^{j}(x, y) & E(x, y) < 0 \\ \left( \omega\_{P}^{j}(x, y) + \omega\_{T}^{j}(x, y) \right) / 2 & E(x, y) = 0 \end{cases} \tag{17}$$

where *E*(*x*, *y*) denotes the value of the electing map at position (*x*, *y*), and *j* is the decomposition level.
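The voting and electing rule of Eq. (17) is a per-pixel selection between the two coefficient planes. A minimal sketch, where the electing map `E` is taken as given since its construction is not detailed in this excerpt:

```python
import numpy as np

def elect_fuse(wP, wT, E):
    """Voting/electing fusion rule of Eq. (17), applied at one
    decomposition level j: take the panchromatic coefficient where
    E > 0, the multispectral one where E < 0, and the average of the
    two where E == 0."""
    wF = np.where(E > 0, wP, wT)              # E > 0 -> wP, E < 0 -> wT
    return np.where(E == 0, 0.5 * (wP + wT), wF)
```

The rule is applied independently at each level before the inverse transform produces *F*.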

#### **3. Example**

#### **3.1. Fusing QuickBird images using à trous wavelet**

The raw images are downloaded from http://studio.gge.unb.ca/UNB/images. These images were acquired by a commercial satellite, QuickBird, which collects one 0.7 m resolution panchromatic band (450-900 nm) and blue (450-520 nm), green (520-600 nm), red (630-690 nm), and near-infrared (760-900 nm) bands at 2.8 m resolution. The QuickBird data set was taken over the Pyramid area of Egypt in 2002. Test images of size 1024 by 1024 at 0.7 m resolution are cut from the raw images and used as the HRPI and LRMIs. Fig. 1(a) displays the LRMIs as a color composite where the red, green, and blue bands are mapped into the RGB color space. The HRPI is shown in Fig. 1(b). The near-infrared band is not shown because of the limited space in this paper, although the images were processed and numerically evaluated. The study area is composed of various features such as roads, buildings, and trees, ranging in size from less than 5 m up to 50 m. It is obvious that the HRPI has better spatial resolution than the LRMIs, and more details can be found in the HRPI. Before the image fusion, the raw LRMIs were resampled to the same pixel size as the HRPI in order to perform image registration.


The resolution ratio between the QuickBird HRPI and the LRMIs is 1:4. Therefore, when performing the à trous based fusion algorithm, the à trous filter 2<sup>−1/2</sup>(1/16, 1/4, 3/8, 1/4, 1/16), together with a decomposition level of two, is employed to extract the high-frequency information of the HRPI. Fused images are shown in Fig. 1(c).
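The injection step described here can be sketched as an additive wavelet fusion in the spirit of Núñez et al. (1999): the kernel and the two-level decomposition match the text, while the periodic border handling and the plain per-band additive injection are simplifying assumptions of this sketch.

```python
import numpy as np

H = np.array([1, 4, 6, 4, 1], dtype=float) / 16.0  # a trous kernel

def smooth(img, step):
    """Separable a trous smoothing with holes `step` (periodic borders)."""
    out = img.astype(float)
    for axis in (0, 1):
        acc = np.zeros_like(out)
        for n, h in zip(range(-2, 3), H):
            acc += h * np.roll(out, -n * step, axis=axis)
        out = acc
    return out

def awt_pansharpen(pan, ms_up, levels=2):
    """Additive wavelet fusion: extract `levels` detail planes from
    the HRPI and add them to an (already resampled) LRMI band."""
    cur, detail = pan.astype(float), 0.0
    for j in range(1, levels + 1):
        nxt = smooth(cur, 2 ** (j - 1))
        detail += cur - nxt                 # accumulate w_j of the HRPI
        cur = nxt
    return ms_up + detail                   # HRMI = LRMI + pan details
```

With a spatially flat panchromatic image the extracted detail is zero, so the multispectral band passes through unchanged, which is a quick sanity check on the scheme.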

**Figure 1.** (a) the original LRMIs at 2.8 m resolution level; (b) the HRPI at 0.7 m resolution level; (c) the HRMIs produced from the AWT method

Visual inspection provides a comprehensive impression of image clarity and of the similarity between the original and fused images (Wang et al., 2005). By visually comparing all the HRMIs (Fig. 1(c)) with the LRMI (Fig. 1(a)), it is apparent that the spatial resolutions of the HRMIs are much higher than that of the LRMI. Some small spatial structure details, such as edges and lines, which are not discernible in the LRMI, can be identified individually in each of the HRMIs. Building corners, holes, and textures are much sharper in Fig. 1(c) than in Fig. 1(a) and can be seen as clearly as in Fig. 1(b). This means that the fusion method can improve the spatial quality of the LRMI during the fusion process.


#### **3.2. Fusing TM and SPOT images using multivalued wavelet transform**

In this section, three TM images (TM3 = Red, TM4 = Near-Infrared, TM5 = Infrared) of 171×171 pixels and one SPOT image covering 5120×5120 m<sup>2</sup> are fused using the MWT. For the MWT fuser, the three TM images are first interpolated to a 10 m pixel size; the SPOT image, the TM image, and their feature sequences are decomposed with the RWT into three levels, and then the voting and electing fuser is applied from the first to the third level. Figures 2(a), 2(b), and 2(c) exhibit the original TM image as a colour composite where TM3, TM4 and TM5 are coded in blue, green and red, the SPOT image, and the fused image, respectively.

Compared visually with the original TM image, the spatial discernment of the fused images for the pair of fusers is undoubtedly better. Some small features, such as edges and lines, which are not interpretable in the original TM image, can be identified individually in the fused images. Other large features, such as lakes, rivers and blocks, are much sharper than those in the original TM image. These signify that the fuser can assimilate spatial information from the SPOT image. First, with regard to colour, figure 2(c) shows fewer retained colours than figure 2(d), and recovery of the original colours is necessary for correct thematic mapping (Chibani and Houacine 2002). For instance, in figure 2(c), all of the green colours shown in the lower left part of figure 2(a) disappear. Second, with regard to clarity, a field of 'spider-web' shape in the left-of-centre part of figure 2(c) displays a 'salt-and-granule' face.

**Figure 2.** (a) The original TM images at 30 m resolution. (b) The SPOT image at 10 m resolution. (c) The fused images at 10 m resolution.

### **Author details**

Shaohui Chen\*

Address all correspondence to: chensh@igsnrr.ac.cn

Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sci‐ ences, Beijing, China

#### **References**

[1] Chakraborty, A., Worring, M., and Duncan, J. S., 1995, On multi-feature integration for deformable boundary finding. Proceedings of the Fifth International Conference on Computer Vision, Cambridge, MA, USA, 20-23 June 1995 (Washington: IEEE Computer Society Publications), 846-851.

[2] Chen, S. H., Su, H. B., and Zhang, R. H., 2008, Feature space and measure metric for fusing multisensor images. International Journal of Remote Sensing, 29, 3257-3270.

[3] Chen, S. H., Su, H. B., Zhang, R. H., and Tian, J., 2008, Fusing remote sensing images using à trous wavelet transform and empirical mode decomposition. Pattern Recognition Letters, 29, 330-342.

[4] Chibani, Y., and Houacine, A., 2002, The joint use of IHS transform and redundant wavelet decomposition for fusing multispectral and panchromatic images. International Journal of Remote Sensing, 23, 3821-3833.

[5] Chibani, Y., and Houacine, A., 2003, Redundant versus orthogonal wavelet decomposition for multisensor image fusion. Pattern Recognition, 36, 879-887.

[6] Eskicioglu, A. M., and Fisher, P. S., 1995, Image quality measures and their performance. IEEE Transactions on Communications, 43, 2959-2965.

[7] Garzelli, A., 2002, Possibilities and limitations of the use of wavelets in image fusion. Proceedings of IEEE International Geoscience and Remote Sensing Symposium, 24-28 June 2002, Toronto, Ontario, Canada (Washington: IEEE Inc.), 1, 66-68.

[8] Li, S. T., Kwok, J. T., and Wang, Y. N., 2002, Multifocus image fusion using artificial neural networks. Pattern Recognition Letters, 23, 985-997.

[9] Liu, J. G., 2000, Smoothing filter-based intensity modulation: a spectral preserve image fusion technique for improving spatial details. International Journal of Remote Sensing, 21, 3461-3472.

[10] Liu, Y. X., and Li, Y. D., 1997, New approaches of multifractal image analysis. Proceedings of International Conference on Information, Communications and Signal Processing, Singapore, 9-12 Sept. 1997 (New York: IEEE Publications), 2, 970-974.

[11] Lu, J., and Healy, Jr., D. M., 1994, Contrast enhancement via multiscale gradient transformation. Proceedings of IEEE International Conference on Image Processing, Texas, USA, 13-16 Nov. 1994 (CA, USA: IEEE Computer Society Publications), 2, 482-486.

[12] Mallat, S. G., and Zhong, S., 1992, Characterization of signals from multiscale edges. IEEE Transactions on Pattern Analysis and Machine Intelligence, 14, 710-732.

[13] Nikolov, S. G., Bull, D. R., and Canagarajah, C. N., 2000, 2-D image fusion by multiscale edge graph combination. Proceedings of the Third International Conference on Information Fusion, Paris, France, 10-13 July 2000 (CA, USA: International Society of Information Fusion Publications), 1, MOD3/16-MOD3/22.

[14] Núñez, J., Otazu, X., et al., 1999, Multiresolution-based image fusion with additive wavelet decomposition. IEEE Transactions on Geoscience and Remote Sensing, 37, 1204-1211.

[15] Wang, Z. J., Ziou, D., Armenakis, C., Li, D. R., and Li, Q. Q., 2005, A comparative analysis of image fusion methods. IEEE Transactions on Geoscience and Remote Sensing, 43, 1391-1402.


### **Chapter 7**

## **Image Fusion Based on Shearlets**

Miao Qiguang, Shi Cheng and Li Weisheng

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/56945

### **1. Introduction**


Image decomposition is important to image fusion and affects the quality of information extraction, and even the whole fusion quality. Wavelet theory has been developed since the beginning of the last century. It was first applied to signal processing in the 1980s [1], and over the past decade it has been recognized as having great potential in image processing applications, as well as in image fusion [2]. Wavelet transforms are more useful than Fourier transforms and are efficient in dealing with one-dimensional piecewise-smooth signals [3-5]. However, their limited directionality means they do not perform well for multidimensional data. Images contain sharp transitions such as edges, and wavelet transforms are not optimally efficient in representing them.

Recently, a theory for multidimensional data called multi-scale geometric analysis (MGA) has been developed. Many MGA tools have been proposed, such as the ridgelet, curvelet, bandelet, and contourlet [6-9]. These new MGA tools provide higher directional sensitivity than wavelets. Shearlets, a new approach proposed in 2005, not only possess all of the above properties but are also equipped with a rich mathematical structure, similar to wavelets, which is associated with a multiresolution analysis. The shearlets form a tight frame at various scales and directions, and are optimally sparse in representing images with edges. Only curvelets have properties similar to shearlets [10-14]. But the construction of curvelets is not built directly in the discrete domain, and it does not provide a multiresolution representation of the geometry. The decomposition of shearlets is similar to that of contourlets; however, while the contourlet transform consists of an application of the Laplacian pyramid followed by directional filtering, for shearlets the directional filtering is replaced by a shear matrix. An important advantage of the shearlet transform over the contourlet transform is that there are no restrictions on the number of directions [15-19].

© 2013 Qiguang et al.; licensee InTech. This is a paper distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

In recent years, the theory of shearlets as applied to image processing has been studied gradually. At present, the applications of shearlets are mainly in image denoising, sparse image representation [20] and edge detection [21, 22]. Their applications in image fusion are still being explored.

#### **2. Shearlets [12, 20]**

#### **2.1. The theory of Shearlets**

In dimension *n* =2, the affine systems with composite dilations are defined as follows.

$$A\_{AS}(\psi) = \left\{ \psi\_{j,l,k}(x) = |\det A|^{j/2}\, \psi(S^{l} A^{j} x - k) : j, l \in \mathbb{Z}, k \in \mathbb{Z}^{2} \right\} \tag{1}$$


where *ψ* ∈ *L*<sup>2</sup>(ℝ<sup>2</sup>), *A* and *S* are both 2×2 invertible matrices, and |det *S*| = 1. The elements of this system are called composite wavelets if *AAS*(*ψ*) forms a tight frame for *L*<sup>2</sup>(ℝ<sup>2</sup>):

$$\sum\_{j,l,k} | \langle f, \psi\_{j,l,k} \rangle |^2 = \| f \|^2$$

Let A denote the parabolic scaling matrix and S denote the shear matrix. For each *a* >0 and *s* ∈ℝ,

$$A = \begin{pmatrix} a & 0 \\ 0 & \sqrt{a} \end{pmatrix}, S = \begin{pmatrix} 1 & s \\ 0 & 1 \end{pmatrix}.$$

The matrices described above play special roles in the shearlet transform. The first matrix, *A*, controls the 'scale' of the shearlets by applying a fine dilation factor along the two axes, which ensures that the frequency support of the shearlets becomes increasingly elongated at finer scales. The second matrix, *S*, on the other hand, is not expansive and only controls the orientation of the shearlets. The size of the frequency support of the shearlets *ψj*,*l*,*k* is illustrated in Fig. 1 for some particular values of *a* and *s*.

In reference [12], it is assumed that *a* = 4 and *s* = 1, where *A* = *A*0 is the anisotropic dilation matrix and *S* = *S*0 is the shear matrix, which are given by

$$A\_0 = \begin{pmatrix} 4 & 0 \\ 0 & 2 \end{pmatrix}, \qquad S\_0 = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}.$$
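The roles of the two matrices can be checked numerically. The following sketch verifies that powers of *S*0 only shear (|det *S*0| = 1, so nothing is dilated) while powers of *A*0 scale parabolically, with one frequency axis growing as the square of the other:

```python
import numpy as np
from numpy.linalg import matrix_power, det

A0 = np.array([[4, 0], [0, 2]])   # anisotropic (parabolic) dilation, a = 4
S0 = np.array([[1, 1], [0, 1]])   # shear matrix, s = 1

# The shear is not expansive: |det S0| = 1, and S0^l shears by exactly l.
assert abs(det(S0) - 1) < 1e-12
for l in range(5):
    assert np.array_equal(matrix_power(S0, l), np.array([[1, l], [0, 1]]))

# The dilation is parabolic: the first axis scales as the square of the
# second, which is why the frequency support elongates at finer scales j.
for j in range(4):
    Aj = matrix_power(A0, j)
    assert Aj[0, 0] == Aj[1, 1] ** 2 == 4 ** j
```

This is exactly the elongation behaviour illustrated in Fig. 1: at scale *j* the support has aspect ratio 2<sup>2*j*</sup> : 2<sup>*j*</sup>.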

For ∀*ξ* = (*ξ*1, *ξ*2) ∈ ℝ̂<sup>2</sup>, *ξ*1 ≠ 0, let *ψ̂*<sup>(0)</sup>(*ξ*) be given by

**Figure 1.** Frequency support of the shearlets *ψj*,*l*,*k* for different values of *a* and *s*.

$$\hat{\psi}^{(0)}(\xi) = \hat{\psi}^{(0)}(\xi\_1, \xi\_2) = \hat{\psi}\_1(\xi\_1)\, \hat{\psi}\_2\left(\frac{\xi\_2}{\xi\_1}\right) \tag{2}$$

where *ψ̂*1 ∈ *C*<sup>∞</sup>(ℝ) is a wavelet with supp *ψ̂*1 ⊂ [−1/2, −1/16] ∪ [1/16, 1/2], and *ψ̂*2 ∈ *C*<sup>∞</sup>(ℝ) with supp *ψ̂*2 ⊂ [−1, 1]. This implies *ψ̂*<sup>(0)</sup> ∈ *C*<sup>∞</sup>(ℝ<sup>2</sup>) and supp *ψ̂*<sup>(0)</sup> ⊂ [−1/2, 1/2]<sup>2</sup>.

In addition, we assume that

$$\sum\_{j \geq 0} | \hat{\psi}\_1(2^{-2j} \omega) |^2 = 1, \qquad |\omega| \geq 1/8 \tag{3}$$

and for ∀ *j* ≥0

$$\sum\_{l=-2^{j}}^{2^{j}-1} | \hat{\psi}\_2(2^{j} \omega - l) |^2 = 1, \qquad |\omega| \leq 1 \tag{4}$$

There are several examples of functions *ψ*1, *ψ*2 satisfying the properties described above. Eqs. (3) and (4) imply that

$$\sum\_{j \geq 0} \sum\_{l=-2^{j}}^{2^{j}-1} | \hat{\psi}^{(0)}(\xi A\_0^{-j} S\_0^{-l}) |^2 = \sum\_{j \geq 0} \sum\_{l=-2^{j}}^{2^{j}-1} | \hat{\psi}\_1(2^{-2j} \xi\_1) |^2 \, | \hat{\psi}\_2(2^{j} \frac{\xi\_2}{\xi\_1} - l) |^2 = 1 \tag{5}$$

for any (*ξ*1, *ξ*2) ∈ *D*0, where *D*0 = {(*ξ*1, *ξ*2) ∈ ℝ̂<sup>2</sup> : |*ξ*1| ≥ 1/8, |*ξ*2/*ξ*1| ≤ 1} is the horizontal cone. That is, the functions {*ψ̂*<sup>(0)</sup>(*ξA*0<sup>−*j*</sup>*S*0<sup>−*l*</sup>)} form a tiling of *D*0, as illustrated in Fig. 2(a). The property described above implies that the collection

$$\mathbb{P}\left\{\boldsymbol{\nu}^{(0)}\_{j,l,k}(\mathbf{x}) = 2^{\frac{3-j}{2}}\boldsymbol{\nu}^{(0)}(S^{l}\_{0}A^{j}\_{0}\mathbf{x} - k) : j \ge 0, -2^{j} \le l \le 2^{j} - 1, k \in \mathfrak{C}^{2}\right\} \tag{6}$$

and *ψ* ^ (1)

(*ξ*)=*ψ* ^ (1)

Parseval frame for *L* <sup>2</sup>

frame for *L* <sup>2</sup>

proximation error is

*<sup>ε</sup><sup>M</sup>* <sup>≤</sup>*CM* <sup>−</sup><sup>1</sup>

*l* = −2 *<sup>j</sup>*

**2.2. Discrete Shearlets**

, ⋯, 2 *<sup>j</sup>*

−1, Let

x y

y

(*R* <sup>2</sup>

(*ξ*1, *ξ*2)=*ψ* ^ <sup>1</sup>(*ξ*2)*ψ* ^ 2( *ξ*1 *ξ*2

(*D*1)∨ is as follows,

3

*j*

 y ), where *ψ* ^

(1) (1) <sup>2</sup> <sup>2</sup> , , <sup>1</sup> <sup>1</sup> { ( ) 2 ( ) : 0, 2 2 1, }.

m

*M*

Where *IM* is the index set of the *M* largest inner products | < *f* , *ψ<sup>μ</sup>*

*M M*

The approximation error of Fourier approximations is *ε<sup>M</sup>* <sup>≤</sup>*CM* <sup>−</sup>1/2

and this quantity approaches asymptotically zero as *M* increases.

e

than Fourier and Wavelet approximations.

Î

*I*

=- = å

, and the approximation error of Shearlets is *ε<sup>M</sup>* ≤*C*(log*M* )

is more suitable to derive numerical implementation. For *ξ* =(*ξ*1, *ξ*2)∈*R*

2 1

cxy

1 2

1 2

cxy

*W l ll*

2 2

*j l D D*

0 2 1 , 2 2

2

ˆ (2 )

<sup>ï</sup> - ïî

*j*

x

x

x

x

x

x

1

2

y

y

ì

ï

ï ï = - ³ - ££ - Î

*l j j j*


Image Fusion Based on Shearlets http://dx.doi.org/10.5772/56945 117


is a Parseval frame for *L*²(*D*₀)^∨ = {*f* ∈ *L*²(ℝ²): supp *f̂* ⊂ *D*₀}. From the conditions on the support of *ψ̂*₁ and *ψ̂*₂, one can easily observe that the functions *ψ*_{j,l,k} have frequency support

$$\operatorname{supp}\hat{\psi}^{(0)}_{j,l,k} \subset \left\{(\xi_1,\xi_2) : \xi_1 \in [-2^{2j-1}, -2^{2j-4}] \cup [2^{2j-4}, 2^{2j-1}],\ \left|\frac{\xi_2}{\xi_1} + l\,2^{-j}\right| \le 2^{-j}\right\} \tag{7}$$

That is, each element *ψ̂*_{j,l,k} is supported on a pair of trapezoids of approximate size 2^{2j} × 2^{j}, oriented along a line of slope *l*2^{−j} (see Fig. 2(b)).

**Figure 2.** (a) The tiling of the frequency plane induced by the shearlets; (b) the frequency support of a shearlet *ψ*_{j,l,k}.

Similarly we can construct a Parseval frame for *L* <sup>2</sup> (*D*1)<sup>∨</sup>, where *D*1 is the vertical cone,

$$D_1 = \left\{(\xi_1,\xi_2) \in \hat{\mathbb{R}}^2 : |\xi_2| \ge 1/8,\ \left|\frac{\xi_1}{\xi_2}\right| \le 1\right\}, \tag{8}$$

Let

$$A\_1 = \begin{pmatrix} 2 & 0 \\ 0 & 4 \end{pmatrix}, \qquad S\_1 = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}.$$

and *ψ̂*^(1)(*ξ*) = *ψ̂*^(1)(*ξ*₁, *ξ*₂) = *ψ̂*₁(*ξ*₂) *ψ̂*₂(*ξ*₁/*ξ*₂), where *ψ̂*₁ and *ψ̂*₂ are defined as in (2) and (3). Then the Parseval frame for *L*²(*D*₁)^∨ is as follows:

$$\left\{\psi^{(1)}_{j,l,k}(\mathbf{x}) = 2^{\frac{3j}{2}}\,\psi^{(1)}\!\left(S_1^{\,l}A_1^{\,j}\mathbf{x} - k\right) : j \ge 0,\ -2^{j} \le l \le 2^{j}-1,\ k \in \mathbb{Z}^2\right\}. \tag{9}$$
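The geometry of the two systems (6) and (9) can be checked numerically. A small numpy sketch follows; the matrices *A*₀ and *S*₀ are not restated in this excerpt, so the standard choice *A*₀ = diag(4, 2) (parabolic scaling) and *S*₀ = [[1, 1], [0, 1]] (shear) is assumed here, paired with the *A*₁, *S*₁ given above.

```python
import numpy as np

# Matrices behind the shearlet systems in Eqs. (6) and (9). A0 and S0 are an
# assumption (the standard choice); A1 and S1 are as defined above.
A0 = np.array([[4, 0], [0, 2]])   # parabolic scaling: width ~ length^2
S0 = np.array([[1, 1], [0, 1]])   # shear along the first axis
A1 = np.array([[2, 0], [0, 4]])
S1 = np.array([[1, 0], [1, 1]])

def frame_matrix(S, A, j, l):
    """S^l A^j, the matrix applied to x in psi(S^l A^j x - k)."""
    return np.linalg.matrix_power(S, l) @ np.linalg.matrix_power(A, j)

j, l = 3, -2
M = frame_matrix(S0, A0, j, l)

# The normalization 2^(3j/2) in Eqs. (6) and (9) is exactly |det(S^l A^j)|^(1/2):
# det S0 = 1 and det A0 = 8, so the frame elements stay L2-normalized.
print(abs(np.linalg.det(M)))      # 8^j
print(np.isclose(2 ** (3 * j / 2), np.sqrt(abs(np.linalg.det(M)))))
```

The shear leaves the determinant untouched, which is why a single power of 2 suffices as normalization for every direction *l*.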

To make this discussion more rigorous, it is useful to examine the problem from the point of view of approximation theory. If *F* = {*ψ*_μ : μ ∈ *I*} is a basis or, more generally, a tight frame for *L*²(ℝ²), then an image *f* can be approximated by the partial sums

$$f_M = \sum_{\mu \in I_M} \langle f, \psi_\mu \rangle\, \psi_\mu, \tag{10}$$

where *I*_M is the index set of the *M* largest inner products |⟨*f*, *ψ*_μ⟩|. The resulting approximation error is

$$\varepsilon_M = \left\| f - f_M \right\|^2 = \sum_{\mu \notin I_M} \left|\langle f, \psi_\mu\rangle\right|^2, \tag{11}$$

and this quantity approaches zero asymptotically as *M* increases.

The approximation error of Fourier approximations is ε_M ≤ *C M*^{−1/2}; that of wavelets is ε_M ≤ *C M*^{−1}; and that of shearlets is ε_M ≤ *C* (log *M*)³ *M*^{−2}, which is better than both the Fourier and wavelet approximations.
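The decay of ε_M with *M* is easy to observe numerically. The sketch below uses a 2D Haar wavelet transform implemented from scratch as a stand-in for the wavelet frame (an illustration of Eqs. (10)–(11), not the chapter's shearlet system) on a piecewise-constant image with one edge.

```python
import numpy as np

def haar_step(v):
    # one level of pairwise averages/differences along axis 0 (orthonormal)
    a = (v[0::2] + v[1::2]) / np.sqrt(2)
    d = (v[0::2] - v[1::2]) / np.sqrt(2)
    return np.concatenate([a, d], axis=0)

def haar2d(x):
    """Full 2D Haar wavelet transform; the side length must be a power of two."""
    x = x.astype(float).copy()
    n = x.shape[0]
    while n > 1:
        x[:n, :n] = haar_step(x[:n, :n])
        x[:n, :n] = haar_step(x[:n, :n].T).T
        n //= 2
    return x

# piecewise-constant test image with one vertical edge, plus mild noise
rng = np.random.default_rng(0)
n = 64
img = np.zeros((n, n))
img[:, : n // 2] = 1.0
img += 0.01 * rng.standard_normal((n, n))

c = haar2d(img)
mags = np.sort(np.abs(c).ravel())[::-1]
energy = np.sum(mags ** 2)
# eps_M of Eq. (11) via Parseval: the energy outside the M largest coefficients
for M in (16, 64, 256):
    print(M, np.sum(mags[M:] ** 2) / energy)
```

Because the transform is orthonormal, discarding all but the *M* largest coefficients realizes exactly the error of Eq. (11), and the printed ratios shrink as *M* grows.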

#### **2.2. Discrete Shearlets**


116 New Advances in Image Fusion

It will be convenient to describe the collection of shearlets presented above in a way which is more suitable for deriving a numerical implementation. For *ξ* = (*ξ*₁, *ξ*₂) ∈ ℝ̂², *j* ≥ 0 and *l* = −2^{j}, ⋯, 2^{j} − 1, let

$$W_{j,l}^{(0)}(\xi) = \begin{cases} \hat{\psi}_2\!\left(2^{j}\dfrac{\xi_2}{\xi_1} - l\right)\chi_{D_0}(\xi) + \hat{\psi}_2\!\left(2^{j}\dfrac{\xi_1}{\xi_2} - l + 1\right)\chi_{D_1}(\xi) & l = -2^{j} \\[2ex] \hat{\psi}_2\!\left(2^{j}\dfrac{\xi_2}{\xi_1} - l\right)\chi_{D_0}(\xi) + \hat{\psi}_2\!\left(2^{j}\dfrac{\xi_1}{\xi_2} - l - 1\right)\chi_{D_1}(\xi) & l = 2^{j} - 1 \\[2ex] \hat{\psi}_2\!\left(2^{j}\dfrac{\xi_2}{\xi_1} - l\right) & \text{otherwise} \end{cases} \tag{12}$$

$$W_{j,l}^{(1)}(\xi) = \begin{cases} \hat{\psi}_2\!\left(2^{j}\dfrac{\xi_2}{\xi_1} - l + 1\right)\chi_{D_0}(\xi) + \hat{\psi}_2\!\left(2^{j}\dfrac{\xi_1}{\xi_2} - l\right)\chi_{D_1}(\xi) & l = -2^{j} \\[2ex] \hat{\psi}_2\!\left(2^{j}\dfrac{\xi_2}{\xi_1} - l - 1\right)\chi_{D_0}(\xi) + \hat{\psi}_2\!\left(2^{j}\dfrac{\xi_1}{\xi_2} - l\right)\chi_{D_1}(\xi) & l = 2^{j} - 1 \\[2ex] \hat{\psi}_2\!\left(2^{j}\dfrac{\xi_1}{\xi_2} - l\right) & \text{otherwise} \end{cases} \tag{13}$$


Here *ψ*₂, *D*₀, *D*₁ are defined as in Section 2. For −2^{j} ≤ *l* ≤ 2^{j} − 2, each term *W*_{j,l}^{(d)}(*ξ*) is a window function localized on a pair of trapezoids, as illustrated in Fig. 1(a). When *l* = −2^{j} or *l* = 2^{j} − 1, at the junction of the horizontal cone *D*₀ and the vertical cone *D*₁, *W*_{j,l}^{(d)}(*ξ*) is the superposition of two such functions.

Using this notation, for *j* ≥0, −2 *<sup>j</sup>* + 1≤*l* ≤2 *<sup>j</sup>* −2, *k* ∈*Z* <sup>2</sup> , *d* =0, 1, we can write the Fourier transform of the Shearlets in the compact form

$$\hat{\psi}^{(d)}_{j,l,k}(\xi) = 2^{\frac{3j}{2}}\,V(2^{-2j}\xi)\,W^{(d)}_{j,l}(\xi)\,e^{-2\pi i \xi A_d^{-j} B_d^{-l} k} \tag{14}$$

where *V*(*ξ*₁, *ξ*₂) = *ψ̂*₁(*ξ*₁) χ_{D₀}(*ξ*₁, *ξ*₂) + *ψ̂*₁(*ξ*₂) χ_{D₁}(*ξ*₁, *ξ*₂).

The Shearlet transform of *f* ∈ *L* <sup>2</sup> (*R* <sup>2</sup> ) can be computed by

$$\langle f, \psi^{(d)}_{j,l,k}\rangle = 2^{\frac{3j}{2}} \int_{\mathbb{R}^2} \hat{f}(\xi)\, \overline{V(2^{-2j}\xi)}\, \overline{W^{(d)}_{j,l}(\xi)}\, e^{2\pi i \xi A_d^{-j} B_d^{-l} k}\, d\xi \tag{15}$$

Indeed, one can easily verify that

$$\sum_{d=0}^{1} \sum_{l=-2^{j}}^{2^{j}-1} \left| W^{(d)}_{j,l}(\xi_1, \xi_2) \right|^2 = 1 \tag{16}$$

and from this it follows that

$$\left|\hat{\varphi}(\xi_1, \xi_2)\right|^2 + \sum_{d=0}^{1} \sum_{j \ge 0} \sum_{l=-2^{j}}^{2^{j}-1} \left|V(2^{-2j}\xi_1, 2^{-2j}\xi_2)\right|^2 \left|W^{(d)}_{j,l}(\xi_1, \xi_2)\right|^2 = 1 \tag{17}$$
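The partition-of-unity property (16) and the frequency-domain computation (15) can be sketched directly on an FFT grid. The window *ψ̂*₂ below is an assumed stand-in, cos(π*x*/2) on [−1, 1], whose squared integer translates sum to one; the scale window *V* is omitted, so this is only a sketch of the directional part.

```python
import numpy as np

def psi2_hat(x):
    """Assumed bump: cos(pi*x/2) on [-1, 1]; squared unit translates sum to 1."""
    return np.cos(np.pi * x / 2) * (np.abs(x) <= 1)

n, j = 128, 2
k = np.fft.fftfreq(n) * n                      # integer frequency grid
xi1, xi2 = np.meshgrid(k, k, indexing="ij")
xi1 = np.where(xi1 == 0, 1e-9, xi1)            # avoid division by zero
slope = xi2 / xi1

# Directional windows W_{j,l}(xi) = psi2_hat(2^j * xi2/xi1 - l) on the cone D0
ls = np.arange(-2 ** j, 2 ** j)                # l = -2^j ... 2^j - 1
W = np.stack([psi2_hat(2 ** j * slope - l) for l in ls])

# Eq. (16), restricted to the interior of the horizontal cone:
interior = np.abs(slope) <= 1 - 2 ** (-j)
total = np.sum(W ** 2, axis=0)
print(np.allclose(total[interior], 1.0))       # partition of unity holds there

# One directional subband of an image, in the spirit of Eq. (15):
# multiply f_hat by a window and invert.
f = np.random.default_rng(1).standard_normal((n, n))
band = np.fft.ifft2(np.fft.fft2(f) * W[1])
```

At the cone boundaries the sum drops below one, which is exactly why Eqs. (12)–(13) glue the windows of the two cones together there.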


#### **3. Multi-focus image fusion based on Shearlets**

#### **3.1. Algorithm framework of multi-focus image fusion using Shearlets**

#### *3.1.1. Image decomposition*



Image decomposition based on the shearlet transform consists of two parts: multi-direction decomposition and multi-scale decomposition.

**1.** Multi-direction decomposition of the image using the shear matrix *S*0 or *S*1.

**2.** Multi-scale decomposition of each direction using wavelet packet decomposition.

In step (1), if the image is decomposed only by *S*0 or only by *S*1, the number of directions is 2(*l* + 1) + 1. If the image is decomposed by both *S*0 and *S*1, the number of directions is 2(*l* + 2) + 2. The framework of image decomposition with shearlets is shown in Fig. 3.

**Figure 3.** Image decomposition framework with shearlets
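The direction counts quoted above reduce to a one-line rule; the helper name below is ours, not the chapter's.

```python
def n_directions(l, both_cones):
    """Number of directional subbands after the shear decomposition (Sec. 3.1.1).

    With a single cone (S0 or S1) the count is 2(l + 1) + 1;
    with both cones it is 2(l + 2) + 2.
    """
    return 2 * (l + 2) + 2 if both_cones else 2 * (l + 1) + 1

# The experiments in this chapter use both cones with 6 directions,
# which corresponds to l = 0: 2*(0 + 2) + 2 = 6
print(n_directions(0, True))
```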

#### *3.1.2. Image fusion*

The image fusion framework based on shearlets is shown in Fig. 4. The following fusion steps are adopted.

**1.** The two images taking part in the fusion are geometrically registered to each other.

**2.** Transform the original images using shearlets. Both the horizontal and vertical cones are adopted in this method, so the number of directions is 6. Wavelet packets are then used in the multi-scale decomposition with *j* = 5.

**3.** A fusion rule based on the regional absolute value is adopted in this algorithm.

**a.** The choice of low frequency coefficients.

The low frequency coefficients of the fused image are the average of the low frequency coefficients of the two source images.

**b.** The choice of high frequency coefficients.

$$D_X(i,j) = \sum_{i \le M,\, j \le N} \left|Y_X(i,j)\right|, \quad X = A, B \tag{18}$$

**Figure 4.** Image fusion framework based on shearlets

**Figure 5.** Fusion results on the experiment images: (a) focus on the left; (b) focus on the right; (c) original image; (d) shearlet; (e) contourlet; (f) Haar; (g) PCA; (h) Daubechies; (i) LP.

Calculate the absolute value of the high frequency coefficients in the neighborhood by Eq.(18), where *M* = *N* = 3 is the size of the neighborhood, *X* denotes one of the two source images, *D_X*(*i*, *j*) is the regional absolute value of image *X* within the 3 × 3 neighborhood centered at (*i*, *j*), and *Y_X*(*i*, *j*) is the pixel value at (*i*, *j*) of *X*.

Select the high frequency coefficients from the two source images.

$$F(i,j) = \begin{cases} A(i,j) & D\_A(i,j) \ge D\_B(i,j) \\ B(i,j) & D\_A(i,j) < D\_B(i,j) \end{cases} \tag{19}$$

where *F* denotes the high frequency coefficients of the fused image.

Finally, a region consistency check is done based on the fuse-decision map, as shown in Eq.(20).

$$\begin{aligned} Map(i,j) = \begin{cases} 1 & D\_A(i,j) \ge D\_B(i,j) \\ 0 & D\_A(i,j) < D\_B(i,j) \end{cases} \end{aligned} \tag{20}$$

According to Eq.(20), if a certain coefficient in the fused image is to come from source image *A* but the majority of its surrounding neighbors come from *B*, this coefficient will be switched to come from *B*.

**4.** The fused image is obtained using the inverse shearlet transform.
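The selection rule of Eqs. (18)–(19) and the consistency check of Eq. (20) can be sketched in numpy as follows. The edge padding and the "at least 5 of the 9 window entries" majority threshold are our assumptions for illustration; the chapter only states that a neighborhood majority switches the coefficient.

```python
import numpy as np

def regional_abs(Y, M=3, N=3):
    """D_X(i,j): sum of |Y| over the MxN neighbourhood centred at (i,j), Eq. (18)."""
    A = np.abs(Y)
    pad = np.pad(A, ((M // 2,), (N // 2,)), mode="edge")
    out = np.zeros(A.shape, dtype=float)
    for di in range(M):
        for dj in range(N):
            out += pad[di:di + A.shape[0], dj:dj + A.shape[1]]
    return out

def fuse_highpass(cA, cB, M=3, N=3):
    """Eqs. (19)-(20): keep the coefficient with the larger regional absolute
    value, then run a 3x3 majority consistency check on the decision map."""
    DA, DB = regional_abs(cA, M, N), regional_abs(cB, M, N)
    decision = (DA >= DB).astype(float)     # 1 -> take A, 0 -> take B
    votes = regional_abs(decision, 3, 3)    # how many of the 9 entries say "A"
    keep_A = votes >= 5                     # assumed majority threshold
    return np.where(keep_A, cA, cB)

cA = np.array([[5., 5., 0.], [5., 5., 0.], [0., 0., 0.]])
cB = np.array([[0., 0., 4.], [0., 0., 4.], [4., 4., 4.]])
print(fuse_highpass(cA, cB))
```

On this toy pair, the upper-left block is taken from *A*, the rest from *B*; the vote step is what flips isolated, inconsistent decisions.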

#### **3.2. Simulation experiments**

**1.** Multi-focus image of Bottle

The following groups of images are selected to demonstrate the validity of the method proposed in this section.

The two source images, Fig.5(a) and (b), are multi-focus images which focus on different parts of the scene. The fusion methods compared in these experiments are shearlets, contourlets, Haar, Daubechies, PCA and the Laplacian pyramid (LP). The fusion rule mentioned above is used in this experiment. The following image quality metrics are used: standard deviation (STD), difference of entropy (DEN), overall cross entropy (OCE), entropy (EN), sharpness (SP), peak signal-to-noise ratio (PSNR), mean square error (MSE) and Q.


Fig.5(c) is the ideal image, and Fig.5(d)~(i) are the fused images obtained with the different methods. From the subjective evaluation of the fusion results and the objective metrics in Table 1, we can see that the shearlet transform preserves more detail information, disperses the gray levels more, and yields higher sharpness in the fused image than the other methods do.


#### **2.** Multi-focus Images of CT and MRI

**Figure 6.** Fusion results on the experiment images: (a) CT; (b) MRI; (c) shearlet; (d) contourlet; (e) Haar; (f) Daubechies; (g) PCA; (h) average.

The source images are CT (computed tomography) and MRI (magnetic resonance imaging) images. Entropy (EN), sharpness (SP), standard deviation (STD) and Q are used to evaluate the quality of the fused images.

Fig.6(a) is a CT image, whose brightness is related to tissue density: the bone is shown clearly, but soft tissue is invisible. Fig.6(b) is an MRI image, whose brightness is related to the number of hydrogen atoms in tissue: the soft tissue is shown clearly, but the bone is invisible. The CT and MRI images are complementary, and their advantages can be fused into one image. The desired standard image cannot be acquired, thus only entropy and sharpness are adopted to evaluate the fusion result. The fusion rule mentioned above is used in this experiment.
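The reference-free metrics used here are simple to compute. A numpy sketch follows; definitions of EN and STD are standard, while SP is implemented as the mean gradient magnitude, one common choice (the chapter does not restate its exact formula, and Q is omitted here since it needs a reference image).

```python
import numpy as np

def entropy(img, bins=256):
    """EN: Shannon entropy of the gray-level histogram, in bits."""
    p, _ = np.histogram(img, bins=bins, range=(0, 256))
    p = p / p.sum()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def sharpness(img):
    """SP: mean gradient magnitude (one common definition, assumed here)."""
    gx, gy = np.gradient(img.astype(float))
    return np.mean(np.sqrt((gx ** 2 + gy ** 2) / 2))

def std_dev(img):
    """STD: standard deviation of the gray levels."""
    return float(np.std(img))

rng = np.random.default_rng(0)
flat = np.full((64, 64), 128.0)
textured = rng.integers(0, 256, (64, 64)).astype(float)
print(entropy(flat), entropy(textured))   # the textured image has higher EN
```

A constant image scores zero on all three metrics, which is the sanity check that makes these measures useful for ranking fused images.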


**Table 2.** Comparison of medical image fusion

|     | Shearlet | Contourlet | Haar | Daubechies | PCA | Average |
|-----|----------|------------|------|------------|-----|---------|
| EN  | 6.1851 | 5.9189 | 5.9870 | 5.9784 | 5.8792 | 5.9868 |
| SP  | 20.5271 | 24.8884 | 16.9938 | 14.8810 | 17.2292 | 16.9935 |
| STD | 45.0704 | 50.4706 | 35.8754 | 35.1490 | 45.3889 | 34.9141 |
| Q   | 0.6881 | 0.3022 | 0.4960 | 0.4994 | 0.6847 | 0.4943 |

**Table 1.** Comparison of multi-focus image fusion

| STD | DEN | OCE | EN | SP | PSNR | MSE | Q |
|-----|-----|-----|----|----|------|-----|---|
| 41.2225 | 0.0144 | 0.0470 | 6.9493 | 14.8401 | 31.188 | 49.0528 | 0.9010 |
| 41.3253 | 0.0113 | 0.0484 | 6.9462 | 12.9532 | 31.1887 | 49.4549 | 0.9131 |
| 44.1356 | 0.0354 | 0.0179 | 6.9703 | 19.4853 | 40.3666 | 5.9761 | 0.8809 |
| 41.3589 | 0.0150 | 0.0442 | 6.9499 | 15.3007 | 31.4881 | 45.8016 | 0.8954 |
| 43.3313 | 0.0227 | 0.0125 | 6.9577 | 18.7049 | 39.3935 | 7.0625 | 0.8703 |
| 43.3322 | 0.0021 | 0.0107 | 6.9628 | 19.1502 | 40.8004 | 5.0067 | 0.9042 |

#### **4. Remote sensing image fusion based on Shearlets and PCNN**

#### **4.1. Theory of PCNN**

PCNN, called the third generation artificial neural network, is a feedback network formed by the connection of many neurons, inspired by the biological visual cortex. Every neuron is made up of three sections: the receptive section, the modulation section and the pulse generator section, which can be described by discrete equations [23-25].

The receptive field receives the input from other neurons or the external environment and transmits it over two channels: the *F*-channel and the *L*-channel. In the modulation field, a positive offset is added to the signal *L_j* from the *L*-channel, and the result multiplicatively modulates the signal *F_j* from the *F*-channel. When the neuron threshold *θ_j* ≥ *U_j*, the pulse generator is turned off; otherwise, the pulse generator is turned on and outputs a pulse. The mathematical model of PCNN is described below [26-30].

$$\begin{cases} F_{ij}[n] = e^{-\alpha_F}\, F_{ij}[n-1] + V_F \sum m_{ijkl}\, Y_{kl}[n-1] + S_{ij}\\ L_{ij}[n] = e^{-\alpha_L}\, L_{ij}[n-1] + V_L \sum w_{ijkl}\, Y_{kl}[n-1] \\ U_{ij}[n] = F_{ij}[n]\,\bigl(1 + \beta L_{ij}[n]\bigr) \\ Y_{ij}[n] = \begin{cases} 1 & U_{ij}[n] > \theta_{ij}[n] \\ 0 & \text{otherwise} \end{cases} \\ \theta_{ij}[n] = e^{-\alpha_\theta}\, \theta_{ij}[n-1] + V_\theta\, Y_{ij}[n-1] \end{cases} \tag{21}$$

where *α_F*, *α_L* are the decay time constants, *α_θ* is the threshold decay time constant, *V_θ* is the threshold amplitude coefficient, *V_F*, *V_L* are the link amplitude coefficients, *β* is the linking strength, and *m_ijkl*, *w_ijkl* are the link weight matrices.
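Eq. (21) iterates directly as array updates. The sketch below is a minimal numpy version; the parameter values and the choice of a 3×3 box for both linking kernels *m* and *w* are our assumptions, not the chapter's settings.

```python
import numpy as np

def pcnn(S, n_iter=10, alpha_F=0.1, alpha_L=1.0, alpha_T=0.2,
         V_F=0.5, V_L=0.2, V_T=20.0, beta=0.1):
    """Minimal PCNN iteration of Eq. (21); m and w are both a 3x3 box here.

    Returns the fire map: how many times each neuron pulsed over n_iter steps.
    """
    h, w = S.shape
    F = np.zeros((h, w)); L = np.zeros((h, w))
    Y = np.zeros((h, w)); T = np.ones((h, w))   # T plays the role of theta
    fire_count = np.zeros((h, w))
    for _ in range(n_iter):
        Yp = np.pad(Y, 1)                        # zero-padded previous pulses
        link = sum(Yp[i:i + h, j:j + w] for i in range(3) for j in range(3))
        F = np.exp(-alpha_F) * F + V_F * link + S
        L = np.exp(-alpha_L) * L + V_L * link
        U = F * (1 + beta * L)
        Y = (U > T).astype(float)
        T = np.exp(-alpha_T) * T + V_T * Y
        fire_count += Y
    return fire_count

img = np.random.default_rng(0).random((8, 8))
fire = pcnn(img)
print(fire.shape)
```

The accumulated fire map is exactly the quantity the fusion rule of Section 4.2 compares between the two source images.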

**Figure 7.** The model of a PCNN neuron

249-263.

New York, 2002.

#### 2. Corrected references **4.2. Algorithm framework of remote sensing image fusion using Shearlts and PCNN**

Reference [1] S. G. Mallat, Theory for Multiresolution Signal Decomposition: The Wavelet Representation, IEEE Transaction on Pattern Analysis and Machine Intelligence, When PCNN is used for image processing, it is a single two-dimensional network. The num‐ ber of the neurons is equal to the number of pixels. There is a one-to-one correspondence between the image pixels and the network neurons.

11(1989),pp: 674-693. In this paper, Shearlets and PCNN are used to fuse images. The steps are described below:

[2] A. Krista, Z.Yun, D.Peter, Wavelet Based Image Fusion Techniques — An introduction, review and comparison, International Society for Photogrammetry and Sensing, 62(2007), pp: **1.** Decompose the original images A and B respectively into many different directions *f NA*, *f* ^ *NA*, *f NB*, *f* ^ *NB* (*N* =1, ..., *n*) via Shear matrixs (In this chapter, *n* =3).

> Compression. Applied and Computational Hamonic Analysis, 4(1997), pp: 147-187. [5] N. Kingsbury, Complex Wavelets for Shift Invarian Analysis and Filtering of Signals.

[7] Y. Xiao-Hui, J. Licheng, Fusion Algorithm for Remote Sensing Images Based on Nonsubsampled Contourlet Transform, Acta Automatica Sinica,34(2008), pp: 274-281. [8] E. J. Candes, and D. L. Donoho, Continuous curvelet transform. I. Resolution of the

Continuous Wavelet Transform. Signal Processing, 31(1993), pp: 241-272.

Applied and Computational Hamonic Analysis, 10(2001), pp: 234-253.

[3] J. P. Antoine, P. Carrette, R. Murenzi, B. Piette, Image Analysis with Two Dimensional

[4] F. G. Meyer, R. R. Coifman, Brushlets: A Tool for Directional Image Analysis and Image

[6] P. Brémaud , Mathematical principles of signal processing: Fourier and wavelet Analysis,

2

**2.** Calculate the gradient features in every direction to form feature maps,

*NA*, *DG f NBDG f*^

Image Fusion Based on Shearlets http://dx.doi.org/10.5772/56945

*<sup>l</sup>* . The fused

**Fused Image**

*Inverse Shearlets*

*NB* into PCNN, and fire maps in all directions

*<sup>l</sup>* and *f* ^ *NB*

*<sup>l</sup>* , *f* ^ *NA <sup>l</sup>* , *<sup>f</sup> NB*

*NB*

*NB* .

*NB l*

*NB l*

*NA fire f*

ˆ *NA fire f* *Nf*

ˆ *Nf*

*NB fire f*

ˆ *NB fire f*

*NA* <sup>≥</sup> *fire f*^

*NA* <sup>&</sup>lt; *fire f*^

*NA <sup>l</sup>* <sup>≥</sup>*Var f*^

*NA <sup>l</sup>* <sup>&</sup>lt;*Var f*^ *NB*

125

*NB*.

*NB* are obtained.

**5.** Take the Shearlets on original images A and B, the high frequency coefficients in all di‐

*<sup>h</sup>* , and the low are *<sup>f</sup> NA*

*NA*, *Grad f NB*, *Grad f*^

are high frequency coefficients after the decomposition.

*NA*, *DG f NBDG f*^

*<sup>h</sup>* and *f* ^ *NB*

, *f* ^ *N h* ={ *f* ^ *NA <sup>h</sup>* , *fire f*^

*<sup>l</sup>* , *f* ^ *N l* ={ *f* ^ *NA <sup>l</sup>* , *Var f*^

**6.** The fused image is obtained using the inverse Shearlet transform.

*A*0

*DWT*

*DGfNA*

*PCNN*

*PCNN*

*PCNN*

*PCNN*

<sup>ˆ</sup> *DGfNA*

*DGfNB*

<sup>ˆ</sup> *DGfNB*

*A*0

*DWT*

*A*1

*DWT*

*A*1

*DWT*

high frequency coefficients in all directions can be selected as follow:

*f* ^ *NB <sup>h</sup>* , *fire f*^

The fusion rule of the low frequency coefficients in any direction is described below:

*f* ^ *NB <sup>l</sup>* , *Var f*^

*NA*, *fire f NB*, *fire f*^

**3.** Decompose feature map of all directions using DWT, *DG f NA*, *DG f*^

*Grad f NA*, *Grad f*^

**4.** Take *DG f NA*, *DG f*^

*fire f NA*, *fire f*^

rections are *f NA*

*<sup>l</sup>* , *Var f NA*

*<sup>l</sup>* , *Var f NA*

0 *S*

*NA f*

ˆ *NA f*

*NB f*

ˆ *NB f*

0 *S*

1 *S*

1 *S*

Where *Varf* is the variance of *f* .

*f N <sup>h</sup>* ={ *<sup>f</sup> NA*

*f N <sup>l</sup>* ={ *<sup>f</sup> NA*

*f NB*

*f NB*

**Image A**

**Image B**

*<sup>h</sup>* , *f* ^ *NA <sup>h</sup>* , *<sup>f</sup> NB*

*<sup>h</sup>* , *fire f NA* ≥ *fire f NB*

*<sup>h</sup>* , *fire f NA* < *fire f NB*

*<sup>l</sup>* <sup>≥</sup>*Var f NB l*

*<sup>l</sup>* <sup>&</sup>lt;*Var f NB*

*Grad fNA*

<sup>ˆ</sup> *Grad fNA*

*Grad fNB*

<sup>ˆ</sup> *Grad fNB*

**Figure 8.** Image fusion framework with Shearlets and PCNN


$$\begin{aligned} \; \_f f\_N^{\;h} = \begin{bmatrix} f \; \_ {NA'} & \operatorname{free} f \; \_ {NA} \ge \operatorname{free} f \; \_{NB} \\ f \; \_ {NB'} & \operatorname{free} f \; \_ {NA} < \operatorname{free} f \; \_{NB} \end{bmatrix} \qquad \begin{aligned} \; \_f f\_h^{\;h} = \begin{bmatrix} \widehat{f} \; \_ {NA'} & \operatorname{free} \widehat{f} \; \_ {NA} \ge \operatorname{free} \widehat{f} \\ \widehat{f} \; \_ {NA'} & \operatorname{free} \widehat{f} \; \_ {NA} < \operatorname{free} \widehat{f} \; \_ {NB} \end{bmatrix}. \end{aligned}$$

The fusion rule of the low frequency coefficients in any direction is described below:

$$\begin{array}{llll} f^{\operatorname{l}}\_{\operatorname{NA'}} = \begin{vmatrix} f^{\operatorname{l}}\_{\operatorname{NA'}} & \operatorname{Var}\operatorname{f}^{\operatorname{l}}\_{\operatorname{NA}} \cong \operatorname{Var}\operatorname{f}^{\operatorname{l}}\_{\operatorname{NB}} & & \bigwedge^{\operatorname{l}}\_{\operatorname{NA'}} = \begin{vmatrix} \bigwedge^{\operatorname{l}}\_{\operatorname{NA'}} & \operatorname{Var}\operatorname{\widehat{f}}^{\operatorname{l}}\_{\operatorname{NA}} \cong \operatorname{Var}\operatorname{\widehat{f}}^{\operatorname{l}}\\ \bigwedge^{\operatorname{l}}\_{\operatorname{NA'}} & \operatorname{Var}\operatorname{\widehat{f}}^{\operatorname{l}}\_{\operatorname{NA}} \cong \operatorname{Var}\operatorname{\widehat{f}}^{\operatorname{l}} \end{vmatrix} \end{array}$$

Where *Varf* is the variance of *f* .

off; otherwise, the pulse generator is turned on, and output a pulse. The mathematic model

å å

(21)

2

 q

Where *α<sup>F</sup>* , *αL* is the constant time of decay, *αθ* is the threshold constant time of decay, *Vθ* is the threshold amplitude coefficient, *VF* , *VL* are the link amplitude coefficients, *β* is the val‐

1+ βLjk

<sup>0</sup> <sup>1</sup>

U jk

θ jk

T Vj T α j

Yjk

[ ] exp( ) [ 1] [ 1] [ ] exp( ) [ 1] [ 1]

*Fn Fn V m Y n S*

*ij F ij F ijkl kl ij*

<sup>ì</sup> = - -+ - + <sup>ï</sup> ï = - -+ -

[ ] [ ](1 [ ]) [] 1 [] [] 0

*ij ij ij ij ij ij*

ue of link strength, and *mijkl*, *wijkl* are the link weight matrices. When PCNN is used for image processing, it is a single two-dimensional network: the number of neurons is equal to the number of pixels, and there is a one-to-one correspondence between the image pixels and the network neurons. The mathematical model of PCNN is described below [26-30]:

$$
\begin{aligned}
F_{ij}[n] &= S_{ij} \\
L_{ij}[n] &= e^{-\alpha_L}\, L_{ij}[n-1] + V_L \sum_{kl} w_{ijkl}\, Y_{kl}[n-1] \\
U_{ij}[n] &= F_{ij}[n]\left(1 + \beta L_{ij}[n]\right) \\
Y_{ij}[n] &= \begin{cases} 1, & \text{if } U_{ij}[n] > \theta_{ij}[n] \\ 0, & \text{otherwise} \end{cases} \\
\theta_{ij}[n] &= e^{-\alpha_\theta}\, \theta_{ij}[n-1] + V_\theta\, Y_{ij}[n-1]
\end{aligned}
$$

**Figure 7.** The model of PCNN neuron

124 New Advances in Image Fusion

#### **4.2. Algorithm framework of remote sensing image fusion using Shearlets and PCNN**

In this chapter, Shearlets and PCNN are used to fuse images. The steps are described below:

**1.** Decompose the original images A and B respectively into different directional subimages *NA*, *NB* (*N* =1, ..., *n*) via shear matrices (in this chapter, *n* =3).

**6.** The fused image is obtained using the inverse Shearlet transform.

**Figure 8.** Image fusion framework with Shearlets and PCNN
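The coefficient-selection part of this framework can be sketched in Python. This is a minimal illustration of the simplified PCNN model above, not the authors' code: the firing count accumulated over the iterations is used here as the criterion for choosing between the two subband coefficients, and the function names are our own.

```python
import numpy as np
from scipy.signal import convolve2d

def pcnn_fire_counts(S, alpha_L=0.03, alpha_theta=0.1, V_L=1.0,
                     V_theta=10.0, beta=0.2, n_iter=100):
    """Run the simplified PCNN on stimulus S and count each neuron's pulses."""
    # 3x3 linking weight matrix W used in the chapter's experiments
    W = np.array([[1 / np.sqrt(2), 1, 1 / np.sqrt(2)],
                  [1,              1, 1             ],
                  [1 / np.sqrt(2), 1, 1 / np.sqrt(2)]])
    L = np.zeros_like(S, dtype=float)      # linking input
    theta = np.zeros_like(S, dtype=float)  # dynamic threshold
    Y = np.zeros_like(S, dtype=float)      # pulse output
    fire = np.zeros_like(S, dtype=float)   # accumulated firing counts
    for _ in range(n_iter):
        L = np.exp(-alpha_L) * L + V_L * convolve2d(Y, W, mode='same')
        U = S * (1 + beta * L)             # internal activity, with F_ij[n] = S_ij
        Y = (U > theta).astype(float)      # pulse when activity exceeds threshold
        theta = np.exp(-alpha_theta) * theta + V_theta * Y
        fire += Y
    return fire

def fuse_subband(cA, cB, **pcnn_params):
    """Keep, per position, the coefficient whose PCNN neuron fired more often."""
    fA = pcnn_fire_counts(np.abs(cA), **pcnn_params)
    fB = pcnn_fire_counts(np.abs(cB), **pcnn_params)
    return np.where(fA >= fB, cA, cB)
```

Applied per directional subband after the Shearlet decomposition of step 1, with the inverse transform of step 6 producing the fused image.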

#### **4.3. Simulation experiments**

In this section, three different examples, optical and SAR images, remote sensing images, and hyperspectral images, are provided to demonstrate the effectiveness of the proposed method. Several other methods, including Average, Laplacian Pyramid (LP), Gradient Pyramid (GP), Contrast Pyramid (CP), Contourlet-PCNN (C-P), and Wavelet-PCNN (W-P), are compared with the proposed approach. Subjective visual perception gives direct comparisons, and objective image quality assessments are also used to evaluate the performance of the proposed approach. The following image quality metrics are used in this chapter: Entropy (EN), Overall cross entropy (OCE), Standard deviation (STD), Average gradient (Ave-grad), *Q*, and $Q^{AB/F}$.
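Three of these reference-free metrics can be written down compactly. The formulations below are common ones from the fusion literature and are our own sketch, since the exact definitions vary between papers:

```python
import numpy as np

def entropy(img, bins=256):
    """Shannon entropy (EN) of the grey-level histogram, in bits."""
    p, _ = np.histogram(img, bins=bins, range=(0, 256))
    p = p / p.sum()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def avg_gradient(img):
    """Average gradient (Ave-grad): mean magnitude of local intensity change."""
    gx, gy = np.gradient(img.astype(float))
    return np.mean(np.sqrt((gx ** 2 + gy ** 2) / 2))

def cross_entropy(a, f, bins=256, eps=1e-12):
    """Cross entropy between a source image's histogram and the fused image's."""
    pa, _ = np.histogram(a, bins=bins, range=(0, 256))
    pf, _ = np.histogram(f, bins=bins, range=(0, 256))
    pa = pa / pa.sum()
    pf = pf / pf.sum()
    return np.sum(pa * np.log2((pa + eps) / (pf + eps)))

def overall_cross_entropy(a, b, f):
    """OCE: mean of the two sources' cross entropies against the fused image."""
    return 0.5 * (cross_entropy(a, f) + cross_entropy(b, f))
```

Higher EN, STD and Ave-grad indicate richer information and sharper detail, while lower OCE indicates the fused image's histogram stays closer to the sources.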


Image Fusion Based on Shearlets http://dx.doi.org/10.5772/56945 127


(a) Optical; (b) SAR; (c) Shearlet-PCNN; (d) C-P; (e) W-P; (f) CP; (g) Average; (h) LP; (i) GP

**Figure 9.** Optical and SAR images fusion results based on Shearlets and PCNN

In these three experiments, the PCNN parameter values are as follows:

$$\text{Experiment 1: } \alpha_L = 0.03,\ \alpha_\theta = 0.1,\ V_L = 1,\ V_\theta = 10,\ \beta = 0.2,\ W = \begin{pmatrix} 1/\sqrt{2} & 1 & 1/\sqrt{2} \\ 1 & 1 & 1 \\ 1/\sqrt{2} & 1 & 1/\sqrt{2} \end{pmatrix},$$ and the iterative number is *n* =100.

$$\text{Experiment 2: } \alpha_L = 0.02,\ \alpha_\theta = 0.05,\ V_L = 1,\ V_\theta = 15,\ \beta = 0.7,\ W = \begin{pmatrix} 1/\sqrt{2} & 1 & 1/\sqrt{2} \\ 1 & 1 & 1 \\ 1/\sqrt{2} & 1 & 1/\sqrt{2} \end{pmatrix},$$ and the iterative number is *n* =100.

$$\text{Experiment 3: } \alpha_L = 0.03,\ \alpha_\theta = 0.1,\ V_L = 1,\ V_\theta = 15,\ \beta = 0.5,\ W = \begin{pmatrix} 1/\sqrt{2} & 1 & 1/\sqrt{2} \\ 1 & 1 & 1 \\ 1/\sqrt{2} & 1 & 1/\sqrt{2} \end{pmatrix},$$ and the iterative number is *n* =100.
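For concreteness, these three settings can be collected as configurations for a PCNN implementation. The dictionary layout and names below are our own illustration, not code from the chapter:

```python
import numpy as np

# 3x3 linking weight matrix used in all three experiments:
# 1/sqrt(2) at the diagonal neighbours, 1 elsewhere.
W = np.array([[1 / np.sqrt(2), 1, 1 / np.sqrt(2)],
              [1,              1, 1             ],
              [1 / np.sqrt(2), 1, 1 / np.sqrt(2)]])

# PCNN parameters per experiment, as listed above.
EXPERIMENTS = {
    1: dict(alpha_L=0.03, alpha_theta=0.10, V_L=1, V_theta=10, beta=0.2, n_iter=100),
    2: dict(alpha_L=0.02, alpha_theta=0.05, V_L=1, V_theta=15, beta=0.7, n_iter=100),
    3: dict(alpha_L=0.03, alpha_theta=0.10, V_L=1, V_theta=15, beta=0.5, n_iter=100),
}
```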

Optical, SAR, remote sensing, and hyperspectral images are widely used in military applications, so the study of fusing these images is very important.

Fig. 9-11 give the fused images obtained with Shearlet-PCNN and the other methods. From Fig. 9-11 and Table 3, we can see that image fusion based on Shearlets and PCNN preserves more information with less distortion than the other methods. In experiment 1, the edge features from Fig. 9(a) and the spectral information from Fig. 9(b) are kept in the fused image produced by the proposed method, shown in Fig. 9(c). In Fig. 9(d), the spectral character of the image fused by Contourlet and PCNN is distorted, and from a visual point of view the color is too prominent. In Fig. 9(e)-(f), spectral information of the fused images is lost and the edge features are vague. Fig. 10 shows the fused remote sensing images; this modality can provide more new information since it penetrates clouds, rain, and even vegetation, and with different imaging modes and bands its features differ in each image. In Fig. 10(a) and (b), band 8 has more river characteristics but less city information, while band 4 has the opposite imaging features. Fig. 10(c) is the fused image using Shearlets and PCNN. The numerical results in Fig. 5 and Table 1 show that the fused image based on Shearlets and PCNN keeps better river information and even preserves the city features well, whereas the middle of the image fused by Contourlet and PCNN, Fig. 10(d), shows an obvious splicing effect. Fig. 11(c) is the fused hyperspectral image; Fig. 11(a) and (b) are the two original images. The airport track is clear in Fig. 11(a), but some plane information is lost, while Fig. 11(b) shows different information. In the fused image, the track information is clearer and the aircraft characteristics are more obvious, whereas the runway lines are not clear enough in the images fused by the other methods. From Table 3 we can see that most metric values obtained by the proposed method are better than those of the other methods.



(a) remote-8; (b) remote-4; (c) Shearlet-PCNN; (d) C-P; (e) GP; (f) LP; (g) CP; (h) W-P; (i) Average

**Figure 10.** Remote sensing image fusion results based on Shearlets and PCNN

(a) Hyperspectral 1; (b) Hyperspectral 2; (c) Shearlet-PCNN; (d) C-P; (e) GP; (f) LP; (g) CP; (h) W-P; (i) Average

**Figure 11.** Hyperspectral image fusion results based on Shearlets and PCNN

| Dataset | Algorithm | $Q^{AB/F}$ | Q | EN | STD | Ave-grad | OCE |
|---|---|---|---|---|---|---|---|
| Experiment 1 | Average | 0.1842 | 0.2908 | 6.3620 | 22.1091 | 3.2870 | 0.0285 |
| | LP | 0.3002 | 0.3017 | 6.5209 | 24.8906 | 3.0844 | 0.0478 |
| | GP | 0.2412 | 0.2953 | 6.3993 | 22.6744 | 3.2336 | 0.0379 |
| | CP | 0.2816 | 0.2961 | 6.4759 | 24.1864 | 3.1292 | 0.0457 |
| | C-P | 0.3562 | 0.4523 | 6.7424 | 31.2693 | 0.5538 | 0.0665 |
| | W-P | 0.3753 | 0.4976 | 6.6142 | 25.2683 | 0.5689 | 0.0662 |
| | proposed | 0.4226 | 0.5010 | 6.9961 | 34.1192 | 0.5410 | 0.0575 |

**Table 1** Comparison of image quality metrics


| Dataset | Algorithm | $Q^{AB/F}$ | Q | EN | STD | Ave-grad | OCE |
|---|---|---|---|---|---|---|---|
| Experiment 1 | Average | 0.1842 | 0.2908 | 6.3620 | 22.1091 | 3.2870 | 0.0285 |
| | LP | 0.3002 | 0.3017 | 6.5209 | 24.8906 | 3.0844 | 0.0478 |
| | GP | 0.2412 | 0.2953 | 6.3993 | 22.6744 | 3.2336 | 0.0379 |
| | CP | 0.2816 | 0.2961 | 6.4759 | 24.1864 | 3.1292 | 0.0457 |
| | C-P | 0.3562 | 0.4523 | 6.7424 | 31.2693 | 0.5538 | 0.0665 |
| | W-P | 0.3753 | 0.4976 | 6.6142 | 25.2683 | 0.5689 | 0.0662 |
| | proposed | 0.4226 | 0.5010 | 6.9961 | 34.1192 | 0.5410 | 0.0575 |
| Experiment 2 | Average | 0.4016 | 0.7581 | 6.1975 | 46.1587 | 2.9600 | 0.0236 |
| | LP | 0.5219 | 0.7530 | 6.9594 | 49.2283 | 3.3738 | 0.0399 |
| | GP | 0.4736 | 0.7599 | 6.9024 | 47.0888 | 3.6190 | 0.0342 |
| | CP | 0.5120 | 0.7475 | 6.9237 | 48.9839 | 3.3812 | 0.0392 |
| | C-P | 0.5658 | 0.7516 | 7.3332 | 54.3504 | 3.0628 | 0.0390 |
| | W-P | 0.4283 | 0.7547 | 6.8543 | 47.3304 | 3.2436 | 0.0346 |
| | proposed | 0.6212 | 0.7775 | 7.1572 | 56.2993 | 2.9046 | 0.0381 |
| Experiment 3 | Average | 0.5021 | 0.7955 | 6.5011 | 41.0552 | 1.0939 | 0.0161 |
| | LP | 0.6414 | 0.7728 | 6.8883 | 47.4990 | 0.9959 | 0.0274 |
| | GP | 0.5720 | 0.7898 | 6.5649 | 41.3974 | 1.0249 | 0.0223 |
| | CP | 0.5909 | 0.7469 | 6.7499 | 43.4631 | 0.9834 | 0.0318 |
| | C-P | 0.5838 | 0.7435 | 6.9451 | 46.5294 | 1.1745 | 0.0262 |
| | W-P | 0.5319 | 0.7788 | 6.5847 | 41.6623 | 1.5318 | 0.0231 |
| | proposed | 0.6230 | 0.7502 | 7.0791 | 55.9533 | 0.5246 | 0.0246 |

**Table 3.** Comparison of image quality metrics

#### **5. Conclusion**

The theory of Shearlets is introduced in this chapter. As a novel MGA tool, shearlets offer more advantages than other MGA tools. The main advantage of shearlets is that they can be studied within the framework of a generalized multi-resolution analysis and with directional subdivision schemes generalizing those of traditional wavelets. This is very relevant for the development of fast algorithmic implementations of the many directional representation systems proposed in the last decade.

In this chapter, we have demonstrated that shearlets are very competitive for multi-focus image and remote sensing image fusion. As a new MGA tool, the Shearlet is equipped with a rich mathematical structure similar to the wavelet and can capture information in any direction, and edge and orientation information is more sensitive than gray level according to human visual perception. We take full advantage of the multidirectionality of Shearlets and of gradient information to fuse images. Moreover, PCNN is selected as the fusion rule to select the fusion coefficients. Because the directional and gradient characteristics facilitate motivating the PCNN neurons, more precise image fusion results are obtained. The several different kinds of images shown in the experiments prove that the new algorithm we proposed in this chapter is effective.

After development in recent years, the theory of Shearlets is gradually maturing, but the time complexity of the Shearlet decomposition has been a focus of study and needs further work, especially in its theory and applications. We will focus on other image processing methods using shearlets in our future work.

#### **Author details**

Miao Qiguang, Shi Cheng and Li Weisheng

#### **References**

[1] S. G. Mallat, Theory for Multiresolution Signal Decomposition: The Wavelet Representation, IEEE Transactions on Pattern Analysis and Machine Intelligence, 11(1989), pp: 674-693.

[2] A. Krista, Z. Yun, D. Peter, Wavelet Based Image Fusion Techniques — An Introduction, Review and Comparison, ISPRS Journal of Photogrammetry and Remote Sensing, 62(2007), pp: 249-263.

[3] J. P. Antoine, P. Carrette, R. Murenzi, B. Piette, Image Analysis with Two-Dimensional Continuous Wavelet Transform, Signal Processing, 31(1993), pp: 241-272.

[4] F. G. Meyer, R. R. Coifman, Brushlets: A Tool for Directional Image Analysis and Image Compression, Applied and Computational Harmonic Analysis, 4(1997), pp: 147-187.

[5] N. Kingsbury, Complex Wavelets for Shift Invariant Analysis and Filtering of Signals, Applied and Computational Harmonic Analysis, 10(2001), pp: 234-253.

[6] P. Brémaud, Mathematical Principles of Signal Processing: Fourier and Wavelet Analysis, New York, 2002.

[7] Y. Xiao-Hui, J. Licheng, Fusion Algorithm for Remote Sensing Images Based on Nonsubsampled Contourlet Transform, Acta Automatica Sinica, 34(2008), pp: 274-281.

[8] E. J. Candes and D. L. Donoho, Continuous Curvelet Transform. I. Resolution of the Wavefront Set, Applied Computational Harmonic Analysis, 19(2005), pp: 162-197.

[9] M. N. Do, M. Vetterli, The Contourlet Transform: An Efficient Directional Multiresolution Image Representation, IEEE Transactions on Image Processing, 14(2005), pp: 2091-2106.


[10] M. QiGuang, W. BaoShu, Multi-Focus Image Fusion Based on Wavelet Transform and Local Energy, Computer Science, 35(2008), pp: 231-235.

[11] Wang-Q Lim, The Discrete Shearlet Transform: A New Directional Transform and Compactly Supported Shearlet Frames, IEEE Transactions on Image Processing, 19(2010), pp: 1166-1180.

[12] G. Easley, D. Labate and Wang-Q Lim, Sparse Directional Image Representations using the Discrete Shearlet Transform, Applied Computational Harmonic Analysis, 25(2008), pp: 25-46.

[13] G. Kutyniok and D. Labate, Construction of Regular and Irregular Shearlet Frames, Journal of Wavelet Theory and Applications, 1(2007), pp: 1-10.

[14] G. Kutyniok and Wang-Q Lim, Image Separation Using Wavelets and Shearlets, Lecture Notes in Computer Science, 6920(2012), pp: 416-430.

[15] K. Guo and D. Labate, Optimally Sparse Multidimensional Representation using Shearlets, SIAM Journal on Mathematical Analysis, 39(2007), pp: 298-318.

[16] G. Kutyniok and D. Labate, Resolution of the Wavefront Set using Continuous Shearlets, Trans. Amer. Math. Soc., 361(2009), pp: 2719-2754. Shearlet webpage: http://www.shearlet.org.

[17] K. Guo, W. Lim, D. Labate, G. Weiss, E. Wilson, The Theory of Wavelets with Composite Dilations, Harmonic Analysis and Applications, 4(2006), pp: 231-249.

[18] K. Guo, W. Lim, D. Labate, G. Weiss, E. Wilson, Wavelets with Composite Dilations and their MRA Properties, Appl. Comput. Harmon. Anal., 20(2006), pp: 231-249.

[19] K. Guo, D. Labate and W. Lim, Edge Analysis and Identification using the Continuous Shearlet Transform, Applied Computational Harmonic Analysis, 27(2009), pp: 24-46.

[20] G. Kutyniok and Wang-Q Lim, Image Separation Using Wavelets and Shearlets, Lecture Notes in Computer Science, 6920(2012), pp: 416-430.

[21] K. Guo, W. Lim, D. Labate, G. Weiss, E. Wilson, Wavelets with Composite Dilations, Electron. Res. Announc. Amer. Math. Soc., 10(2004), pp: 78-87.

[22] R. Eckhorn, H. J. Reitboeck, M. Arndt et al., Feature Linking via Synchronization among Distributed Assemblies: Simulation of Results from Cat Cortex, Neural Computation, 2(1990), pp: 293-307.

[23] R. Eckhorn, H. J. Reitboeck, M. Arndt et al., Feature Linking via Stimulus-Evoked Oscillations: Experimental Results from Cat Visual Cortex and Functional Implications from Network Model, In: Proc. Int. JCNN, Washington D.C., 1(1989), pp: 723-730.

[24] W. Jin, Z. J. Li, L. S. Wei, H. Zhen, The Improvements of BP Neural Network Learning Algorithm, Signal Processing Proceedings, 2000, WCCC-ICSP 2000, 5th International Conference on, 3(2000), pp: 1647-1649.

[25] R. P. Broussard, S. K. Rogers, M. E. Oxley et al., Physiologically Motivated Image Fusion for Object Detection using a Pulse Coupled Neural Network, IEEE Trans. Neural Networks, 10(1999), pp: 554-563.

[26] W. Chen, L. C. Jiao, Adaptive Tracking for Periodically Time-Varying and Nonlinearly Parameterized Systems using Multilayer Neural Networks, IEEE Trans. on Neural Networks, 21(2010), pp: 345-351.

[27] W. Chen, Z. Q. Zhang, Globally Stable Adaptive Backstepping Fuzzy Control for Output-Feedback Systems with Unknown High-Frequency Gain Sign, Fuzzy Sets and Systems, 161(2010), pp: 821-836.

[28] X. B. Qu, J. W. Yan, Image Fusion Algorithm Based on Features Motivated Multi-channel Pulse Coupled Neural Networks, Bioinformatics and Biomedical Engineering, 1(2008), pp: 2103-2106.

[29] X. B. Qu, C. W. Hu, J. W. Yan, Image Fusion Algorithm Based on Orientation Information Motivated Pulse Coupled Neural Networks, Intelligent Control and Automation, 1(2008), pp: 2437-2441.

[30] G. R. Easley, D. Labate, and F. Colonna, Shearlet Based Total Variation Diffusion for Denoising, IEEE Trans. Image Process., 18(2009), pp: 260-268.


## *Edited by Qiguang Miao*

Image fusion is an important branch of information fusion, and it is also an important technology for image understanding and computer vision. The fusion process merges different images into one to obtain a more accurate description of the scene. The original images for image fusion are usually obtained by several different image sensors, or by the same sensor in different operating modes. The fused image can provide more effective information for further image processing, such as image segmentation, object detection and recognition. Image fusion is a young field of study that combines many different disciplines, such as sensors, signal processing, image processing, computing and artificial intelligence. In the past two decades, a large number of research publications have appeared. This book is edited on the basis of these research results, and many research scholars gave a great help to this book.

Photo by rudchenko / iStock

New Advances in Image Fusion
