## **1. Introduction**

Hyperspectral imaging (HSI), also known as imaging spectroscopy, is a technology capable of sampling hundreds of narrow spectral bands across the electromagnetic spectrum through the use of an optical element that disperses the incoming radiation into specific wavelengths [1]. This technology combines the main features of two existing technologies, imaging and spectroscopy, making it possible to exploit both the morphological features and the chemical composition of objects captured by a camera. The interaction between electromagnetic radiation and matter is distinctive for each material; therefore, this technology makes it possible to discriminate among different materials [2]. The characteristic spectral curve associated with a certain material is called its spectral signature or spectral fingerprint, and through its analysis it is possible to differentiate among different materials or substances. The data structure used in HSI comprises both the spectral and spatial features from a given scene, and is referred to as a hyperspectral (HS) cube. **Figure 1** shows a graphical representation of an HS cube with an example of a spectral signature for the top-right pixel.

**Figure 1.** *Example of an HS cube with an example of a spectral signature.*
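As a minimal illustration of the HS cube data structure, the sketch below stores a tiny synthetic cube as nested Python lists indexed as `cube[row][col][band]` and reads out the spectral signature of the top-right pixel. The dimensions, wavelengths and reflectance values are invented for illustration only; real HS cubes contain hundreds of bands and are normally held in array libraries.

```python
# Toy HS cube: 2 rows x 3 columns x 4 spectral bands (all values hypothetical).
wavelengths_nm = [450, 550, 650, 750]

cube = [
    [[0.10, 0.20, 0.30, 0.40], [0.12, 0.22, 0.32, 0.42], [0.50, 0.35, 0.60, 0.80]],
    [[0.11, 0.21, 0.31, 0.41], [0.13, 0.23, 0.33, 0.43], [0.14, 0.24, 0.34, 0.44]],
]


def spectral_signature(cube, row, col):
    """Return the spectrum (one value per band) of a single pixel."""
    return list(cube[row][col])


def band_image(cube, band):
    """Return the 2-D spatial image recorded at a single spectral band."""
    return [[pixel[band] for pixel in row] for row in cube]


# Spectral signature of the top-right pixel (first row, last column).
top_right = spectral_signature(cube, 0, len(cube[0]) - 1)

# Spatial image at the second band (550 nm in this toy example).
image_550 = band_image(cube, 1)
```

The two helper functions reflect the two ways of slicing the cube: fixing a pixel yields a spectrum, while fixing a band yields a conventional grayscale image.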

Although HSI has historically been applied to remote sensing [3], in recent years this technology has become a trending topic in different research fields such as food quality analysis [4, 5], military and security applications [6] or agriculture [7, 8], among many others [9]. HSI is also an emerging imaging modality in the medical field. It has been proven that the interaction between electromagnetic radiation and matter carries useful information for diagnostic purposes [10]. As an alternative diagnostic tool, one of the strengths offered by HSI is that it is completely non-invasive and label-free. In medical research applications, this technology has been employed for more than twenty years in different areas such as the analysis of cancerous tissues in *in-vivo* and *ex-vivo* samples [11], digital and computational pathology [12], melanoma detection [13] or several gastroenterology diseases [14].

In this chapter, a survey of the most common processing frameworks employed in the literature for information extraction in medical HSI will be presented. First, a brief introduction of the optical properties of biological tissues is provided. Second, the most common information extraction methods employed for HSI medical processing are described and discussed, including optical inverse modeling and machine learning methods. The last section summarizes the conclusions reached in this literature analysis.

## **2. Optical properties of biological tissue**

The interaction between light and biological tissues has been proven to be a useful tool to identify and classify several diseases. Absorption, refraction and scattering are the three different types of interaction that can be measured in biological tissues [15]. Light absorption measures the amount of light absorbed and transformed to energy by tissue molecules. Specific wavelengths of the spectrum will present absorption peaks related to the transitions between two energy levels in a molecule, which can provide tissue diagnostic information. Absorption is the inverse measurement of reflectance using HSI systems. The measurement of refraction and reflection of light is based on changes in the speed and direction of the incident light as it enters the tissue. In particular, hemoglobin (Hb) is the major component of the spectral signature of biological tissues between 450 and 600 nm, and spectral differences can be observed in the absorption/reflectance between oxygenated and deoxygenated Hb states [16]. A single absorbance peak is found at 560 nm in deoxygenated Hb, while two absorbance peaks are found at 540 and 580 nm in oxygenated Hb [17]. **Figure 2** shows an example of these Hb signatures published in [18].

**Figure 2.** *Oxy-Hb (a) and deoxy-Hb (b) normalized absorption spectra, with Hb concentrations of 50 g/L and 68 g/L, respectively. The solid lines are experimentally measured, and the dotted black lines are the ideal. Oxy-Hb (c) and deoxy-Hb (d) measured and theoretical attenuation coefficients [18].*

Light scattering occurs when there is a spatial variation of the refractive index in the illuminated tissue. Scattering properties can be highly useful in diagnostic applications, since they vary in tissue affected by a certain disease [19]. For example, the spectral range between 700 and 900 nm is related to the scattering-dominant optical properties of collagen [20]. Also, the near-infrared spectral region is the scattering-dominant region of fat, lipids, collagen, and water. Moreover, several tissues have fluorescence properties that can be revealed when such tissue is excited with certain wavelengths. As an example, ultraviolet light can be used to excite tissues, revealing the fluorescence emission of proteins and nucleic acids [21]. More details about biological tissue optical properties can be found in [19].

*Information Extraction Techniques in Hyperspectral Imaging Biomedical Applications DOI: http://dx.doi.org/10.5772/intechopen.93960*

*Multimedia Information Retrieval*

## **3. Information extraction methods for HSI**

There are two main types of medical HSI processing: optical inverse modeling and machine learning approaches. In this section, both methods will be presented in detail, showing their main characteristics, as well as their advantages and disadvantages.

#### **3.1 Optical inverse modeling**

In optical inverse modeling techniques, a mathematical equation that models the interaction between light and tissue is proposed, and the collected HS data are used to extract optical properties, such as the absorption or scattering of tissue. First, a physics-based model is proposed for the light propagation in tissues. Second, the HS data are used to extract optical properties from the proposed light propagation model. Although the number of studies that make use of this type of approach is limited, some researchers have combined HS data with light transport models in tissue to extract useful information for the detection of different diseases or conditions. Milanic *et al.* used Monte Carlo simulations of a light transport model in skin to extract information about melanin content and blood saturation, with the goal of measuring cholesterol levels in human skin [22]. The same authors applied a similar analysis to skin HS data, but with the goal of detecting arthritis [23]. Claridge *et al.* demonstrated the utility of optical inverse modeling techniques for the estimation of the blood volume fraction of ex-vivo colon samples, showing statistically significant differences between the blood volume fraction of tumor and healthy conditions [24].

The use of optical inverse modeling for information extraction in medical HSI presents some advantages and challenges. The main advantage is the availability of an established physics-based model for correlating measured data; such models are theoretically well founded and contain tissue optical parameters that can be used for diagnostics. The main disadvantage of this approach is the possibility of bias in the model development and over-simplification of complex physical processes, which could result in suboptimal performance for information extraction.
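The two-step idea of proposing a forward model and then inverting it can be sketched with a deliberately simplified example. The code below assumes a linear, Beer-Lambert-style mixing model in which the absorbance at each band is a weighted sum of two chromophore spectra, and it recovers the two weights by least squares. This is not the Monte Carlo light transport modeling used in [22, 23]; the function names, the extinction spectra and the concentrations are all hypothetical.

```python
def forward_model(concentrations, extinction):
    """Predict absorbance per band as a linear mix of chromophore spectra."""
    n_bands = len(extinction[0])
    return [
        sum(c * spectrum[b] for c, spectrum in zip(concentrations, extinction))
        for b in range(n_bands)
    ]


def invert_two_chromophores(measured, extinction):
    """Least-squares estimate of two concentrations via the normal equations."""
    e1, e2 = extinction
    a11 = sum(x * x for x in e1)
    a12 = sum(x * y for x, y in zip(e1, e2))
    a22 = sum(y * y for y in e2)
    b1 = sum(x * m for x, m in zip(e1, measured))
    b2 = sum(y * m for y, m in zip(e2, measured))
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("chromophore spectra are linearly dependent")
    # Cramer's rule for the 2x2 symmetric system.
    c1 = (b1 * a22 - a12 * b2) / det
    c2 = (a11 * b2 - a12 * b1) / det
    return c1, c2


# Hypothetical extinction spectra of two chromophores sampled at four bands.
extinction = [
    [1.0, 0.6, 0.2, 0.1],
    [0.2, 0.5, 0.9, 0.4],
]

# Simulate a noise-free measurement, then invert the model to recover the inputs.
true_concentrations = (0.7, 0.3)
measured = forward_model(true_concentrations, extinction)
estimated = invert_two_chromophores(measured, extinction)
```

With noise-free synthetic data the inversion returns the simulated concentrations. In practice the forward model is non-linear and the inversion is iterative, which is where the model bias discussed above becomes relevant.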

#### **3.2 Machine learning methods**

Machine Learning (ML) methods are algorithms able to learn from data. ML algorithms enable solutions to difficult tasks which usually cannot be performed by a traditionally designed computer program [25]. There are different ML algorithms depending on the task they perform. In regression problems, a numerical variable is estimated from the data. In the context of medical HSI, Arimoto *et al.* used regression techniques to estimate the oxygen saturation map from human retina [26]. In classification problems, the objective is to assign a data sample to a fixed category. For example, Fabelo *et al.* used classification to identify normal tissue, tumor tissue, hypervascularized tissue and background in HS images from in-vivo human brain tissue [27]. The results of the classification of a medical HS image are usually represented as a classification map or heat map, where different colors are used for each class (**Figure 3**).

ML algorithms can be classified as supervised and unsupervised. In unsupervised algorithms, the goal is to cluster similar data samples in groups, extracting the information from data features. In supervised algorithms, the data comprise the data features and their associated labels [29]. For instance, in **Figure 3A**, the data features consist of the spectra of each pixel of the HS image, and the labels are the different categories to which each pixel can be assigned, i.e. normal tissue, tumor tissue, hypervascularized tissue and background. The main goal of supervised algorithms is to use data and their labels to train a model which can be used to perform predictions about new data. ML techniques can be categorized as Feature Learning (FL) or Deep Learning (DL) methods. In FL approaches, the inputs of a supervised classifier are given by features extracted from the data. For example, in an image processing framework, such features may be related to shape, texture or color. In contrast, DL approaches use all the data as input to a supervised classifier, and the important features to perform the classification task are learned by the supervised classifier.

**Figure 3.** *Example of classification and heat maps obtained through ML classification from (A) in-vivo brain tissue HS images [27] and (B) in-vitro H&E brain tissue HS images [28].*

There are challenges related to both types of ML approaches. On the one hand, in FL methods, the classification may be biased by which features are selected from the data, while the identification of features is performed automatically in a DL algorithm. On the other hand, DL methods usually require large amounts of data to succeed in the feature extraction and classification, while FL approaches may provide good performance with a limited dataset. Next, we provide a survey of the different ML approaches which are commonly used for HSI processing in medical applications.

#### *3.2.1 Feature learning*

In this section, we describe the most common FL approaches which have been employed for processing medical HS data. This section is organized into three main categories, namely pixel-wise classification, feature extraction and selection methods, and the usage of both the spatial and spectral information.

#### *3.2.1.1 Pixel-wise classifiers*

In the HS literature, the concept of pixel-wise processing refers to the exclusive usage of the spectral information within an HS cube for extracting information from HS data. Recently, Ghamisi *et al.* surveyed the most commonly used classifiers in pixel-wise classification of HS images [30]. The most common classifiers used for the classification of HS images from a feature learning perspective are Support Vector Machines (SVMs), Random Forest (RF) and Multinomial Logistic Regression (MLR) based approaches.
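The pixel-wise idea can be illustrated with a short sketch in which each pixel is classified from its spectrum alone and the result is arranged as a classification map. To keep the example self-contained, a nearest-centroid rule is used as a stand-in classifier; it is not one of the classifiers surveyed in [30], and the training spectra, labels and helper names are hypothetical.

```python
import math


def class_centroids(spectra, labels):
    """Mean spectrum of each class, computed from labeled pixels."""
    sums, counts = {}, {}
    for spectrum, label in zip(spectra, labels):
        acc = sums.setdefault(label, [0.0] * len(spectrum))
        for b, value in enumerate(spectrum):
            acc[b] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def classify_pixel(spectrum, centroids):
    """Assign the label of the closest centroid (Euclidean distance)."""
    return min(centroids, key=lambda label: math.dist(spectrum, centroids[label]))


def classification_map(cube, centroids):
    """Classify every pixel using only its spectrum; spatial context is ignored."""
    return [[classify_pixel(pixel, centroids) for pixel in row] for row in cube]


# Hypothetical labeled training spectra (three bands each).
train_spectra = [[0.1, 0.2, 0.3], [0.2, 0.2, 0.4], [0.8, 0.7, 0.6], [0.9, 0.8, 0.6]]
train_labels = ["normal", "normal", "tumor", "tumor"]
centroids = class_centroids(train_spectra, train_labels)

# Hypothetical 2 x 2 HS cube to be classified.
cube = [
    [[0.1, 0.25, 0.3], [0.9, 0.7, 0.5]],
    [[0.8, 0.8, 0.7], [0.2, 0.1, 0.4]],
]
label_map = classification_map(cube, centroids)
```

Because neighboring pixels never influence each other, swapping the stand-in rule for any other spectral classifier leaves the structure of `classification_map` unchanged.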

SVM is a binary classification algorithm proposed by Vapnik [31]. The algorithm finds the optimal hyperplane that maximizes the margin between samples belonging to different classes. Although it was originally designed for linear classification, an SVM classifier can be used for nonlinear classification problems by using different kernels to map the data into a higher dimensional space. SVM has been shown to provide competitive classification performance on HS data even with a limited training sample size [32].
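As a hedged sketch of how a kernel lets an SVM produce a nonlinear decision, the code below evaluates the standard kernel SVM decision function, a weighted sum of kernel similarities to the support vectors plus a bias, using a Gaussian (RBF) kernel. The support vectors, multipliers and bias are made-up values rather than the result of training, so the example shows only the prediction step.

```python
import math


def rbf_kernel(x, z, gamma=1.0):
    """Gaussian (RBF) kernel: similarity decays with squared Euclidean distance."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))


def svm_decision(x, support_vectors, labels, alphas, bias, gamma=1.0):
    """Kernel SVM decision value: sum_i alpha_i * y_i * K(sv_i, x) + bias."""
    return bias + sum(
        alpha * y * rbf_kernel(sv, x, gamma)
        for sv, y, alpha in zip(support_vectors, labels, alphas)
    )


# Hypothetical "trained" model: one support vector per class (two bands each).
support_vectors = [[0.2, 0.3], [0.8, 0.7]]
labels = [-1, +1]
alphas = [1.0, 1.0]
bias = 0.0

# A new spectrum close to the positive support vector.
score = svm_decision([0.75, 0.65], support_vectors, labels, alphas, bias, gamma=5.0)
predicted = 1 if score >= 0 else -1
```

The sign of the decision value gives the binary class. Multi-class problems, such as the four tissue classes mentioned earlier, are usually handled by combining several binary SVMs.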

RF was first proposed by Breiman [33]. This algorithm consists of an ensemble of decision trees, where, in each decision tree, the training data are hierarchically partitioned into smaller homogeneous groups. In RF, different decision trees are generated from the training data, and the different classification results are combined by a voting process. The main advantage of RF is a reduced training time. RF has been successfully used for the classification of HS images [34].
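The voting step that combines the individual trees can be sketched as follows. For brevity, each tree is replaced by a hand-written one-split stump on a single band; in an actual RF the trees are learned from bootstrap samples of the training data, which is omitted here. The bands, thresholds and labels are hypothetical.

```python
from collections import Counter


def make_stump(band, threshold, label_low, label_high):
    """Build a one-split tree that thresholds a single spectral band."""
    def stump(spectrum):
        return label_high if spectrum[band] > threshold else label_low
    return stump


def ensemble_predict(trees, spectrum):
    """Combine the individual tree predictions by majority vote."""
    votes = Counter(tree(spectrum) for tree in trees)
    return votes.most_common(1)[0][0]


# Three hypothetical trees, each looking at a different band.
trees = [
    make_stump(band=0, threshold=0.5, label_low="normal", label_high="tumor"),
    make_stump(band=1, threshold=0.4, label_low="normal", label_high="tumor"),
    make_stump(band=2, threshold=0.6, label_low="normal", label_high="tumor"),
]

# Two of the three trees vote "tumor" for this spectrum.
prediction = ensemble_predict(trees, [0.7, 0.3, 0.9])
```

An odd number of trees is used here so that a two-class vote cannot tie.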

Finally, MLR [35] approaches exploit the posterior class distributions of the training data for making predictions, and these methods have been successfully applied for the classification of HS images. The main advantages of MLR are fast computation for training and customizability, which allows modifications to the original algorithm to provide better generalization, e.g. sparsity constraints or multiple feature learning.
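The posterior class distributions used by MLR can be sketched as a softmax over one linear score per class. The weights and biases below are hypothetical placeholders rather than fitted parameters, so the example shows only how the posteriors are computed for a given spectrum.

```python
import math


def mlr_posteriors(spectrum, weights, biases):
    """Class posteriors p(class | spectrum) from linear scores via softmax."""
    scores = [
        sum(w * x for w, x in zip(class_weights, spectrum)) + bias
        for class_weights, bias in zip(weights, biases)
    ]
    shift = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical parameters for three classes and two spectral bands.
weights = [[1.0, -1.0], [-1.0, 1.0], [0.0, 0.0]]
biases = [0.0, 0.0, 0.0]

posteriors = mlr_posteriors([0.8, 0.2], weights, biases)
predicted_class = max(range(len(posteriors)), key=posteriors.__getitem__)
```

The predicted class is the one with the highest posterior, and the posteriors themselves can be displayed as a per-class heat map.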

In the context of medical HS classification, several authors have utilized the spectral information for the diagnosis of different diseases in a pixel-wise manner. The most commonly used pixel-wise classifier in medical HSI is SVM. In the context of surgical guidance, Akbari *et al.* processed HS images from the abdomen to detect intestinal ischemia [36]. For cancer detection, SVM and HSI have been used for the identification of gastric cancer [37], prostate cancer [38], tongue cancer [39], and skin cancer [40]. Although RF and MLR have been widely used for HS information extraction, their usage in medical HSI is limited. RF has been used for the detection of in-vivo oral cancer [41], while MLR has been considered for identification of ulcerative colitis in histological slides [42]. The main challenge in this field is to determine which pixel-wise classifier is more suitable for the classification of certain HS data. In this sense, some authors have performed comparisons of performance of different pixel-wise classifiers for the detection of brain cancer in histological slides [43], or the detection of the tumor margins in head and neck ex-vivo tissue [44]. Although SVM has been shown to outperform other classifiers, a deeper comparison between different classifiers should be urgently performed to definitively demonstrate which pixel-wise classifier performs better with HSI across multiple applications.

#### *3.2.1.2 Feature extraction and feature selection*

HS data are characterized by a high dimensionality. For this reason, instead of exploiting the complete spectral signature for image analysis, one trend in HSI processing is the use of Dimensionality Reduction (DR) methods. These methods aim to reduce the dimensionality of the original data while preserving the most relevant information [45]. DR methods have been extensively used for HS image processing. There are two main types of DR approaches: feature extraction and feature selection methods.

In feature extraction methods, a transformation is applied to the data to generate a new representation with lower dimensionality but similar information content. The most studied DR algorithm for HSI is Principal Component Analysis (PCA). The goal of PCA [46] is to find a linear transformation of the data, based on orthogonal projections, that maximizes the variance retained from the original data. In addition, several other data transformation approaches have been proposed for dimensionality reduction, such as wavelet transformations [47], different orthogonal projection approaches, or the exploitation of manifold embedding [48].
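A minimal sketch of PCA as a feature extraction step is given below. It mean-centers a set of spectra, builds their covariance matrix and finds the first principal component as the leading eigenvector of that matrix using power iteration. The data and function names are hypothetical, and only the first component is computed; practical implementations rely on a full eigendecomposition or singular value decomposition.

```python
import math


def mean_center(spectra):
    """Subtract the per-band mean from every spectrum."""
    n = len(spectra)
    means = [sum(s[b] for s in spectra) / n for b in range(len(spectra[0]))]
    return [[v - m for v, m in zip(s, means)] for s in spectra]


def covariance(centered):
    """Sample covariance matrix between bands."""
    n, d = len(centered), len(centered[0])
    return [
        [sum(row[i] * row[j] for row in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]


def first_component(cov, iterations=200):
    """Leading eigenvector of the covariance matrix by power iteration.

    Assumes the starting vector is not orthogonal to the leading eigenvector.
    """
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def project(centered, component):
    """Reduce each spectrum to a single score along the component."""
    return [sum(x * c for x, c in zip(row, component)) for row in centered]


# Hypothetical spectra: bands 0 and 1 vary together, band 2 is constant.
spectra = [[1.0, 2.0, 0.5], [2.0, 4.0, 0.5], [3.0, 6.0, 0.5], [4.0, 8.0, 0.5]]
centered = mean_center(spectra)
component = first_component(covariance(centered))
scores = project(centered, component)
```

Each three-band spectrum is reduced to a single score. Note that the component mixes bands 0 and 1, so the score no longer corresponds to any individual wavelength.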

Nevertheless, in feature extraction methods the data are transformed, and thus the physical information about specific wavelengths is lost, which means that the provided interaction between light and tissue cannot be analyzed, which may affect certain applications. For this reason, feature selection methods are devoted to find the most relevant features from the original data by keeping the most relevant information. In the context of HSI, feature selection methods are also known as

**47**

*Information Extraction Techniques in Hyperspectral Imaging Biomedical Applications*

about more sophisticated band selection techniques can be found in [51].

of Fourier Series coefficients for breast cancer detection [61].

compared to the full-spectra counterpart.

The use of band selection methods for medical HSI applications is not as extended as in other fields, such as remote sensing. However, some researchers have successfully exploited different band selection methods in HSI. Goto *et al.* used the Mahalanobis distance to determine the optimal wavelengths for gastric cancer, correctly identifying normal and tumor mucosa [62]. Additionally, mRMR has been used for the identification of the most relevant bands for ex-vivo breast cancer detection [61], and for in-vivo head and neck cancer [63]. Finally, Martinez-Vega *et al.* proposed a search-based method based on different optimization algorithms for the identification of the most relevant wavelengths for brain tumor detection within in-vivo HS images [64]. The optimization function was the pixel-wise classification performance metrics obtained by an SVM classifier. The results demonstrated that a GA optimization slightly improves tumor identification

Both feature selection and feature extraction methods aim to reduce the dimensionality of HS data while retaining the most important information. Successful application of these techniques leads to reduced computational time, which is

In the context of medical HSI, feature extraction methods are used both as standalone methods and as a preprocessing stage before further data analysis. The former approach is to enhance the visualization of data, while the latter reduces the complexity of the data for being processed by other machine learning approaches. As an example of the direct application of PCA for tissue visualization enhancement, Zuzak *et al.* applied PCA to abdominal HS images in order to enhance the visualization of biliary trees using in-vivo samples [52]. Also, Wilson *et al.* demonstrated the ability of HSI for melanin detection in histological unstained specimens of melanocytic lesions in the skin and the eye using PCA and false-color representations of data [53]. PCA has been used for extracting the most important features of HS data prior to classification in different applications, such as the detection of in-vivo oral cancer [54], prostate cancer in histological slides [55], the identification of white blood cells in blood smear slides [56] or the intraoperative delineation of brain tumors [57]. Another example of the utility of feature extraction methods was demonstrated by Hadoux *et al.*, where relevant differences between the retinal spectral data from patients with Alzheimer and healthy patients were found after applying an orthogonal projection of data [58]. Such differences in the spectral signature from different disease states were not possible using the raw spectral signature of tissue. Beyond PCA and orthogonal projection methods, Ravi *et al.* proposed a modification of the t-Distributed Stochastic Neighbor Embedding feature extraction algorithm, a non-linear dimensionality reduction technique, prior to the identification of tumor tissue within in-vivo brain samples [59]. Other feature extraction methods used in medical HSI prior to classification are the use of wavelet transformation for the detection of prostate cancer in mice models [60], or the use

band selection methods, which also seek to identify the most relevant spectral features for a certain application. There are several types of band selection methods. In this chapter, we only describe the most prominent methods used in medical HSI. In a large-dissimilarity criteria approach, the goal is to select the most dissimilar spectral bands. Conversely, in a low-correlation criterion, the spectral bands showing low correlation between each other are selected. An example of this kind of algorithm is Maximum Relevance Minimum Redundancy (mRMR). In search-based methods, the band selection is performed by solving an optimization problem driven by a given optimization function. These algorithms search for the best bands to solve such optimization problem. Some search-based methods used in HSI are Genetic Algorithm (GA) [49] or Particle Swarm Optimization (PSO) [50]. Further details

*DOI: http://dx.doi.org/10.5772/intechopen.93960*

#### *Information Extraction Techniques in Hyperspectral Imaging Biomedical Applications DOI: http://dx.doi.org/10.5772/intechopen.93960*

*Multimedia Information Retrieval*

partitioned into smaller homogeneous groups. In RF, different decision trees are generated from the training data, and their classification results are combined by a voting process. The main advantage of RF is a reduced training time. RF has been successfully used for the classification of HS images [34].

Finally, MLR [35] approaches exploit the posterior class distributions of the training data for making predictions, and these methods have been successfully applied to the classification of HS images. The main advantages of MLR are fast training and customizability, which allows modifications to the original algorithm to provide better generalization, e.g. sparsity constraints or multiple feature learning.

In the context of medical HS classification, several authors have utilized the spectral information for the diagnosis of different diseases in a pixel-wise manner. The most commonly used pixel-wise classifier in medical HSI is SVM. In the context of surgical guidance, Akbari *et al.* processed HS images from the abdomen to detect intestinal ischemia [36]. For cancer detection, SVM and HSI have been used for the identification of gastric cancer [37], prostate cancer [38], tongue cancer [39], and skin cancer [40]. Although RF and MLR have been widely used for HS information extraction, their usage in medical HSI is limited. RF has been used for the detection of in-vivo oral cancer [41], while MLR has been considered for the identification of ulcerative colitis in histological slides [42]. The main challenge in this field is to determine which pixel-wise classifier is most suitable for the classification of a given type of HS data. In this sense, some authors have compared the performance of different pixel-wise classifiers for the detection of brain cancer in histological slides [43], or the detection of tumor margins in head and neck ex-vivo tissue [44]. Although SVM has been shown to outperform other classifiers, a deeper comparison between different classifiers is urgently needed to definitively demonstrate which pixel-wise classifier performs better with HSI across multiple applications.

#### *3.2.1.2 Feature extraction and feature selection*

HS data are characterized by high dimensionality. For this reason, instead of exploiting the complete spectral signature for image analysis, one trend in HSI processing is the use of Dimensionality Reduction (DR) methods. These methods aim to reduce the dimensionality of the original data while preserving the most relevant information [45]. DR methods have been extensively used for HS image processing. There are two main types of DR approaches: feature extraction and feature selection methods.

On the one hand, in feature extraction methods, a transformation is applied to the data to generate a new representation with lower dimensionality but similar information content. The most studied DR algorithm for HSI is Principal Component Analysis (PCA). The goal of PCA [46] is to find a linear transformation of the data, based on orthogonal projections, that maximizes the variance retained from the original data. In addition, several other data transformation approaches have been proposed for dimensionality reduction, such as wavelet transformations [47], different orthogonal projection approaches, or the exploitation of manifold embedding [48].

Nevertheless, in feature extraction methods the data are transformed, and thus the physical information about specific wavelengths is lost. As a result, the interaction between light and tissue at specific wavelengths can no longer be analyzed, which may affect certain applications. For this reason, feature selection methods aim to find the most relevant features within the original data, without transforming them. In the context of HSI, feature selection methods are also known as band selection methods, which also seek to identify the most relevant spectral features for a certain application. There are several types of band selection methods. In this chapter, we only describe the most prominent methods used in medical HSI. In a large-dissimilarity criterion approach, the goal is to select the most dissimilar spectral bands. Similarly, in a low-correlation criterion approach, the spectral bands showing low correlation with each other are selected. An example of this kind of algorithm is Maximum Relevance Minimum Redundancy (mRMR). In search-based methods, band selection is performed by solving an optimization problem driven by a given optimization function, and the algorithm searches for the bands that best solve that problem. Some search-based methods used in HSI are Genetic Algorithm (GA) [49] and Particle Swarm Optimization (PSO) [50]. Further details about more sophisticated band selection techniques can be found in [51].

In the context of medical HSI, feature extraction methods are used both as standalone methods and as a preprocessing stage before further data analysis. The former approach enhances the visualization of data, while the latter reduces the complexity of the data before they are processed by other machine learning approaches. As an example of the direct application of PCA for tissue visualization enhancement, Zuzak *et al.* applied PCA to abdominal HS images in order to enhance the visualization of biliary trees using in-vivo samples [52]. Also, Wilson *et al.* demonstrated the ability of HSI to detect melanin in unstained histological specimens of melanocytic lesions in the skin and the eye using PCA and false-color representations of data [53]. PCA has been used for extracting the most important features of HS data prior to classification in different applications, such as the detection of in-vivo oral cancer [54], prostate cancer in histological slides [55], the identification of white blood cells in blood smear slides [56] or the intraoperative delineation of brain tumors [57]. Another example of the utility of feature extraction methods was demonstrated by Hadoux *et al.*, who found relevant differences between the retinal spectral data from patients with Alzheimer's disease and healthy subjects after applying an orthogonal projection of the data [58]. Such differences between disease states could not be observed using the raw spectral signature of tissue. Beyond PCA and orthogonal projection methods, Ravi *et al.* proposed a modification of the t-Distributed Stochastic Neighbor Embedding feature extraction algorithm, a non-linear dimensionality reduction technique, prior to the identification of tumor tissue within in-vivo brain samples [59]. Other feature extraction methods used in medical HSI prior to classification are wavelet transformations for the detection of prostate cancer in mouse models [60], and Fourier series coefficients for breast cancer detection [61].

The use of band selection methods in medical HSI applications is not as widespread as in other fields, such as remote sensing. However, some researchers have successfully exploited different band selection methods in medical HSI. Goto *et al.* used the Mahalanobis distance to determine the optimal wavelengths for gastric cancer, correctly identifying normal and tumor mucosa [62]. Additionally, mRMR has been used for the identification of the most relevant bands for ex-vivo breast cancer detection [61], and for in-vivo head and neck cancer [63]. Finally, Martinez-Vega *et al.* proposed a search-based method based on different optimization algorithms for the identification of the most relevant wavelengths for brain tumor detection within in-vivo HS images [64]. The optimization function was built from the pixel-wise classification performance metrics obtained by an SVM classifier. The results demonstrated that a GA optimization slightly improves tumor identification compared to the full-spectrum counterpart.

Both feature selection and feature extraction methods aim to reduce the dimensionality of HS data while retaining the most important information. Successful application of these techniques leads to reduced computational time, which is required in applications such as surgical guidance. Nevertheless, for biomedical HS applications, band selection methods offer some relevant advantages over feature extraction methods. The first advantage is that the information about the specific wavelengths used is retained, which allows further analysis of the physical response of different tissues to those wavelengths. The second advantage is the possibility of developing custom HS cameras that capture only the most relevant spectral channels for a given application. Such reduced-band cameras would be able to acquire HS video, which would also be convenient for some surgical guidance applications.
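The contrast between the two DR families can be made concrete with a small sketch. The code below is illustrative only and is not taken from any of the cited works: `pca_reduce` and `select_low_correlation_bands` are hypothetical helper names, the cube is assumed to be a NumPy array of shape (rows, cols, bands), and the greedy low-correlation rule is just one simple way of realizing a low-correlation criterion (it is not mRMR).

```python
import numpy as np


def pca_reduce(cube, n_components):
    """Feature extraction: project each pixel spectrum onto the top principal components."""
    rows, cols, bands = cube.shape
    pixels = cube.reshape(-1, bands).astype(float)
    centered = pixels - pixels.mean(axis=0)
    # Eigen-decomposition of the (bands x bands) covariance matrix.
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    # eigh returns ascending eigenvalues; keep the directions of largest variance.
    order = np.argsort(eigvals)[::-1][:n_components]
    scores = centered @ eigvecs[:, order]
    return scores.reshape(rows, cols, n_components)


def select_low_correlation_bands(cube, n_bands):
    """Band selection: greedily keep original bands that are weakly correlated.

    Assumes no band is constant (a constant band has an undefined correlation).
    """
    bands = cube.shape[-1]
    pixels = cube.reshape(-1, bands).astype(float)
    corr = np.abs(np.corrcoef(pixels, rowvar=False))
    # Start from the band with the highest variance (an arbitrary but common choice).
    selected = [int(np.argmax(pixels.var(axis=0)))]
    while len(selected) < n_bands:
        # For each candidate, its strongest correlation with any band already kept.
        worst = corr[:, selected].max(axis=1)
        worst[selected] = np.inf  # never pick the same band twice
        selected.append(int(np.argmin(worst)))
    return sorted(selected)
```

Note that the output of `pca_reduce` mixes all wavelengths in every component, whereas `select_low_correlation_bands` returns indices that map directly back to physical spectral channels, which is the property that makes band selection attractive for custom cameras.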

#### *3.2.1.3 Spatial-spectral information*

Although the aforementioned data processing methods rely on spectral information alone, an HS cube is a 3D data structure containing both the spatial and the spectral information of a scene. In a recent review, He *et al.* surveyed the different spatial-spectral techniques that have been used for the classification of HSI [65]. The inclusion of both spectral and spatial information is motivated by the limitations of spectral data. First, the high dimensionality of spectral data together with a limited dataset can lead to the *curse of dimensionality*. High dimensionality offers more detailed information about the captured scene, but it also introduces redundant information and increases the computational time required to process the data [4]. Second, the high variability of spectral data due to different lighting conditions, instrumentation noise, or other phenomena makes classification based only on spectral information a challenge. In addition, high intra-class and low inter-class variability of the spectral signatures makes it difficult to differentiate between classes. This problem is particularly challenging in biomedical data, where data originate from multiple patients. For these reasons, researchers within the HSI processing community have successfully improved on pixel-wise classification approaches by using both spatial and spectral features from HS images.

In [65], the authors classified spatial-spectral approaches into three main types, depending on how the spatial information is integrated in the processing framework. In pre-processing approaches, spatial and spectral features are extracted from the HS cube, and then such features are used for the classification. In integrated classification, both spatial and spectral features are used to train the classifier. Finally, in post-processing approaches, the spatial information is employed to refine the results of a pixel-wise processing of the HS cube.

In the context of medical HSI processing, most spatial-spectral approaches have focused on pre-processing and post-processing schemes. Some pre-processing approaches are the following. For leukemia detection in blood smear slides, Wang *et al.* evaluated three types of inputs for a supervised classifier: spatial features, spectral features, and spatial-spectral features. The results of this study suggest that exploiting both the spatial and the spectral features significantly improves the quality of the classification [66]. Similarly, Li *et al.* evaluated the feasibility of utilizing HSI for Red Blood Cell (RBC) counting. After conducting the RBC counting using only spatial or only spectral features of blood cells, the authors found an improvement in the under-counting and over-counting rates when they performed the image analysis using both types of features together [67]. Ortega *et al.* made use of the spatial information of the HS data by performing superpixel segmentation [68]. Among post-processing approaches, Fabelo *et al.* proposed incorporating the spatial information into the SVM pixel-wise classification by using a K-nearest neighbors spatial filter, which makes use of a one-dimensional representation of the HS cube extracted using PCA, for the identification of in-vivo brain tumor [57].
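To illustrate the idea behind post-processing schemes, the sketch below refines a pixel-wise label map with a simple sliding-window majority vote. This is a deliberately simplified stand-in, not the K-nearest neighbors filter of [57]: the function name `majority_filter`, the window size, and the assumption that labels are non-negative integers stored in a 2D NumPy array are all choices made here for illustration.

```python
import numpy as np


def majority_filter(label_map, window=3):
    """Replace each label by the most frequent label in its window x window neighbourhood."""
    if window % 2 == 0:
        raise ValueError("window must be odd")
    pad = window // 2
    # Repeat edge labels so border pixels also have a full neighbourhood.
    padded = np.pad(label_map, pad, mode="edge")
    n_classes = int(label_map.max()) + 1
    rows, cols = label_map.shape
    votes = np.zeros((n_classes, rows, cols), dtype=int)
    # Accumulate votes by shifting the padded map over every window offset.
    for dr in range(window):
        for dc in range(window):
            shifted = padded[dr:dr + rows, dc:dc + cols]
            for c in range(n_classes):
                votes[c] += (shifted == c)
    # Ties are resolved in favour of the lowest class index.
    return votes.argmax(axis=0)
```

Isolated misclassified pixels are removed by such a filter, while straight class boundaries are preserved. A filter driven by spectral similarity, as in the KNN approach, additionally avoids smoothing across tissue boundaries.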


### *3.2.2 Deep learning methods*


Deep Learning (DL) is a family of machine learning algorithms that learn abstract features to best represent the data and make predictions about new data. More specifically, neural networks (NNs) consist of consecutive layers of neurons with non-linear activations that take the input data, extract features, and connect to outputs representing the class labels in order to provide prediction probabilities. Neural networks can have various dimensionalities, which largely depend on the size and dimensions of the input data. For example, utilizing only spectral signature information, a 1-D NN can extract features with fully-connected layers or 1-D convolutions. However, HS cameras acquire spatial information and spectral signatures simultaneously. Therefore, to exploit both sets of features, pseudo 3-D HS data can be input directly into a 2D-CNN, which extracts spatial features with learned convolutional kernels in the spatial domain; these filters are connected across the entire spectral domain of the HS data. Lastly, a 3D-CNN can utilize the full pseudo 3-D HS data as input and extract spatial-spectral features with 3D convolutional kernels. There are numerous approaches, but these methods require more computational processing as more features and dimensions are involved.
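The growth in computational cost with dimensionality can be estimated with simple arithmetic. The sketch below counts multiply-accumulate operations for a single convolutional layer applied to a 25 × 25 × 91 patch (the patch size quoted in the caption of Figure 4). The helper `conv_macs`, the kernel size of 3, the 16 output filters, and the use of unpadded, stride-1 convolutions are assumptions made purely for illustration.

```python
def conv_macs(input_shape, kernel_shape, in_channels, out_channels):
    """Multiply-accumulate count of one unpadded, stride-1 convolutional layer."""
    positions = 1
    kernel_volume = 1
    for size, k in zip(input_shape, kernel_shape):
        positions *= size - k + 1  # number of valid kernel placements along this axis
        kernel_volume *= k
    return positions * kernel_volume * in_channels * out_channels


filters = 16  # assumed number of output filters

# 1-D: one pixel spectrum of 91 bands treated as a single-channel signal.
macs_1d = conv_macs((91,), (3,), 1, filters)

# 2-D: 25 x 25 spatial patch, with the 91 bands acting as input channels.
macs_2d = conv_macs((25, 25), (3, 3), 91, filters)

# 3-D: the full 25 x 25 x 91 volume as a single-channel input.
macs_3d = conv_macs((25, 25, 91), (3, 3, 3), 1, filters)

print(macs_1d, macs_2d, macs_3d)
```

Under these assumptions the 3-D layer needs roughly three times as many operations as the 2-D layer, and both are orders of magnitude more expensive than the 1-D layer applied to a single spectrum.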

The most widely used approach is the 2D-CNN. Aggressive brain tumors, such as glioblastoma, often require surgical resection for treatment, and surgeons often rely on multiple imaging modalities, including fluorescence, to aid in this very challenging task. In a pilot study to aid brain surgeons with label-free HSI, Fabelo *et al.* compared a 2D-CNN and a 1D-DNN, considering spectral-only and spectral-spatial classification using DL [69]. In HSI digital histology, Ortega *et al.* detected glioblastoma brain cancer in digital slides using a patch-based 2D-CNN approach [70]. Additionally, Halicek *et al.* have employed very deep 2D-CNNs for classification, specifically the widely-used Inception v4 model (**Figure 4**) implemented in a sliding patch-based approach for head and neck squamous cancer [71] and thyroid and salivary gland cancers [72]. To compare 2D-CNNs and 3D-CNNs, in [73] Halicek *et al.* evaluated spatial-spectral convolutions in 3D-CNNs with 3D convolutional kernels against 2D approaches. Although data were limited to only 12 patients, the preliminary results suggest that 3D convolutions outperformed 2D convolutions for CNN design, at the cost of computational power and speed.
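A sliding patch-based approach can be sketched as follows. This is a generic illustration under stated assumptions, not the pipeline of [70–72]: `extract_patches` is a hypothetical helper, the cube is assumed to be a NumPy array of shape (rows, cols, bands), the patch size is assumed to be odd, and border pixels that cannot be centred in a full patch are simply skipped.

```python
import numpy as np


def extract_patches(cube, patch_size=25, stride=1):
    """Return all full patches of a cube together with their centre coordinates."""
    if patch_size % 2 == 0:
        raise ValueError("patch_size must be odd")
    rows, cols, _ = cube.shape
    half = patch_size // 2
    patches, centers = [], []
    for r in range(half, rows - half, stride):
        for c in range(half, cols - half, stride):
            # Each patch keeps every spectral band of its spatial neighbourhood.
            patches.append(cube[r - half:r + half + 1, c - half:c + half + 1, :])
            centers.append((r, c))
    return np.stack(patches), centers
```

Each patch would then be classified by the CNN and its predicted label written back at the corresponding centre coordinate, which is how a classification map is reconstructed from patch-wise predictions.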

Another desired application of DL for HSI is semantic segmentation, which allows the entire scene to be classified at once from the spectral-spatial features of the whole scene. Unlike patch-based 2D-CNN approaches, semantic segmentation does not require image reconstruction. The most commonly used method is the U-Net, first used in HSI by Trajanovski *et al.* for tongue cancer detection, with 2D input data using all HS channels for semantic segmentation of ex-vivo specimens [74]. Additionally, Kho *et al.* used ex-vivo specimens from patients with breast cancer and applied a standard U-Net with 2D input HS data using all spectral channels for semantic segmentation [75].

**Figure 4.** *Schematic diagram of the modified Inception v4 CNN architecture. The CNN was customized to operate on the selected 25 × 25 × 91 patch size. The receptive field size and number of convolutional filters are shown at the bottom of each inception block. The convolutional kernel size used for convolutions is shown in italics inside each convolution box. Squeeze-and-excitation modules were added to the CNN to increase performance [72].*

More recently, several modern DL approaches with origins in computer vision have been applied experimentally to medical HSI. In [76], a generative adversarial network (GAN) was used to learn the association between RGB images and HS images in order to generate HS digital histology images from standard RGB digital histology images of breast cancer. Another modern approach is the use of recurrent neural networks (RNNs), such as long short-term memory (LSTM) networks, which can utilize spatial-spectral and time-based inputs to operate on real-time video. In [77], RNNs are compared with 2D- and 3D-CNN methods, which they outperform, for in-vivo cancer detection with the goal of real-time video endoscopy.

The use of DL for HS processing is currently a hot topic in the research community across different fields. The main advantage of DL approaches in HSI is their capability to jointly exploit the spatial and the spectral information for image processing tasks. Currently, researchers are experimenting with different DL architectures in order to find the most appropriate DL model for HSI [78]. In the context of medical HSI, DL has shown good performance in different applications, but its usage is still limited compared to other ML approaches. The main reason is the limited amount of data available due to the novelty of the technology. More publicly available datasets with a large number of patients are required in order to establish a definitive comparison between DL and traditional ML techniques.
