4.1. Examples of digital pathology for leukemia

Reference Leukemia type Extracted features Classification

PB:L1 vs L2 vs L3 Nucleus & Cytoplasm & Cell:

PB: ALL vs AML Nucleus & cytoplasm & cell

Shape: Size Cytoplasm: Color: RGB and HSV components Texture:

Holes Nucleus:

Coarseness Intensity and shape: Auer rods as Cytoplasmic

Nucleoli count Texture: Fourier descriptor, Wavelet and Haralick coefficients (GLCM statistics): Contrast, Correlation, Energy, Homogeneity, Entropy

Shape: Area Nucleus & Cell:

Shape: Form Factor, Roundness, Compactness, Elongation, Perimeter Nucleus Indentation, Nucleoli count, Texture: Fourier descriptor, Wavelet and Haralick coefficients (GLCM statistics): Contrast, Correlation, Energy, Homogeneity, Entropy

Shape: Size ratio: Nucleus / Cell Nucleus & Cytoplasm: Color: RGB and HSV components Cytoplasm: Vacuole count Nucleus:

Mohapatra thesis 2013 [61]

102 Hematology - Latest Research and Clinical Advances

Mohapatra thesis 2013 [61]

Table 1. Leukemia subtype classification.

coefficients and GLCM statistics: Contrast, Correlation, Energy, Homogeneity, Entropy

(RBFN), Support Vector Machine (SVM)

Ensemble of Classifiers (EOC), Naive Bayesian (NB), K-nearest neighbor (KNN), Multilayer Perceptron (MLP NN), Radial Basis Functional Network (RBFN), Support Vector Machine (SVM)

Ensemble of Classifiers (EOC), Naive Bayesian (NB), K-nearest neighbor (KNN), Multilayer Perceptron (MLP NN), Radial Basis Functional Network (RBFN), Support Vector Machine (SVM)

Shape: Form Factor, Roundness, Compactness, Elongation, Perimeter Color: mean intensity of R,G,B and Hue, Saturation, Lightness components Texture: Fourier transform: Mean, variance, skewness, kurtosis of the frequency components Boundary roughness: Fractal HD dimension Contour signature: Variance, skewness, kurtosis (center-contour)

Nucleus:

To provide examples of digital pathology's impact on leukemia classification, we summarize here a few recent studies. In one study (Gumble and Rode), ALL cells were distinguished from healthy PB cells using shape and texture features extracted from the nucleus and cytoplasm. These features included area, total white blood cells, total black pixels, perimeter, eccentricity, solidity, form factor, and bounding box parameters [51]. In another study, Mohapatra et al. added color and Fourier descriptors as cell-based nuclear features to the shape, fractal, and texture parameters to distinguish ALL from healthy lymphoblasts/lymphocytes [52].
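To make shape features such as area, perimeter, form factor, and bounding box concrete, the following is a minimal sketch rather than the procedure used in the cited studies. It assumes a segmented cell or nucleus is available as a binary mask (a list of rows of 0/1 values), estimates the perimeter by counting exposed pixel edges, and uses the common definition form factor = 4π · area / perimeter². The function name `shape_features` and the mask representation are illustrative assumptions.

```python
import math


def shape_features(mask):
    """Simple shape descriptors of the foreground (1-valued) pixels in a binary mask.

    Assumption: `mask` is a rectangular list of rows containing 0/1 values.
    The perimeter is estimated as the number of pixel edges that face the
    background or the image border (4-connectivity).
    """
    rows, cols = len(mask), len(mask[0])
    area = 0
    perimeter = 0
    min_r, max_r, min_c, max_c = rows, -1, cols, -1
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c]:
                continue
            area += 1
            min_r, max_r = min(min_r, r), max(max_r, r)
            min_c, max_c = min(min_c, c), max(max_c, c)
            # Count every side of this pixel that touches background or border.
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                outside = nr < 0 or nr >= rows or nc < 0 or nc >= cols
                if outside or not mask[nr][nc]:
                    perimeter += 1
    if area == 0:
        raise ValueError("mask contains no foreground pixels")
    return {
        "area": area,
        "perimeter": perimeter,
        # Common definition (assumed here): 4 * pi * area / perimeter^2.
        "form_factor": 4 * math.pi * area / perimeter ** 2,
        # (min_row, min_col, max_row, max_col), inclusive.
        "bounding_box": (min_r, min_c, max_r, max_c),
    }


# Example: a 4 x 4 square object inside a 6 x 6 image.
square = [[1 if 1 <= r <= 4 and 1 <= c <= 4 else 0 for c in range(6)] for r in range(6)]
features = shape_features(square)
```

Because the pixel-edge perimeter overestimates the length of curved boundaries, absolute form factor values from this sketch will differ from those obtained with contour-based perimeter estimators; it is meant only to show how such descriptors follow from a segmentation mask.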

What do these features mean in practice? In the Mohapatra et al. study, the color features of a cell were calculated from the mean intensity of the nucleus color components in RGB or HSV color space and from a grayscale intensity map. For RGB images, the mean intensities of the red, green, and blue channels were computed; for HSV images, the mean intensities of the hue, saturation, and lightness components were computed. The same color features were calculated for the cytoplasm. The Fourier descriptors were the mean, variance, skewness, and kurtosis of the texture in the frequency domain. The fractal (Hausdorff) dimension (HD) of the nucleus boundary was used to capture boundary roughness, together with the variance, skewness, and kurtosis of the distances between the cell's center and each contour point. Texture features from the cytoplasm included wavelet coefficients and metrics derived from the gray-level co-occurrence matrix (GLCM), including contrast, correlation, energy, homogeneity, and entropy. The area was calculated for the nucleus, cytoplasm, and the whole cell [52].
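The GLCM-derived texture metrics named above can be illustrated with a short sketch. It is not the implementation used by Mohapatra et al.; it assumes a grayscale image already quantized to a small number of integer gray levels, a single pixel offset (one step to the right by default), and a symmetric, normalized co-occurrence matrix. Definitions of energy and homogeneity vary between toolkits; here energy is taken as the sum of squared matrix entries and homogeneity as the sum of p(i, j) / (1 + (i - j)²). The function names are hypothetical.

```python
import math


def glcm(image, levels, dr=0, dc=1):
    """Normalized, symmetric gray-level co-occurrence matrix for one pixel offset.

    Assumption: `image` is a list of rows of integers in range(levels).
    """
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # count each pair in both directions
    total = sum(map(sum, counts))
    if total == 0:
        raise ValueError("no pixel pairs found for this offset")
    return [[v / total for v in row] for row in counts]


def glcm_statistics(p):
    """Contrast, correlation, energy, homogeneity, and entropy of a normalized GLCM."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    mean_i = sum(i * v for i, _, v in cells)
    mean_j = sum(j * v for _, j, v in cells)
    var_i = sum((i - mean_i) ** 2 * v for i, _, v in cells)
    var_j = sum((j - mean_j) ** 2 * v for _, j, v in cells)
    cov = sum((i - mean_i) * (j - mean_j) * v for i, j, v in cells)
    denom = math.sqrt(var_i * var_j)
    return {
        "contrast": sum((i - j) ** 2 * v for i, j, v in cells),
        # Undefined for a constant image, so NaN is returned in that case.
        "correlation": cov / denom if denom > 0 else float("nan"),
        "energy": sum(v * v for _, _, v in cells),
        "homogeneity": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
        "entropy": -sum(v * math.log2(v) for _, _, v in cells if v > 0),
    }


# Example: a two-level checkerboard, where horizontal neighbors always differ.
checkerboard = [[(r + c) % 2 for c in range(4)] for r in range(4)]
stats = glcm_statistics(glcm(checkerboard, levels=2))
```

In practice these statistics are usually averaged over several offsets and directions so that the texture description does not depend on the orientation of the cell in the image.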

In addition to detecting leukemia from cell-based features, AML can be distinguished from healthy tissue by extracting whole tissue/slide-based features, as illustrated in two other studies (Madhukar et al., Agaian et al.) [53, 54].

Furthermore, AML can be distinguished from ALL by comparing cellular features in patient smears, as shown by Jacob and Mundackal [50], Supardi et al. [55], and Harun et al. [56]. Jacob and Mundackal and Supardi et al. used cellular metrics based on texture, shape, and Hausdorff dimension, while Harun et al. classified the two leukemias by cell and nuclear perimeters, areas of the cytoplasm and whole cell, and the nucleus-cytoplasm ratio [56].
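The Hausdorff dimension used as a boundary-roughness metric is commonly estimated by box counting. The sketch below assumes that estimator (the cited studies may have used a different procedure) and a binary mask in which only the boundary pixels of the nucleus are set to 1. It counts the boxes of side s that contain at least one boundary pixel and fits the slope of log N(s) against log(1/s) by least squares. The function name and the default box sizes are illustrative choices.

```python
import math


def box_counting_dimension(mask, box_sizes=(1, 2, 4, 8)):
    """Estimate the fractal dimension of the 1-valued pixels in `mask` by box counting.

    Assumption: `mask` is a list of rows of 0/1 values marking boundary pixels.
    """
    if len(set(box_sizes)) < 2:
        raise ValueError("at least two distinct box sizes are required")
    points = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    if not points:
        raise ValueError("mask contains no foreground pixels")
    xs, ys = [], []
    for s in box_sizes:
        # Each occupied box is identified by its integer grid coordinates.
        occupied = {(r // s, c // s) for r, c in points}
        xs.append(math.log(1.0 / s))
        ys.append(math.log(len(occupied)))
    # Least-squares slope of log N(s) versus log(1/s).
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Sanity checks on shapes with known dimension: a straight line and a filled square.
line = [[1 if r == 0 else 0 for c in range(16)] for r in range(16)]
filled = [[1] * 16 for _ in range(16)]
line_dim = box_counting_dimension(line)
filled_dim = box_counting_dimension(filled)
```

A straight line yields a dimension of 1 and a filled square yields 2; an irregular nuclear boundary would be expected to fall between these values, with rougher contours giving larger estimates.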

More specifically, AML and ALL subtypes have been discriminated based on cell-based features in three different studies. To classify AML subtypes, Kazemi et al. predicted five AML groups (M2, M3, M4, M5, and a fifth group pooling the remaining subtypes M0, M1, M6, and M7) based on handcrafted morphological features from blood microscopic images. The features were extracted from the cells' nuclei: irregularity, Hausdorff dimension, shape, color, and texture features, complemented by the nucleus-cytoplasm ratio. The same set of features allowed more accurate discrimination of healthy tissue vs. AML tissue than of AML tissue vs. ALL tissue [57]. Reta et al. performed a similar analysis that discriminated the L1, L2, M2, M3, and M5 subtypes of ALL and AML based on cellular features, with nucleus features proving to be the most discriminative [58]. An earlier study (Escalante et al.) was also able to discriminate multiple leukemia tissue types from the BM: ALL vs. AML, L1 vs. L2, M2 vs. M3 + M5, M3 vs. M2 + M5, M5 vs. M2 + M3, M1 vs. M3 vs. M5, and L1 vs. L2 vs. M1 vs. M3 vs. M5 [59]. However, in the latter study, there was no significant difference in model performance between features extracted from the nucleus and cytoplasm and features extracted from the whole cell.

This contradicts other studies that suggest classification based on subcellular morphometry improves AML [60] and ALL [61] subtype recognition. In particular, these groups found that color and shape information in the cytoplasmic holes, which indicate vacuoles, and color and shape information on the nucleus, which indicate nucleoli, can reveal the presence of Auer rods, which discriminate AML from ALL, where Auer rods are absent [61].

In addition to the large number of publications characterizing acute forms of leukemia, studies (Vaghela et al.) have suggested that measurements of WBC roundness and counts can discriminate chronic myeloid from chronic lymphoid leukemia [62].

of the underlying molecular abnormalities and cannot be used to direct therapy. In addition, the subtype groups' underlying genetic patterns are not unique per subtype. Should morphological classification match genetic backgrounds, this could help speed up the diagnosis process. One study attempted to correlate morphological quantitative features in order to classify ALL lymphoblasts into the WHO subtypes and compare the results with flow cytometry analysis. To this aim, an unsupervised feature selection method was applied, and an optimal subset of the features was extracted to match the WHO classification [61]. This study and others that follow are helping pave the way for increasingly sophisticated means of classifying leukemia by images that enable incorporation of genetic and epigenetic details. Advances in computational methods are too, as the next section describes.

Quantitative-Morphological and Cytological Analyses in Leukemia

http://dx.doi.org/10.5772/intechopen.73675

Although engineered feature-based conventional machine learning algorithms provide fairly accurate predictions, they do not reach the capability of human perception. The feature engineering process requires defining a carefully chosen set of features. This is a laborious process, and the feature parameters are very sensitive to the specific training set from which they were extracted. Due to this rigidity, a conventional machine learning algorithm likely could not be applied to a second dataset without parameter tweaking. To overcome these limitations, deep learning algorithms trained on large amounts of data can extract generalized features to perform human-level pattern recognition [64, 65].

4.2.2. Deep learning methods

When a large amount of data is available, a deep learning approach can be applied to identify morphological features in leukemia. Deep learning can self-discover new, hierarchical features in images (feature learning), allowing better pattern recognition for classification. These features are identified without human knowledge, and the learning approach is called "domain-agnostic," where the computational system alone is able to distinguish distinct tissue types in any type of cancer. Today, with the increasing computing capacity of modern computers and the availability of big data storage, huge amounts of data can be extracted and analyzed to identify key features for classification. This has enabled deep learning methods to outperform previous conventional machine learning approaches and to achieve higher accuracy [39, 66].

Deep learning is the extension of conventional artificial neural networks where, instead of a single-layered network, a multilayered connected network processes input data and generates output. The network design depends on the input dataset and classification target. For pattern classification problems, convolutional neural networks (CNNs) are an ideally suited network design. The network learns from the example images fed to it and extracts hierarchical features automatically, layer by layer (e.g., from low-level features like edges to higher-level features such as the cell, tissue, and then organ), without expert human intervention while retaining highly expressive power (Figure 3) [65–67].

The input of the CNN is a series of images, cropped from the whole slide image, and the images are processed in batches. For WBC classification, one cropped image contains one whole cell. In contrast to the cell-based analysis, for tissue classification the images are slide-based, so the features are learned directly from the spatial pattern. The image size and the number of images fed to the network should be chosen carefully, and the variety of images should represent the
