5. Conclusions and future outlook

Standard leukemia diagnosis and therapy are currently based on morphological classification of patients' bone marrow smears and biopsies, peripheral blood smears, and molecular and cytogenetic analyses to identify genetic abnormalities. However, morphological and genetic classification analysis is insufficient to fully predict appropriate response to therapy, while emerging nonstandard methods to improve and personalize leukemia classification can be expensive and time-consuming. Digital pathology is emerging as a powerful, inexpensive tool to enhance biopsy- and smear-based decisions.

This review discussed how computational cytology can improve leukemia diagnosis, both by supporting pathologists' smear-based decisions and by providing automated, biologically meaningful pattern recognition. The techniques summarized here extract quantitative imaging features from stained bone marrow and peripheral blood smear samples to detect and classify leukemia. Conventional machine learning approaches based on feature engineering have been broadly applied to classify leukemia types and subtypes from predefined morphological features. To discover new sets of morphological features in leukemia, however, a deep learning approach is expected to provide higher accuracy.

variability of the tissue type. Grayscale images are two-dimensional: width and height. Color images have a third dimension, depth, representing the RGB color channels [38, 65, 67, 68].
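The difference between the two image layouts can be sketched in a few lines of plain Python. This is only an illustration: the tiny pixel arrays and the helper names `shape` and `to_gray` are hypothetical, and the grayscale conversion uses the common ITU-R BT.601 luma weights, which is an assumption rather than a method taken from the cited studies.

```python
# Hypothetical 2 x 3 images stored as nested lists (rows of pixels).
gray = [[0, 128, 255],
        [64, 192, 32]]                               # height x width

rgb = [[(255, 0, 0), (0, 255, 0), (0, 0, 255)],
       [(10, 20, 30), (200, 200, 200), (0, 0, 0)]]   # height x width x 3 channels


def shape(img):
    """Return the dimensions of a nested-list image, outermost axis first."""
    dims = []
    while isinstance(img, (list, tuple)):
        dims.append(len(img))
        img = img[0]
    return tuple(dims)


def to_gray(rgb_img):
    """Collapse the channel axis using the ITU-R BT.601 luma weights."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
            for row in rgb_img]


print(shape(gray))          # two dimensions: height and width
print(shape(rgb))           # three dimensions: height, width, color depth
print(shape(to_gray(rgb)))  # back to two dimensions
```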

Once the set of images is defined and labeled, feature maps are created by sliding a series of filters representing shapes, textures, or colors over the input image (convolution), thus identifying local dependencies. The filters representing the features are learned during the training process through backpropagation and a gradient descent algorithm. After convolution, an activation function introduces nonlinearity into the otherwise linear convolution, which allows the model to represent more complex patterns and improves its accuracy. The output of the convolutional layer is then down-sampled (pooling), which reduces the size of the feature maps and helps limit overfitting. This sequence is repeated as many times as necessary according to the hierarchical complexity of the image. The last feature map is then flattened into a one-dimensional vector to feed a fully connected layer for neural network (NN) classification. The NN classification process can be replaced by a different classification scheme such as an SVM or random forest [38, 65, 67, 68].
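The sequence of convolution, activation, pooling, flattening, and a fully connected layer can be sketched as a single forward pass in plain Python. This is a minimal illustration under stated assumptions, not the architecture of any cited study: the 5 x 5 toy image, the hand-set vertical-edge filter, and the dense-layer weights are all hypothetical, whereas in a real CNN the filter and weights would be learned by backpropagation.

```python
def conv2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]


def relu(fmap):
    """Activation step: keep positive responses, zero out the rest."""
    return [[max(0.0, v) for v in row] for row in fmap]


def max_pool(fmap, size=2):
    """Non-overlapping max pooling; rows/columns that do not fit are dropped."""
    h = len(fmap) // size * size
    w = len(fmap[0]) // size * size
    return [[max(fmap[i + a][j + b] for a in range(size) for b in range(size))
             for j in range(0, w, size)]
            for i in range(0, h, size)]


def flatten(fmap):
    """Turn a 2D feature map into a one-dimensional vector."""
    return [v for row in fmap for v in row]


def dense(vector, weights, bias):
    """Fully connected layer: one weighted sum per output class."""
    return [sum(w * x for w, x in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


# Hypothetical toy input: dark left half, bright right half.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# Hand-set vertical-edge filter (a trained CNN would learn this).
edge_kernel = [[-1, 0, 1],
               [-1, 0, 1],
               [-1, 0, 1]]

feature_map = conv2d(image, edge_kernel)
features = flatten(max_pool(relu(feature_map)))
# Hypothetical weights for two output classes.
scores = dense(features, weights=[[0.5], [-0.5]], bias=[0.0, 0.0])
predicted = max(range(len(scores)), key=scores.__getitem__)
print(feature_map, features, scores, predicted)
```

The flattened `features` vector is the point at which a different classifier, such as an SVM or random forest, could be substituted for the dense layer.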

Convolutional neural networks are ideally suited for pattern recognition and medical image analysis. In fact, CNNs have been successfully applied to feature learning to detect and diagnose a number of different cancers, including leukemia. Deep learning methods have been used for white blood cell detection and classification [68], lymphocyte detection [38], and lymphoma subtype classification [38] by identifying three subtypes of lymphoma: chronic lymphocytic leukemia (CLL), follicular lymphoma (FL), and mantle cell lymphoma (MCL). They have also been applied to the analysis of ALL cellular images to classify ALL subtype histopathology [67, 69].

Although the current research in pattern recognition is dominated by the supervised deep learning approach, the unsupervised approach is expected to provide breakthrough results in the near future, and extensive research is currently ongoing to optimize these algorithms [65, 66].

Figure 3. Feature learning for classification.

106 Hematology - Latest Research and Clinical Advances

For most of the cases reviewed in this chapter, the image processing pipeline implements a supervised classification scheme, where the morphometric features are extracted from a set of labeled data (ALL vs. AML, FAB, M1, etc.) and then are validated on a test dataset. In future studies, supervised morphological analysis can be complemented with unsupervised classification schemes such as unbiased clustering. This approach could reveal whether entirely new classification schemes should be implemented for ALL or AML, independent from known acute or chronic leukemia subtype morphological classification. It also could potentially reveal common underlying genetic or proteomic patterns.
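As an illustration of what unbiased clustering means in practice, the following minimal sketch applies k-means (Lloyd's algorithm) to unlabeled feature vectors. Everything specific here is an assumption made for the example: the two-feature vectors are made-up values assumed to be already normalized, the function name `kmeans` is hypothetical, and the initialization simply takes the first `k` points so that the result is reproducible. K-means is only one of many possible clustering methods.

```python
import math


def kmeans(points, k, iters=20):
    """Lloyd's algorithm with a deterministic start (first k points)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members)
                                for col in zip(*members)]
    return labels, centroids


# Hypothetical normalized morphometric features for six cells
# (e.g., relative cell size and relative nuclear roundness); no labels used.
cells = [(0.10, 0.20), (0.90, 0.80), (0.15, 0.25),
         (0.85, 0.90), (0.20, 0.10), (0.95, 0.85)]

labels, centroids = kmeans(cells, k=2)
print(labels)
print(centroids)
```

Because no subtype labels enter the computation, the resulting groups depend only on the morphometric features, which is what would allow clusters to be compared afterward with known subtypes or with genetic and proteomic data.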

Emerging omics analysis methods are determining protein expression signatures for leukemia patients; however, these new processes can be time and labor intensive. Advances in digital pathology offer potentially inexpensive and rapid alternatives for determining genetic information and protein signature membership without the time delay required for proteomic-based signature assignment. If morphological surrogates that reliably correlate with clinical, genetic, or proteomic features, either individually or in combinatorial patterns, can be identified directly from histology images, then this could significantly speed up leukemia diagnosis, reduce the cost of the diagnostic workup, optimize the assignment of patients to a particular therapy, and potentially uncover new pathways for drug targeting.

Cell metrics can be predefined manually, and these are often metrics known to be pertinent to leukemia cells. The algorithms that extract such metrics from images based on features of cells (e.g., size or nucleus shape) are together employed as part of a "feature engineering process." In a supervised classification approach, the metrics are extracted from cells of predefined leukemia subtypes. As an example, a classifier is trained on a set of quantitative morphological features extracted from a dataset labeled according to the FAB morphological classes, and the resulting classifier is then used to predict the leukemia subtypes on a test set.
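A minimal sketch of this feature engineering workflow is given below. It is not the pipeline of any cited study: the two metrics (pixel area and bounding-box extent), the nearest-centroid classifier, the function names, the training values, and the placeholder labels `subtype_A` and `subtype_B` are all hypothetical choices made for illustration.

```python
import math


def cell_features(mask):
    """Hand-defined metrics from a binary cell mask (rows of 0/1).

    Returns (area, extent): the foreground pixel count and the
    fraction of the bounding box that the cell fills.
    """
    coords = [(r, c) for r, row in enumerate(mask)
              for c, v in enumerate(row) if v]
    area = len(coords)
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    box = (max(rows) - min(rows) + 1) * (max(cols) - min(cols) + 1)
    return (area, area / box)


def train_nearest_centroid(features, labels):
    """Supervised step: average the feature vectors of each labeled class."""
    sums = {}
    for f, y in zip(features, labels):
        total, n = sums.get(y, ([0.0] * len(f), 0))
        sums[y] = ([t + v for t, v in zip(total, f)], n + 1)
    return {y: [t / n for t in total] for y, (total, n) in sums.items()}


def predict(centroids, f):
    """Assign the class whose centroid is closest to the feature vector."""
    return min(centroids, key=lambda y: math.dist(f, centroids[y]))


# Hypothetical labeled training features: (area, extent) per cell.
train_x = [(8, 0.67), (10, 0.70), (30, 0.95), (28, 0.90)]
train_y = ["subtype_A", "subtype_A", "subtype_B", "subtype_B"]
centroids = train_nearest_centroid(train_x, train_y)

# Hypothetical test cell given as a binary mask.
small_cell = [[0, 1, 1, 0],
              [1, 1, 1, 1],
              [0, 1, 1, 0]]
print(cell_features(small_cell))
print(predict(centroids, cell_features(small_cell)))
print(predict(centroids, (27, 0.90)))
```

In practice the features would usually be standardized before distances are computed, since here the area term dominates the extent term simply because of its larger numeric range.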

In the unsupervised classification approach, new clusters of leukemia subtypes are created from the engineered features. In contrast to the feature engineering process, feature learning algorithms discover features representative of leukemia cell types on their own, learning them from annotated (supervised) or unannotated (unsupervised) data (Figure 2).
