4.2.1. Traditional machine learning methods

Support vector machines (SVMs) are commonly used for leukemia classification (Jacob et al. [50]; Agaian et al. [53]; Kazemi et al. [57]; Madhukar et al. [54]). However, other methods have also been applied successfully to classify AML and ALL histology and cytology images, including k-nearest neighbor classifiers (Supardi et al.; Gumble and Rode) [51, 55], a hybrid multilayer neural network (HMLNN) (Harun et al.) [56], and an ensemble particle swarm model selection method (EPSMS) (Escalante et al.) [59]. Alternatively, Kumar et al. suggested using a shallow neural network (NN) classifier after the AML slide is processed using wavelet transformation [37]. Other groups (Mohapatra et al.; Reta et al.; Escalante et al.) have compared multiple classifiers on leukemia image datasets and found that the best-performing classification method depends on the target [52, 58, 59].

When only a small amount of data is available, conventional machine learning algorithms based on feature engineering provide fairly accurate predictions [39]. The accuracy of the proposed feature engineering models depends on the particular leukemia database studied, the number and quality of the images, and the image acquisition mode, each of which requires different data preprocessing steps. These methods are mainly based on supervised classification of leukemia subtypes. Once a classifier has been trained on a set of quantitative morphological features from a labeled dataset, it has been able to predict the four major leukemia types or the FAB classes on a test set. For cases with an insufficient number of training samples, Kasmin et al. proposed reinforcement learning to classify ALL, AML, CLL, and CML from the geometrical, texture, color, and statistical parameters of PB cell nuclei [63].
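
The supervised workflow described above (quantitative features from labeled cells, a classifier fitted on a training set, and prediction on a held-out test set) can be illustrated with a minimal sketch. It is not the pipeline of any cited study. The three features (nucleus area, circularity, mean texture), their class means, and the function names are hypothetical, the data are synthetic, and a plain k-nearest neighbor classifier stands in for the classifiers surveyed in this section.

```python
import math
import random
from collections import Counter

# Hypothetical per-class feature means: (nucleus area, circularity, mean texture).
# These numbers are invented for illustration and carry no clinical meaning.
CLASS_MEANS = {
    "ALL": (40.0, 0.90, 0.30),
    "AML": (65.0, 0.70, 0.55),
}


def make_samples(n_per_class, rng):
    """Draw synthetic (features, label) pairs scattered around each class mean."""
    samples = []
    for label, means in CLASS_MEANS.items():
        for _ in range(n_per_class):
            features = tuple(rng.gauss(m, 0.05 * m) for m in means)
            samples.append((features, label))
    return samples


def fit_scaler(train):
    """Compute per-feature mean and standard deviation on the training set only."""
    columns = list(zip(*(features for features, _ in train)))
    means = [sum(col) / len(col) for col in columns]
    stds = [
        math.sqrt(sum((x - m) ** 2 for x in col) / len(col)) or 1.0
        for col, m in zip(columns, means)
    ]
    return means, stds


def scale(features, scaler):
    """Standardize one feature vector so no single feature dominates the distance."""
    means, stds = scaler
    return tuple((x - m) / s for x, m, s in zip(features, means, stds))


def knn_predict(train, query, k=5):
    """Return the majority label among the k training samples nearest to the query."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def evaluate(seed=0, n_per_class=50, test_fraction=0.3, k=5):
    """Split the synthetic data, fit on the training part, score on the test part."""
    rng = random.Random(seed)
    data = make_samples(n_per_class, rng)
    rng.shuffle(data)
    n_test = int(len(data) * test_fraction)
    test, train = data[:n_test], data[n_test:]
    scaler = fit_scaler(train)
    train_scaled = [(scale(f, scaler), y) for f, y in train]
    correct = sum(
        knn_predict(train_scaled, scale(f, scaler), k) == y for f, y in test
    )
    return correct / len(test)


accuracy = evaluate()
print(f"held-out accuracy: {accuracy:.2f}")
```

The scaler is fitted on the training samples only, so the test set does not influence preprocessing. Because the synthetic classes are well separated by construction, the accuracy printed here says nothing about performance on real slides, where image quality and acquisition mode matter as noted above.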

Although these previous studies found new morphological features in digitized histology slides from leukemia patients and successfully identified the major leukemia types and the M0–M7 and L1–L3 subtypes, the morphological features of the leukemia cells were not correlated with non-morphological information such as genetic mutations and clinical data. Morphological classification methods are currently not sufficient to recognize the majority of the underlying molecular abnormalities and cannot be used to direct therapy. In addition, the underlying genetic patterns are not unique to each subtype. If morphological classification could be matched to genetic background, it could help speed up the diagnostic process. One study attempted to use quantitative morphological features to classify ALL lymphoblasts into the WHO subtypes and compared the results with flow cytometry analysis. To this end, an unsupervised feature selection method was applied, and an optimal subset of the features was extracted to match the WHO classification [61]. This study and others that follow are helping pave the way for increasingly sophisticated means of classifying leukemia from images that enable incorporation of genetic and epigenetic details. Advances in computational methods are contributing as well, as the next section describes.
