otherwise be difficult to process and interpret. Additionally, these fields themselves provide new challenges for machine learning that can ultimately advance existing ML techniques and give rise to new ones.

The mutual history of machine learning and the biological and medical disciplines is both long and complex. An early ML technique, the perceptron, was developed in an attempt to model the behaviour of biological neurons [1] and was used early on to define the start sites of translation initiation sequences in E. coli [2]; it can be considered the starting point of the entire field of machine learning. In the last few decades, the power, flexibility, and accessibility of ML and DL techniques have grown considerably, and it can be expected that they will provide significant assistance in the discovery and understanding of the mounting volume of biological and medical data.

In this chapter, we first provide an overview of the commonly used ML and DL techniques and strategies and outline their broad areas of applicability with regard to processing and analysis of biological and medical data. Next, we attempt to summarise the available corpus of research and development concerning the application of ML and DL techniques to the process of analysis and interpretation of biomedical data, focusing on liquid biopsy analysis, outline several of the main avenues of such research, and predict the potential improvements and changes in this highly dynamic and quickly developing field. Expertise in ML is not a prerequisite for this chapter, although we assume basic overall familiarity with the most well-known ML and DL models, techniques, and methodologies.

2. Machine learning strategies

This overview is limited to classical software-based tools and techniques for brevity's sake. Several hardware-based approaches are mentioned in the Future Prospects section.

The ML ecosystem is both extensive and complex [3–5], with many possible ways to subdivide or classify its members. One frequently used classification scheme outlines two broad groups of ML algorithms: supervised learning, where the model is presented with both a set of labelled example inputs and desired outputs (called the training dataset), with the goal of learning a mapping from inputs to outputs, and unsupervised learning, where no labels are given to the model, leaving it to discover structure in the input data on its own. A notable specific case of supervised learning is reinforcement learning (RL), where training data consists only of positive ("reward") and negative ("punishment") feedback, given according to the model's performance in the training environment.

Another informative approach to classifying ML algorithms is based on the desired type of output of the given model, such as classification (division of the input data into two (binary classification) or more (multi-class classification) predetermined groups), clustering (similar to classification but with the groups not known beforehand), dimensionality reduction (simplification of high-dimensional input data by mapping them into a lower-dimensional space), search, etc. Of these, clustering is particularly notable due to its broad and general applicability and the wide range of models, methods, and algorithms [6–8] that can be employed to carry out cluster analysis.

Figure 1. The structure of a typical feed-forward deep neural network, with a fully connected input layer I, an unspecified number of hidden processing layers H, and an output layer O.
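To make the supervised setting described in this section more concrete, the following is a minimal sketch of a classifier that learns a mapping from labelled inputs to outputs. It uses a nearest-centroid rule, chosen here only for brevity and not taken from the chapter; the toy data, the labels `group_a` and `group_b`, and the function names are all hypothetical.

```python
from math import dist  # Euclidean distance, available since Python 3.8


def fit_centroids(inputs, labels):
    """Supervised step: average the labelled training inputs per class."""
    sums, counts = {}, {}
    for x, y in zip(inputs, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, value in enumerate(x):
            acc[i] += value
        counts[y] = counts.get(y, 0) + 1
    return {y: tuple(v / counts[y] for v in acc) for y, acc in sums.items()}


def predict(centroids, x):
    """Map a new input to the label of the closest class centroid."""
    return min(centroids, key=lambda y: dist(centroids[y], x))


# Hypothetical training dataset: two features per sample, two known groups.
train_inputs = [(1.0, 1.2), (0.8, 1.0), (1.1, 0.9),
                (4.0, 4.2), (4.1, 3.9), (3.8, 4.0)]
train_labels = ["group_a", "group_a", "group_a",
                "group_b", "group_b", "group_b"]

centroids = fit_centroids(train_inputs, train_labels)
print(predict(centroids, (0.9, 1.1)))  # a point near the first group
print(predict(centroids, (4.2, 4.1)))  # a point near the second group
```

The labels are what make this a supervised (binary classification) example: without `train_labels`, the same points could only be grouped by a clustering method, with the groups not known beforehand.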

The notion of "cluster" is often not precisely defined and tends to serve as an umbrella term for various types of data objects, typically groups of data points with small distances (appropriately defined) between group members, higher-density areas of some parameter space, particular statistical distributions, etc. The desired clustering algorithm, therefore, depends on both the given data set and the intended application of the returned results. Due to these complications, clustering, like many other data analysis methods, is typically not fully automated even within the domain of machine learning but instead tends to partially rely on preprocessing and initial parameter selection, based on the specifications of the task at hand.
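The dependence of clustering on initial parameter selection can be illustrated with a small sketch. The code below implements a plain k-means procedure (one of many possible clustering algorithms, picked here as an assumption for illustration) in which the caller must supply the number of clusters through the initial centroids. The data points and the function name `k_means` are hypothetical.

```python
from math import dist  # Euclidean distance, available since Python 3.8


def k_means(points, initial_centroids, iterations=20):
    """Plain k-means; the caller fixes k by choosing the initial centroids."""
    centroids = [tuple(c) for c in initial_centroids]
    assignment = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        assignment = [
            min(range(len(centroids)), key=lambda j: dist(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centroids.append(old)  # leave an empty cluster in place
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return assignment, centroids


# Hypothetical unlabelled data with three visually separate groups.
points = [(0, 0), (0, 1), (1, 0),
          (5, 5), (5, 6), (6, 5),
          (10, 0), (10, 1), (11, 0)]

labels_k3, _ = k_means(points, [points[0], points[3], points[6]])
labels_k2, _ = k_means(points, [points[0], points[3]])
print(labels_k3)
print(labels_k2)
```

With three initial centroids the three groups are recovered separately, whereas with two the second and third groups are merged. Neither result is wrong in itself; which one is useful depends on the data set and the intended application, which is why this choice is usually left to the analyst.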

Deep learning is a subclass of machine learning, the distinction being based on the training data representations rather than on specific algorithms. Like ML in general, deep learning can be both supervised and unsupervised [9]. Deep learning models loosely resemble information processing patterns in biological brains (and are therefore often called artificial neural networks), in that they use multiple layers [10] of non-linear processing units (frequently called "neurons", even though their similarity to biological neurons is usually limited) for pattern recognition and transformation. Each successive layer takes the output of the previous layer as its input, forming a hierarchy of representations and levels of abstraction. The number of hidden layers of an artificial neural network broadly determines the "computational power" of the network [11] (Figure 1).
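The layered structure described above, in which each layer consumes the output of the previous one, can be sketched as a forward pass through a small feed-forward network. This is an illustrative sketch only: the layer sizes, the hand-picked (untrained) weights, the choice of `tanh` as the non-linearity, and the helper names `dense` and `forward` are all assumptions, not details from the chapter.

```python
from math import tanh


def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(weights * inputs + biases)."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(x, layers):
    """Feed the output of each layer into the next, from input to output."""
    for weights, biases, activation in layers:
        x = dense(x, weights, biases, activation)
    return x


def identity(value):
    """Linear output unit (no non-linearity applied)."""
    return value


# Hypothetical network: 3 inputs -> 2 hidden units -> 2 hidden units -> 1 output.
# The weights are arbitrary placeholders; a real model would learn them in training.
layers = [
    ([[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]], [0.0, 0.1], tanh),  # hidden layer H1
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0], tanh),             # hidden layer H2
    ([[0.7, -0.3]], [0.2], identity),                          # output layer O
]

output = forward([1.0, 0.5, -1.0], layers)
print(output)
```

Adding further entries to `layers` deepens the network without changing `forward`, which reflects the point that depth is a property of the stacked representation rather than of any one algorithm.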

Machine learning models have been applied to a wide variety of fields and problem classes, including computer vision, natural language processing, machine translation, bioinformatics, and biochemistry [12], with results often comparable or superior [13] to those produced by human domain experts.
