#### *3.1.1 Unsupervised learning (UL)*

Unsupervised learning identifies previously undetected patterns in data. The machine groups unlabeled data by comparing and categorizing it, without labels or guidance from any external source, and therefore performs a more complicated task than other forms of machine learning. UL is described as a self-correcting technique that relies on interpretation and identification to reduce unpredictability. Applying unsupervised learning principles can yield more capable AI tools that improve the effectiveness and precision of health systems. The growing priority given to health has led many clinicians to focus on UL as a way to improve the efficiency of applications in the medical sciences [4].
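The grouping of unlabeled data described above can be illustrated with a minimal sketch. It assumes a simple one-dimensional k-means clustering, which is only one of many unsupervised methods and is not named in the text; the function name `kmeans_1d` and the sample `readings` are hypothetical and chosen purely for illustration.

```python
def kmeans_1d(values, k=2, iterations=20):
    """Group unlabeled numbers into k clusters (simple 1-D k-means)."""
    ordered = sorted(values)
    # Deterministic start: spread the initial centroids across the sorted data.
    centroids = [ordered[(i * (len(ordered) - 1)) // max(k - 1, 1)] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: each value joins the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: abs(v - centroids[c]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = [
            sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # assignments are stable, so the grouping has converged
        centroids = new_centroids
    return centroids, clusters


# Hypothetical unlabeled measurements: no class labels are supplied.
readings = [1.0, 1.2, 0.8, 5.0, 5.3, 4.7]
centroids, clusters = kmeans_1d(readings, k=2)
```

No labels are passed in at any point; the two groups emerge only from the distances between the values themselves.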

#### *3.1.2 Supervised learning (SL)*

Supervised learning uses existing labeled data to draw correct conclusions from the samples given. The machine's conclusions generally become more accurate as the number of samples increases. In SL, the machine has already been trained on correctly labeled input data, which helps it produce a correct output when new, unseen tasks are subsequently given to it. SL techniques draw on various algorithms and computational methods. Frequently used methods include neural networks, naïve Bayes, linear regression, logistic regression, support vector machines, k-nearest neighbors, and random forests [5].
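As a minimal sketch of learning from labeled samples, the example below implements k-nearest neighbors, one of the methods listed above, using only the Python standard library. The function name `knn_predict`, the feature values, and the labels `"A"` and `"B"` are hypothetical and serve only as an illustration.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Return the majority label among the k labeled samples nearest to query."""
    # Sort the labeled samples by Euclidean distance to the query point.
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical labeled training data: (features, label) pairs.
training = [
    ((1.0, 1.0), "A"), ((1.2, 0.8), "A"), ((0.9, 1.1), "A"),
    ((5.0, 5.0), "B"), ((5.2, 4.8), "B"), ((4.9, 5.1), "B"),
]

# New, unseen inputs are classified from the labeled examples.
label_near_a = knn_predict(training, (1.1, 1.0))
label_near_b = knn_predict(training, (5.1, 5.0))
```

Here the "training" is simply storing the labeled samples; other methods in the list, such as logistic regression, instead fit parameters to the same kind of labeled data.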

#### *3.1.3 Reinforcement learning (RL)*

Reinforcement learning is the science of decision making, and it is akin to processes studied earlier in animal behavioral psychology. In this machine learning method, positive and negative reinforcement act as the reward signal from which the machine learns. Unlike supervised learning, in RL the machine learns from its own experience and does not rely on labeled correct data to reach a favorable outcome (**Figure 1A**). RL produces its output through its own exploration of the data, balancing exploration of the given data against exploitation of the knowledge the machine already has about it [6].
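The balance between exploration and exploitation can be sketched with an epsilon-greedy agent on a two-armed bandit. This is an illustrative assumption rather than a method taken from the text: the function name `run_bandit`, the success probabilities, the value of `epsilon`, and the rewards of +1 and -1 (standing in for positive and negative reinforcement) are all hypothetical.

```python
import random


def run_bandit(success_probs, steps=2000, epsilon=0.1, seed=0):
    """Learn action values from reward alone using epsilon-greedy selection."""
    rng = random.Random(seed)
    n = len(success_probs)
    values = [0.0] * n  # running estimate of each action's average reward
    counts = [0] * n    # how many times each action has been tried
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(n)  # explore: try a random action
        else:
            action = max(range(n), key=lambda a: values[a])  # exploit best estimate
        # The environment answers with positive or negative reinforcement.
        reward = 1.0 if rng.random() < success_probs[action] else -1.0
        counts[action] += 1
        # Incremental mean: nudge the estimate toward the observed reward.
        values[action] += (reward - values[action]) / counts[action]
    return values, counts


# Hypothetical environment: action 1 is rewarded far more often than action 0.
values, counts = run_bandit([0.1, 0.9])
```

The agent is never told which action is correct; it infers this from the rewards it collects, which is the contrast with supervised learning drawn above.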

There are three main approaches to reinforcement learning [6]: policy based, value based, and model based (**Figure 1B**).

#### *3.1.3.1 Policy based*

The policy is the core element of RL. It defines the agent's behavior at a given time by mapping the states the agent perceives in the environment to the actions it takes.

#### **Figure 1.**

*Reinforcement learning. (A) Steps of reinforcement learning. (B) Model based, value based, and policy based reinforcement learning.*
