*2.3.2 Automated planning (self-driving) process*

Once the dosimetric goals have been established and the technique chosen, automatic plan generation is also possible [14].

Some attempts [71, 72] have been made to solve various aspects of this problem by predicting the best beam orientations. The larger task of automated treatment planning, however, is well suited to reinforcement learning methods [14]. Reinforcement learning is used extensively in games, self-driving cars, and other popular-culture applications. In reinforcement learning, an algorithm learns to navigate a set of rules, given some constraints, by self-correcting its decisions. In essence, the algorithm makes a decision (for instance, increasing the weight of a given constraint) and learns from the simulator (the treatment planning system) whether that decision moved the plan in the right direction [14]. This technique was used successfully by Google DeepMind to develop an algorithm capable of beating a Go world champion [73]. Reinforcement learning could therefore provide performance at the level of our best dosimetrists if properly implemented.
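
The decide-and-learn loop described above can be illustrated with a minimal sketch. It is not the method of [14]: it uses tabular Q-learning, one common reinforcement learning algorithm chosen here only for illustration, and every name in it (`toy_plan_score`, `train_weight_agent`, `greedy_rollout`) is hypothetical. The scoring function is an assumed stand-in for a treatment planning system, with a single integer constraint weight whose plan quality is arbitrarily set to peak at 7.

```python
import random

ACTIONS = (-1, 0, 1)      # decrease, keep, or increase the constraint weight
MIN_W, MAX_W = 0, 10      # assumed range of the discretized weight


def toy_plan_score(weight):
    """Hypothetical stand-in for the TPS: plan quality peaks at weight 7."""
    return -abs(weight - 7)


def step(weight, action):
    """Apply an action, clip to the allowed range, and return the new weight
    together with the reward (the change in plan score)."""
    new_weight = min(MAX_W, max(MIN_W, weight + action))
    return new_weight, toy_plan_score(new_weight) - toy_plan_score(weight)


def train_weight_agent(episodes=2000, steps=20, alpha=0.5, gamma=0.9,
                       epsilon=0.2, seed=0):
    """Tabular Q-learning over (weight, action) pairs."""
    rng = random.Random(seed)
    q = {(w, a): 0.0 for w in range(MIN_W, MAX_W + 1) for a in ACTIONS}
    for _ in range(episodes):
        w = rng.randint(MIN_W, MAX_W)            # random starting weight
        for _ in range(steps):
            if rng.random() < epsilon:           # explore occasionally
                a = rng.choice(ACTIONS)
            else:                                # otherwise act greedily
                a = max(ACTIONS, key=lambda act: q[(w, act)])
            w_next, reward = step(w, a)
            best_next = max(q[(w_next, b)] for b in ACTIONS)
            # Self-correction: move the estimate toward the observed outcome.
            q[(w, a)] += alpha * (reward + gamma * best_next - q[(w, a)])
            w = w_next
    return q


def greedy_rollout(q, start, steps=20):
    """Follow the learned policy from a starting weight; return the final weight."""
    w = start
    for _ in range(steps):
        a = max(ACTIONS, key=lambda act: q[(w, act)])
        w, _ = step(w, a)
    return w
```

Calling `greedy_rollout(train_weight_agent(), start=0)` follows the learned policy from a weight of 0. In this toy setting the agent is expected to move the weight toward the value with the best score and hold it there. A real planning problem would involve many constraints, a far larger state space, and a much more expensive simulator call at each step.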

Overall, one challenge in achieving fully automatic planning with reinforcement learning lies in the need for robust treatment planning systems (TPSs) and for close integration with them [14]. The future vision is a fully automated planning process, from contouring to plan creation [62], with human experts (dosimetrists, physicists, and physicians) evaluating, supervising, and providing QA for the results.
