**3.1 Big data in radiation oncology: challenges?**

There are ongoing community-wide efforts on big data in radiation oncology; e.g., [9, 10, 50, 51] have made datasets available and established validation frameworks [50] that serve as benchmarks for evaluating different algorithms. Deep learning-based models [61] have shown superior performance over the alternatives for most prediction tasks in radiation oncology. However, deep learning requires large annotated datasets (across multiple institutions) to tune the models and reach high prediction accuracy, even when transfer learning is used [14]. This can prove challenging in radiation oncology, where datasets are limited. Standardizing the radiation oncology nomenclature (e.g., clinical, dosimetric, and imaging terms), which is aided by the efforts of AAPM Task Group TG-263 [104], and developing standards for the collection and structure of patient data are also essential for training models on datasets from multiple institutions.
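
To make the nomenclature point concrete, the following is a minimal sketch of how structure labels from different institutions might be mapped to one standardized label before pooling data. The alias table, the function names (`standardize`, `harmonize`), and the example labels are all hypothetical; the target names only follow the TG-263 naming style and should be checked against the published TG-263 list [104] before any real use.

```python
import re

# Hypothetical alias table: standardized label -> local variants seen in practice.
# Target names follow the TG-263 style but are illustrative assumptions only.
ALIASES = {
    "SpinalCord": ["cord", "spinal cord", "sc"],
    "Parotid_L": ["lt parotid", "left parotid", "parotid lt"],
    "Parotid_R": ["rt parotid", "right parotid", "parotid rt"],
}


def _key(name):
    """Lowercase and collapse punctuation/whitespace so variants compare equal."""
    return re.sub(r"[^a-z0-9]+", " ", name.lower()).strip()


# Reverse lookup from normalized variant to standardized label.
LOOKUP = {_key(alias): std for std, aliases in ALIASES.items() for alias in aliases}
# The standardized labels should also map to themselves.
LOOKUP.update({_key(std): std for std in ALIASES})


def standardize(name):
    """Return the standardized label for a local name, or None if unknown."""
    return LOOKUP.get(_key(name))


def harmonize(names):
    """Split local names into a mapping to standardized labels and a list of
    unmapped names that need manual review."""
    mapped, unmapped = {}, []
    for name in names:
        std = standardize(name)
        if std is None:
            unmapped.append(name)
        else:
            mapped[name] = std
    return mapped, unmapped
```

Returning the unmapped names separately, rather than guessing, keeps unfamiliar labels visible for manual curation, which matters when a mislabeled structure would silently corrupt a multi-institutional training set.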
