**Abstract**

Predicting the onset of disease complications is generally challenging because of unmeasured risk factors, imbalanced data, time-varying data arising from disease dynamics, and the various interventions applied to the disease over time. It is widely argued that many Artificial Intelligence techniques that successfully model disease take the form of a "black box" whose internal workings and complexities are extremely difficult to understand, from both the practitioner's and the patient's perspective. There is a need for Artificial Intelligence techniques that build predictive models which not only capture unmeasured effects to improve prediction, but are also transparent in how they model data, so that knowledge about disease processes can be extracted and clinicians' trust in the model can be maintained. The proposed strategy builds probabilistic graphical models for prediction that include informative hidden variables. These variables are added in a stepwise manner to improve predictive performance whilst keeping the model as simple as possible, which is regarded as crucial for interpreting the prediction results. This chapter explores this key issue with a specific focus on diabetes data. According to the literature on disease modelling, especially for major diseases such as diabetes, a patient's mortality often results from the associated complications that the disease causes over time rather than from the disease itself. These complications are often patient-specific and depend on the type of cohort to which a patient belongs. Another main focus of this study is patient personalisation via precision medicine, achieved by discovering meaningful subgroups of patients that are characterised as phenotypes. These phenotypes are explained further using Bayesian network analysis methods and temporal association rules. Overall, this chapter discusses the earlier research of the chapter's author.
It explores Artificial Intelligence (AI) techniques for modelling the progression of disease whilst simultaneously stratifying patients, and for doing so in as transparent a manner as possible. To this end, it reviews the current literature on some of the most common AI methodologies, including probabilistic modelling, association rule mining, phenotype discovery and latent variable discovery, using diabetes as a case study.

**Keywords:** diabetes, complex disease progression, artificial intelligence in medicine, patient model, Bayesian statistics, causal networks, data mining, hidden risk factors
