1. Introduction

Joint models for longitudinal and time-to-event data aim to measure the association between the level of a longitudinal marker and the hazard rate for an event. The longitudinal data are collected repeatedly for several subjects. In these data, there are two types of covariates, namely, time-independent covariates and time-dependent covariates. Furthermore, there are two different categories of time-dependent covariates, namely, external and internal covariates. In clinical studies, internal time-dependent longitudinal outcomes are often used to monitor disease progression and failure time.

© 2018 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

In modern survival analysis, the Cox model [1] has been a very popular choice for time-independent covariates. This model measures the effect of time-independent covariates on the hazard rate for an event. Subsequently, the extended Cox model was developed for external time-dependent covariates. However, these latter models cannot handle longitudinal biomarkers. Therefore, Rizopoulos [2] introduced joint models for internal time-dependent covariates and the risk for an event, based on linear mixed-effects models and relative risk models.
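For orientation, the three model classes just mentioned can be written in their usual textbook forms. This is only a sketch, and the notation is an assumption made for illustration rather than the chapter's own parametrization: $h_0(t)$ is a baseline hazard, $w_i$ a vector of time-independent covariates with coefficients $\gamma$, $x_i(t)$ an external time-dependent covariate with coefficient $\varphi$, $m_i(t)$ the true (unobserved) marker level of subject $i$ at time $t$, and $\alpha$ the association parameter linking the marker to the hazard.

```latex
\begin{align*}
\text{Cox model:} \quad
  & h_i(t \mid w_i) = h_0(t)\,\exp\{\gamma^\top w_i\}, \\
\text{extended Cox model:} \quad
  & h_i(t \mid w_i, x_i(t)) = h_0(t)\,\exp\{\gamma^\top w_i + \varphi\, x_i(t)\}, \\
\text{joint model (relative risk submodel):} \quad
  & h_i(t \mid m_i(t), w_i) = h_0(t)\,\exp\{\gamma^\top w_i + \alpha\, m_i(t)\}.
\end{align*}
```

In the third form, $m_i(t)$ is not observed directly; in a joint model it is supplied by a linear mixed-effects submodel fitted to the repeated measurements, which is why the quality of the subject-specific trajectory estimates matters for the survival part.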

The basic assumption of the standard joint models proposed by Rizopoulos [2] is that the hazard rate of the dropout process at a given time is associated with the expected value of the longitudinal response at the same time. The whole history of the response has an influence on the survival function. Thus, it is crucial to obtain good estimates of the subject-specific trajectories in order to estimate the survival function accurately. In addition, an important feature that we need to account for is that many subjects in the sample show non-linear and fluctuating longitudinal trajectories over time, and each subject has its own trajectory. Therefore, flexibility is needed in the subject-specific longitudinal submodels of the joint models to improve the predictions.

Several previous works model the subject-specific longitudinal profiles in joint models flexibly. Brown et al. [3] applied B-splines with multidimensional random effects. In particular, Brown et al. [3] assumed that both the subject and the population trajectories have the same number of basis functions. As a result, the number of parameters in the longitudinal submodel becomes fairly large. If the roughness of the fit must also be controlled in this model, the computational burden increases, especially when the dimension of the random-effects vector is large. Ding and Wang [4] proposed the use of B-splines with a single multiplicative random effect to link the population mean function with the subject-specific profile. This simple model makes parameter estimation easy, but it may not be appropriate for many practical applications [5]. Rizopoulos [5] considered more flexible models using natural cubic splines with an expanded random-effects vector. None of these models addresses the roughness of the fit.

In this chapter, we present new approaches to modelling non-linear shapes of subject-specific evolutions in joint models by extending the standard joint models of Rizopoulos [2]. In particular, we implement penalized splines using a truncated polynomial basis for the longitudinal submodel. The linear mixed-effects approach is then applied to model the individual trajectories and to impose smoothness over adjacent coefficients. The ECM algorithm is used for parameter estimation, and the corresponding standard errors are calculated using the observed information matrix. Because the random-effects design matrices in our models differ from those in the standard joint models, the JM package of Rizopoulos [6] cannot be used for our models. Therefore, we wrote a set of R code for the penalized spline joint models to implement the proposed procedures on simulated data and on a case study.

The chapter is organized as follows. Section 2 describes the penalized splines with truncated polynomial basis for the joint models. In that section, two models are specified: the penalized spline joint model whose baseline hazard follows a Gompertz distribution (referred to as Model 1) and the penalized spline joint model with a piecewise-constant baseline risk function (referred to as Model 2). The joint likelihood, the score functions and the ECM algorithm for estimation are presented in Section 3. We then validate the proposed algorithm using extensive simulation studies and apply it to AIDS data in Section 4. Finally, Section 5 gives concluding remarks.

Penalized Spline Joint Models for Longitudinal and Time-To-Event Data

http://dx.doi.org/10.5772/intechopen.75975

2. The penalized spline joint models

In this section, we introduce the joint models using penalized splines with a truncated polynomial basis. The proposed parametrization is based on the standard joint models of Rizopoulos [2] and the regression model of a longitudinal response using penalized splines.

Notations in this section are taken from Rizopoulos [2]. Let $T_i^*$ be the true survival time and $C_i$ be the censoring time for the $i$th subject $(i = 1, \ldots, n)$. $T_i$ denotes the observed failure time for the $i$th subject, which is defined as $T_i = \min(T_i^*, C_i)$. If the $i$th subject is not censored, meaning that we have observed its survival time, we have $T_i^* \le C_i$. If the $i$th subject is censored, meaning that it was lost to follow-up or died from other causes, we have $T_i^* > C_i$. Furthermore, we define the event indicator as $\delta_i = I(T_i^* \le C_i)$. The observed data for the survival outcome are $(T_i, \delta_i)$, $i = 1, \ldots, n$.

For a longitudinal response, suppose that we have $n$ subjects in the sample and that the observed longitudinal measurement for subject $i$ at time point $t$ is $y_i(t)$. We measure the $i$th subject at $n_i$ time points. Thus, the longitudinal data consist of the measurements $y_{ij} = \{y_i(t_{ij}),\ j = 1, \ldots, n_i\}$ taken at time points $t_{ij}$. We denote the true and unobserved value of the longitudinal outcome at time $t$ as $m_i(t)$. We assume the relation between $y_i(t)$ and $m_i(t)$ to be

$$y_i(t) = m_i(t) + \varepsilon_i(t), \tag{1}$$

where $\varepsilon_i(t) \sim N(0, \sigma_\varepsilon^2)$.

When the survival function $S(t)$ is assumed to have a specific parametric form associated with a longitudinal submodel, estimation of the parameters of interest is usually based on the likelihood function [2]. In the maximum likelihood method, different types of covariates in the longitudinal submodel are treated differently. Here, we present the two categories of time-dependent covariates; the estimation techniques for these covariates are introduced in the following sections. Let the time-dependent covariate for the $i$th subject at time $t$ be denoted by $y_i(t)$, and let $Y_i(t) = \{y_i(s),\ 0 \le s < t\}$ denote the covariate history of the $i$th subject up to time $t$. According to Kalbfleisch and Prentice [7], the exogenous covariates are the covariates satisfying the condition

$$\Pr\big(s \le T_i < s + ds \mid T_i \ge s,\ Y_i(s)\big) = \Pr\big(s \le T_i < s + ds \mid T_i \ge s,\ Y_i(t)\big), \tag{2}$$

for all $s, t$ such that $0 < s \le t$ and $ds \to 0$. An equivalent definition is
