#### **2. Notation and definitions**

In the following sections, $R$ is the randomization indicator, where $R = 2$ for subjects randomized to the test treatment and $R = 1$ for subjects randomized to the control. Similarly, $X$ indicates the treatment actually received, which may not be randomized under protocol violations such as noncompliance: $X = 2$ for subjects who received the test treatment, $X = 1$ for subjects who received the control, and $X = 0$ for subjects who received no treatment. The observed outcome is $Y$, and $Y_{X=x}$ is the counterfactual value (equivalently, the potential outcome) of $Y$ if treatment $X$ were set to $x$ (Rubin, 1974, 1978, 1990). ACE is defined as $\text{ACE} \equiv \text{E}(Y_{X=2}) - \text{E}(Y_{X=1})$. The ITT and PP estimators are represented by $\text{ITT} \equiv \text{E}(Y \mid R = 2) - \text{E}(Y \mid R = 1)$ and $\text{PP} \equiv \text{E}(Y \mid X = 2, R = 2) - \text{E}(Y \mid X = 1, R = 1)$, respectively. Furthermore, we use the notation $E_{xr} = \text{E}(Y \mid X = x, R = r)$ and $p_{x|r} = \Pr(X = x \mid R = r)$, so that $\text{PP} \equiv E_{22} - E_{11}$.

We require the consistency assumption that $Y_{X=x} = Y$ for all subjects with $X = x$; that is, the value of $Y$ that would have been observed if $X$ had been set to the value it in fact took is equal to the value of $Y$ that was in fact observed. This assumption implies that $\text{E}(Y_{X=x} \mid X = x) = \text{E}(Y \mid X = x)$ and, furthermore, $\text{E}(Y_{X=x} \mid X = x, R = r) = \text{E}(Y \mid X = x, R = r) = E_{xr}$. We also assume that $Y_{X=x}$ is independent of $X$ given $R$ and $Z$, where $Z$ is a confounder or a set of confounders between $X$ and $Y$.

In Sections 3-5, we also require the instrumental variable (IV) assumption, which states that the potential outcome $Y_{X=x}$ is not affected directly by the treatment assignment $R$; rather, $Y_{X=x}$ is influenced only by the treatment actually received (Holland, 1986; Angrist et al., 1996). Thus, subjects' potential outcomes are independent of treatment assignment and are constant across the sub-populations of subjects assigned to different treatment arms. The IV assumption is formalized as follows:

> ASSUMPTION 1: Instrumental variable (IV) $\text{E}(Y_{X=x} \mid R = 2) = \text{E}(Y_{X=x} \mid R = 1)$.

This assumption may hold in successfully blinded randomized trials, because subjects are not aware of their assigned treatments and so the assigned treatments do not affect the potential outcomes. However, it often may not hold in unblinded trials, in which subjects are aware of the assigned treatment and this knowledge may affect the potential outcomes; in such trials the assumption needs to be critically evaluated. Assumption 1 is used in Sections 3-5, but is relaxed in Section 6.
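To make the notation concrete, the following minimal sketch computes $E_{xr}$, $p_{x|r}$, ITT and PP from a table of counts for a binary outcome under switching-type noncompliance. The counts are hypothetical and are not taken from any trial; the names `counts`, `summarize`, `itt` and `pp` are illustrative assumptions rather than part of the chapter.

```python
from fractions import Fraction

# Hypothetical counts (not from any trial):
# counts[(r, x)] = (number of events, number of subjects)
# for assigned arm r and received treatment x.
counts = {
    (2, 2): (30, 200),  # assigned test, received test
    (2, 1): (20, 100),  # assigned test, received control
    (1, 2): (5, 50),    # assigned control, received test
    (1, 1): (60, 250),  # assigned control, received control
}

def summarize(counts):
    """Return E[x, r] = E(Y | X=x, R=r) and p[x, r] = Pr(X=x | R=r)."""
    E, p = {}, {}
    for r in (1, 2):
        n_r = sum(n for (rr, _x), (_e, n) in counts.items() if rr == r)
        for x in (1, 2):
            events, n = counts[(r, x)]
            E[x, r] = Fraction(events, n)
            p[x, r] = Fraction(n, n_r)
    return E, p

def itt(E, p):
    """ITT = E(Y | R=2) - E(Y | R=1), averaging over received treatment."""
    arm_mean = {r: sum(E[x, r] * p[x, r] for x in (1, 2)) for r in (1, 2)}
    return arm_mean[2] - arm_mean[1]

def pp(E):
    """PP = E_22 - E_11, comparing compliers in each arm."""
    return E[2, 2] - E[1, 1]

E, p = summarize(counts)
print("ITT =", itt(E, p), " PP =", pp(E))
```

Exact fractions are used so that the two estimators can be compared without rounding error; with real data, floating-point arithmetic would usually be sufficient.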

#### **3. Biases of estimates**



In this section and the next section, we discuss noncompliance by switching the treatment, which means that non-compliers in a sub-population assigned to treatment A receive treatment B and those assigned to treatment B receive treatment A. In this type of noncompliance, all subjects have the value $X = 1$ or $2$ (and not $X = 0$) for both $R = 1$ and $2$. Thus, $p_{0|r} = 0$ and $p_{1|r} + p_{2|r} = 1$. The derivations of equations in this section are given in Section 8.1.

In this section, we discuss how estimates from major analyses, such as ITT and PP, are biased. To do so, we introduce the following $R$-specific bias factors due to confounding between $X$ and $Y$ (Brumback et al., 2004; Chiba et al., 2007):

$$\alpha_r \equiv \text{E}(Y_{X=2} \mid X = 2, R = r) - \text{E}(Y_{X=2} \mid X = 1, R = r),$$

$$\beta_r \equiv \text{E}(Y_{X=1} \mid X = 2, R = r) - \text{E}(Y_{X=1} \mid X = 1, R = r),$$

where $r = 1, 2$. The quantities $\alpha_r$ and $\beta_r$ are confounding effects that would arise from $R$-stratified comparisons of those with $X = 2$ versus those with $X = 1$. When $\alpha_r > 0$ and $\beta_r > 0$, $\text{E}(Y_{X=x} \mid X = 2, R = r) > \text{E}(Y_{X=x} \mid X = 1, R = r)$ for $x = 1, 2$, which means that the subjects who received the test treatment tend to have larger outcome values than those who received the control, leading to positive confounding. Conversely, when $\alpha_r < 0$ and $\beta_r < 0$, $\text{E}(Y_{X=x} \mid X = 2, R = r) < \text{E}(Y_{X=x} \mid X = 1, R = r)$, which means that the subjects who received the test treatment tend to have smaller outcome values than those who received the control, leading to negative confounding. No confounding occurs between $X$ and $Y$ when $\alpha_r = \beta_r = 0$.

Under Assumption 1, using $\alpha_r$ and $\beta_r$, $\text{E}(Y_{X=2})$ and $\text{E}(Y_{X=1})$ are expressed as:

$$\text{E}(Y_{X=2}) = E_{2r} - \alpha_r\, p_{1|r}, \tag{3.1}$$

$$\text{E}(Y_{X=1}) = E_{1r} + \beta_r\, p_{2|r}. \tag{3.2}$$
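As a brief sketch of why equation (3.1) holds (the chapter's own derivation is deferred to Section 8.1 and may be presented differently), one can expand $\text{E}(Y_{X=2})$ over the treatment received within arm $r$. The first step uses Assumption 1, the third uses consistency together with the definition of $\alpha_r$, and the last uses $p_{1|r} + p_{2|r} = 1$:

$$
\begin{aligned}
\text{E}(Y_{X=2}) &= \text{E}(Y_{X=2} \mid R = r) \\
&= \text{E}(Y_{X=2} \mid X = 2, R = r)\, p_{2|r} + \text{E}(Y_{X=2} \mid X = 1, R = r)\, p_{1|r} \\
&= E_{2r}\, p_{2|r} + (E_{2r} - \alpha_r)\, p_{1|r} \\
&= E_{2r} - \alpha_r\, p_{1|r}.
\end{aligned}
$$

Equation (3.2) follows in the same way, with $\text{E}(Y_{X=1} \mid X = 1, R = r) = E_{1r}$ and $\text{E}(Y_{X=1} \mid X = 2, R = r) = E_{1r} + \beta_r$.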

Using these equations, $\text{ITT} \equiv \text{E}(Y \mid R = 2) - \text{E}(Y \mid R = 1)$ can be expressed as a function of $\text{ACE} \equiv \text{E}(Y_{X=2}) - \text{E}(Y_{X=1})$ and the bias factors:


$$\text{ITT} = \text{ACE} + \{\alpha_2 - (E_{22} - E_{12})\}\, p_{1|2} + \{\beta_1 - (E_{21} - E_{11})\}\, p_{2|1}.\tag{3.3}$$

Thus, the ITT estimator is generally a biased estimator of ACE. It is unbiased when $\alpha_2 = E_{22} - E_{12}$ and $\beta_1 = E_{21} - E_{11}$, i.e., when $\text{E}(Y_{X=2} \mid X = x, R = r) = \text{E}(Y_{X=1} \mid X = x, R = r)$ for $x \ne r$. This condition implies that the ITT estimate can be unbiased when no treatment effect exists for any subject (under the sharp null hypothesis: $Y_{X=2} = Y_{X=1}$ for all subjects). Furthermore, equation (3.3) shows that, if we know whether the treatment effect is positive or negative, we can determine the sign of the bias of the ITT estimate.

Likewise, it can be demonstrated that the PP estimator is generally a biased estimator of ACE, because the difference between equation (3.1) with $r = 2$ and equation (3.2) with $r = 1$ yields:

$$\text{PP} = \text{ACE} + \alpha_2\, p_{1|2} + \beta_1\, p_{2|1}.\tag{3.4}$$

This equation shows that the PP estimate is unbiased when $\alpha_2 = 0$ and $\beta_1 = 0$, which holds when whether subjects receive the test treatment or the control treatment is randomly determined (no confounder exists between $X$ and $Y$). Furthermore, if we know the common sign of the confounding effects (the common sign of $\alpha_r$ and $\beta_r$), we can determine the sign of the bias of the PP estimate.

In addition to the ITT and PP estimators, the IV estimator has been developed (Cuzick et al., 1997; Greenland, 2000; Hernán & Robins, 2006). The estimate is calculated by the following formula:

$$\text{IV} \equiv \{ \text{E}(Y \mid R=2) - \text{E}(Y \mid R=1) \} / ( p_{2|2} - p_{2|1} )$$

for $p_{2|2} \ne p_{2|1}$. Although the IV estimator may yield a less biased estimate of ACE, it is also generally biased. This is because the IV estimator is expressed using bias factors as follows (Chiba, 2010a):

$$\text{IV} = \text{ACE} - w_1(\alpha_1 - \beta_1) + w_2(\alpha_2 - \beta_2), \tag{3.5}$$

where $w_r = p_{1|r}\, p_{2|r}/(p_{2|2} - p_{2|1})$ and $p_{2|2} \ne p_{2|1}$. Thus, the IV estimate is unbiased when $\alpha_r = \beta_r$ for $r = 1, 2$, i.e., $\text{E}(Y_{X=2} - Y_{X=1} \mid X = 2, R = r) = \text{E}(Y_{X=2} - Y_{X=1} \mid X = 1, R = r)$. Similar to the ITT estimate, the IV estimate is also unbiased when no treatment effect exists for any subject (under the sharp null hypothesis: $Y_{X=2} = Y_{X=1}$ for all subjects). Additionally, the IV estimate can be unbiased when $\text{E}(Y_{X=2} - Y_{X=1} \mid X = x, R = 2) = \text{E}(Y_{X=2} - Y_{X=1} \mid X = x, R = 1)$ (Robins, 1989). Furthermore, as an alternative to the IV estimator, Chiba (2010b) proposed the following estimator of ACE:

$$\text{IV}' \equiv (E_{22}\, p_{1|1} + E_{12}\, p_{2|1} - E_{21}\, p_{1|2} - E_{11}\, p_{2|2}) / (p_{2|2} - p_{2|1}).$$

This estimator is also generally a biased estimator of ACE; it is unbiased under $\alpha_1 = \alpha_2$ and $\beta_1 = \beta_2$, which may be reasonable when the influence of confounding between $X$ and $Y$ is equal in both assigned groups.
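The bias expressions in this section can be checked numerically. The sketch below is an illustration under stated assumptions, not the chapter's own computation: it takes hypothetical values of $\text{E}(Y_{X=1})$, $\text{E}(Y_{X=2})$, $p_{2|r}$, $\alpha_r$ and $\beta_r$, builds the observable means $E_{xr}$ implied by equations (3.1) and (3.2), computes ITT, PP, IV and IV' from those observables only, and compares them with the right-hand sides of equations (3.3)-(3.5). The function name `estimators` and all numerical inputs are assumptions made for the example.

```python
from fractions import Fraction as F

def estimators(mu1, mu2, p2, alpha, beta):
    """Compute ITT, PP, IV, IV' and the right-hand sides of (3.3)-(3.5).

    mu1, mu2          : E(Y_{X=1}) and E(Y_{X=2}), the same in both arms (Assumption 1)
    p2[r]             : Pr(X=2 | R=r); switching only, so Pr(X=1 | R=r) = 1 - p2[r]
    alpha[r], beta[r] : R-specific bias factors
    """
    arms = (1, 2)
    p1 = {r: 1 - p2[r] for r in arms}
    # Observable means implied by (3.1) and (3.2).
    E2 = {r: mu2 + alpha[r] * p1[r] for r in arms}  # E_{2r}
    E1 = {r: mu1 - beta[r] * p2[r] for r in arms}   # E_{1r}
    ace = mu2 - mu1
    d = p2[2] - p2[1]

    # Estimators, computed from observable quantities only.
    arm_mean = {r: E2[r] * p2[r] + E1[r] * p1[r] for r in arms}
    itt = arm_mean[2] - arm_mean[1]
    pp = E2[2] - E1[1]
    iv = itt / d
    iv_alt = (E2[2] * p1[1] + E1[2] * p2[1]
              - E2[1] * p1[2] - E1[1] * p2[2]) / d

    # Right-hand sides of (3.3), (3.4) and (3.5).
    rhs_itt = (ace + (alpha[2] - (E2[2] - E1[2])) * p1[2]
               + (beta[1] - (E2[1] - E1[1])) * p2[1])
    rhs_pp = ace + alpha[2] * p1[2] + beta[1] * p2[1]
    w = {r: p1[r] * p2[r] / d for r in arms}
    rhs_iv = ace - w[1] * (alpha[1] - beta[1]) + w[2] * (alpha[2] - beta[2])
    return dict(ace=ace, itt=itt, pp=pp, iv=iv, iv_alt=iv_alt,
                rhs_itt=rhs_itt, rhs_pp=rhs_pp, rhs_iv=rhs_iv)

# Hypothetical inputs with unequal bias factors across arms.
res = estimators(F(3, 10), F(1, 2), {1: F(1, 10), 2: F(4, 5)},
                 alpha={1: F(1, 5), 2: F(1, 4)},
                 beta={1: F(1, 10), 2: F(1, 8)})
for name, value in res.items():
    print(name, "=", value)

# Same inputs, but with alpha_1 = alpha_2 and beta_1 = beta_2.
res_equal = estimators(F(3, 10), F(1, 2), {1: F(1, 10), 2: F(4, 5)},
                       alpha={1: F(1, 5), 2: F(1, 5)},
                       beta={1: F(1, 10), 2: F(1, 10)})
print("IV' equals ACE:", res_equal["iv_alt"] == res_equal["ace"])
```

With the first set of inputs, none of the four estimators coincides with ACE, while the second call illustrates the condition under which IV' is unbiased.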

#### **4. Bounds on average causal effect**

In randomized trials with noncompliance by switching the treatment, we cannot generally estimate ACE in an unbiased manner (Section 3). Thus, in this section, we discuss bounds on ACE. We introduce the bounds under some assumptions in Section 4.1, and illustrate them by using data from a classic randomized trial in Section 4.2. The derivations of inequalities in this section are outlined in Section 8.2.

#### **4.1 Assumptions and bounds**


In Section 4.1.1, we introduce bounds on ACE under Assumption 1 only. Because the bounds generally have a broad width, we present the bounds with narrower widths by adding some plausible assumptions in Sections 4.1.2 and 4.1.3.

#### **4.1.1 The instrumental variable**

When the outcome $Y$ has a finite range $[K_0, K_1]$, the bounds on ACE under Assumption 1 are as follows (Robins, 1989; Manski, 1990):

$$\begin{split} & \max \begin{Bmatrix} K_0\, p_{1|1} + E_{21}\, p_{2|1} \\ K_0\, p_{1|2} + E_{22}\, p_{2|2} \end{Bmatrix} - \min \begin{Bmatrix} E_{11}\, p_{1|1} + K_1\, p_{2|1} \\ E_{12}\, p_{1|2} + K_1\, p_{2|2} \end{Bmatrix} \\ & \leq \text{ACE} \leq \min \begin{Bmatrix} K_1\, p_{1|1} + E_{21}\, p_{2|1} \\ K_1\, p_{1|2} + E_{22}\, p_{2|2} \end{Bmatrix} - \max \begin{Bmatrix} E_{11}\, p_{1|1} + K_0\, p_{2|1} \\ E_{12}\, p_{1|2} + K_0\, p_{2|2} \end{Bmatrix} . \end{split} \tag{4.1}$$
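Inequality (4.1) translates directly into code. The following is a minimal sketch assuming that $E_{xr}$ and $p_{x|r}$ are supplied as dictionaries keyed by `(x, r)`; the function name `iv_bounds` and the numerical values are hypothetical. In each arm, the unobserved counterfactual mean is replaced by the smallest or largest possible outcome value, and the arm giving the tighter limit is kept.

```python
from fractions import Fraction as F

def iv_bounds(E, p, K0=0, K1=1):
    """Bounds (4.1) on ACE under Assumption 1 for an outcome in [K0, K1].

    E[x, r] = E(Y | X=x, R=r) and p[x, r] = Pr(X=x | R=r), for x, r in {1, 2}.
    Returns (lower, upper).
    """
    arms = (1, 2)
    # E(Y_{X=2}): observed among X=2, replaced by K0 or K1 among X=1.
    lo2 = max(K0 * p[1, r] + E[2, r] * p[2, r] for r in arms)
    hi2 = min(K1 * p[1, r] + E[2, r] * p[2, r] for r in arms)
    # E(Y_{X=1}): observed among X=1, replaced by K0 or K1 among X=2.
    lo1 = max(E[1, r] * p[1, r] + K0 * p[2, r] for r in arms)
    hi1 = min(E[1, r] * p[1, r] + K1 * p[2, r] for r in arms)
    return lo2 - hi1, hi2 - lo1

# Hypothetical observable quantities for a binary outcome.
E = {(2, 2): F(11, 20), (1, 2): F(1, 5), (2, 1): F(17, 25), (1, 1): F(29, 100)}
p = {(2, 2): F(4, 5), (1, 2): F(1, 5), (2, 1): F(1, 10), (1, 1): F(9, 10)}
lower, upper = iv_bounds(E, p)
print(f"{lower} <= ACE <= {upper}")
```

The width of the resulting interval depends on the noncompliance proportions, which is one way to see why these bounds are often wide.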

Note that $K_0 = 0$ and $K_1 = 1$ in the case of a binary outcome. Furthermore, using a method of linear programming in the case of a binary outcome, Balke and Pearl (1997) presented the following bounds under Assumption 1 only:

$$
\max\begin{Bmatrix}
P_{12|2} + P_{01|1} - 1 \\
P_{12|1} + P_{01|2} - 1 \\
P_{12|1} - P_{12|2} - P_{11|2} - P_{02|1} - P_{11|1} \\
P_{12|2} - P_{12|1} - P_{11|1} - P_{02|2} - P_{11|2} \\
-P_{02|2} - P_{11|2} \\
-P_{02|1} - P_{11|1} \\
P_{01|2} - P_{02|2} - P_{11|2} - P_{02|1} - P_{01|1} \\
P_{01|1} - P_{02|1} - P_{11|1} - P_{02|2} - P_{01|2}
\end{Bmatrix}
\leq \text{ACE} \leq
\min\begin{Bmatrix}
1 - P_{02|2} - P_{11|1} \\
1 - P_{02|1} - P_{11|2} \\
-P_{02|1} + P_{02|2} + P_{01|2} + P_{12|1} + P_{01|1} \\
-P_{02|2} + P_{12|2} + P_{01|2} + P_{02|1} + P_{01|1} \\
P_{12|2} + P_{01|2} \\
P_{12|1} + P_{01|1} \\
-P_{11|2} + P_{12|2} + P_{01|2} + P_{12|1} + P_{11|1} \\
-P_{11|1} + P_{12|1} + P_{01|1} + P_{12|2} + P_{11|2}
\end{Bmatrix},
\tag{4.2}
$$

where $P_{yx|r} = \Pr(Y = y, X = x \mid R = r)$ ($y = 0, 1$). Inequality (4.2), which gives the bounds on ACE with the narrowest width obtainable without adding any other assumptions, is narrower than inequality (4.1) in some situations. However, these bounds generally have broad widths. Thus, in Sections 4.1.2 and 4.1.3, we derive bounds with narrower widths by adding some plausible assumptions.
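The sixteen expressions in inequality (4.2) are easy to mistype, so a small helper may be useful. The sketch below assumes the joint probabilities are stored in a dictionary keyed by `(y, x, r)`, with missing keys treated as zero; the name `balke_pearl_bounds` and the example values are hypothetical. The example uses perfect compliance, in which case the bounds should collapse to the simple difference in event proportions between arms.

```python
from fractions import Fraction as F

def balke_pearl_bounds(P):
    """Bounds (4.2) on ACE for a binary outcome.

    P[y, x, r] = Pr(Y=y, X=x | R=r), y in {0, 1}, x and r in {1, 2}.
    Returns (lower, upper).
    """
    def q(y, x, r):
        return P.get((y, x, r), 0)

    lower = max(
        q(1, 2, 2) + q(0, 1, 1) - 1,
        q(1, 2, 1) + q(0, 1, 2) - 1,
        q(1, 2, 1) - q(1, 2, 2) - q(1, 1, 2) - q(0, 2, 1) - q(1, 1, 1),
        q(1, 2, 2) - q(1, 2, 1) - q(1, 1, 1) - q(0, 2, 2) - q(1, 1, 2),
        -q(0, 2, 2) - q(1, 1, 2),
        -q(0, 2, 1) - q(1, 1, 1),
        q(0, 1, 2) - q(0, 2, 2) - q(1, 1, 2) - q(0, 2, 1) - q(0, 1, 1),
        q(0, 1, 1) - q(0, 2, 1) - q(1, 1, 1) - q(0, 2, 2) - q(0, 1, 2),
    )
    upper = min(
        1 - q(0, 2, 2) - q(1, 1, 1),
        1 - q(0, 2, 1) - q(1, 1, 2),
        -q(0, 2, 1) + q(0, 2, 2) + q(0, 1, 2) + q(1, 2, 1) + q(0, 1, 1),
        -q(0, 2, 2) + q(1, 2, 2) + q(0, 1, 2) + q(0, 2, 1) + q(0, 1, 1),
        q(1, 2, 2) + q(0, 1, 2),
        q(1, 2, 1) + q(0, 1, 1),
        -q(1, 1, 2) + q(1, 2, 2) + q(0, 1, 2) + q(1, 2, 1) + q(1, 1, 1),
        -q(1, 1, 1) + q(1, 2, 1) + q(0, 1, 1) + q(1, 2, 2) + q(1, 1, 2),
    )
    return lower, upper

# Hypothetical perfect compliance: everyone receives the assigned treatment,
# with event proportions 3/10 (test arm) and 1/2 (control arm).
perfect = {(1, 2, 2): F(3, 10), (0, 2, 2): F(7, 10),
           (1, 1, 1): F(1, 2), (0, 1, 1): F(1, 2)}
print(balke_pearl_bounds(perfect))
```

Under noncompliance the two limits separate, and the interval is never wider than the one given by inequality (4.1) with $K_0 = 0$ and $K_1 = 1$.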

#### **4.1.2 The monotone treatment response**

To derive narrower bounds, Manski (1997) presented the following monotone treatment response (MTR) assumption:

> ASSUMPTION 2.1: Monotone treatment response (MTR) $Y_{X=s} \ge Y_{X=t}$ for all subjects, where $s \ge t$.

For (*s*, *t*) = (2, 1), the MTR means that a subject takes a larger outcome value if he/she received the test treatment than if he/she received the control. This holds when it is apparent that the test treatment has a positive effect.

Under Assumptions 1 and 2.1, the lower bound on ACE is improved as follows:

$$\text{ACE} \ge \max\{\text{ITT}, -\text{ITT}\}.\tag{4.3}$$


Thus, we can say that ACE is not less than the ITT estimate when the MTR holds. Note that the second and third terms in equation (3.3) are not more than 0 under the MTR. Indeed, by consistency, $\alpha_2 - (E_{22} - E_{12}) = \text{E}(Y_{X=1} \mid X = 1, R = 2) - \text{E}(Y_{X=2} \mid X = 1, R = 2)$ and $\beta_1 - (E_{21} - E_{11}) = \text{E}(Y_{X=1} \mid X = 2, R = 1) - \text{E}(Y_{X=2} \mid X = 2, R = 1)$, and $\text{E}(Y_{X=2} \mid X = x, R = r) \ge \text{E}(Y_{X=1} \mid X = x, R = r)$ under the MTR, so that $\alpha_2 \le E_{22} - E_{12}$ and $\beta_1 \le E_{21} - E_{11}$. Using the reverse sign of the inequality in Assumption 2.1, the following reverse MTR (RMTR) assumption can be applied:

> ASSUMPTION 2.2: Reverse monotone treatment response (RMTR) $Y_{X=s} \le Y_{X=t}$ for all subjects, where $s \ge t$.

In contrast to the MTR, for (*s*, *t*) = (2, 1), the RMTR means that a subject takes a smaller outcome value if he/she received the test treatment than if he/she received the control. This holds when it is apparent that the test treatment has a negative effect. Under Assumptions 1 and 2.2, the upper bound on ACE is improved as ACE ≤ min{ITT, –ITT}, implying that ACE is not more than the ITT estimate when the RMTR holds.

Assumptions 2.1 and 2.2 are very strict assumptions, because the inequalities must hold for all subjects. In the case of a binary outcome variable, we can use an alternative assumption that is weaker than Assumptions 2.1 and 2.2, but yields the same bounds as those under these assumptions. This is introduced below after the concept of principal stratification (Frangakis & Rubin, 2002).

Based on principal stratification, four types of potential outcomes are defined as follows: doomed $\{Y_{X=2} = 1, Y_{X=1} = 1\}$, which consists of subjects who always experience the event, regardless of the treatment received; preventive $\{Y_{X=2} = 0, Y_{X=1} = 1\}$, which consists of subjects who do not experience the event when they receive the test treatment but do when they receive the control; causative $\{Y_{X=2} = 1, Y_{X=1} = 0\}$, which consists of subjects who experience the event when they receive the test treatment, but not when they receive the control; and immune $\{Y_{X=2} = 0, Y_{X=1} = 0\}$, which consists of subjects who never experience the event, regardless of the treatment received (Greenland & Robins, 1986). Because $X$ and $Y$ are binary, the potential outcomes could be any of these four types. Note that Assumption 2.1 implies that no preventive subject exists: $\Pr(Y_{X=2} = 0, Y_{X=1} = 1) = 0$, because $Y_{X=2} = 0$ and $Y_{X=1} = 1$ cannot hold simultaneously under $Y_{X=2} \ge Y_{X=1}$. Likewise, Assumption 2.2 implies that no causative subject exists.

We can obtain inequality (4.3) even under the following assumption (Chiba, 2011):

> ASSUMPTION 3.1: $\Pr(Y_{X=2} = 1, Y_{X=1} = 0 \mid X = x, R = r) \ge \Pr(Y_{X=2} = 0, Y_{X=1} = 1 \mid X = x, R = r)$.

This assumption indicates that the number of causative subjects is not less than the number of preventive subjects within all strata with *X* = *x* and *R* = *r*. Thus, Assumption 3.1 is weaker than Assumption 2.1, because Assumption 2.1 requires that no preventive subject exists but this is not the case for Assumption 3.1.

Likewise, the following assumption, 3.2, can derive the same upper bound as that under Assumption 2.2:

> ASSUMPTION 3.2: $\Pr(Y_{X=2} = 1, Y_{X=1} = 0 \mid X = x, R = r) \le \Pr(Y_{X=2} = 0, Y_{X=1} = 1 \mid X = x, R = r)$.

In contrast to Assumption 3.1, this assumption implies that the number of causative subjects is not more than the number of preventive subjects within all strata with *X* = *x* and *R* = *r*. Again, note that Assumption 2.2 implies that no causative subject exists and thus Assumption 3.2 is a weaker assumption than Assumption 2.2.
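The relation between the principal strata and these assumptions can be illustrated with a short sketch. For a binary outcome, the mean of $Y_{X=2} - Y_{X=1}$ within any group equals the proportion of causative subjects minus the proportion of preventive subjects, since doomed and immune subjects contribute zero. The stratum proportions below are hypothetical (they are never observable in practice), and the names `STRATA`, `mean_effect`, `no_preventive` and `assumption_3_1` are assumptions made for the example.

```python
from fractions import Fraction as F

# Principal strata coded as (Y_{X=2}, Y_{X=1}) for a binary outcome.
STRATA = {"doomed": (1, 1), "preventive": (0, 1),
          "causative": (1, 0), "immune": (0, 0)}

def mean_effect(mix):
    """E(Y_{X=2} - Y_{X=1}) for one mix: Pr(causative) - Pr(preventive)."""
    return sum(w * (STRATA[s][0] - STRATA[s][1]) for s, w in mix.items())

def no_preventive(by_xr):
    """Condition implied by Assumption 2.1: no preventive subject in any stratum."""
    return all(mix.get("preventive", 0) == 0 for mix in by_xr.values())

def assumption_3_1(by_xr):
    """Assumption 3.1: Pr(causative) >= Pr(preventive) within every (x, r) stratum."""
    return all(mix.get("causative", 0) >= mix.get("preventive", 0)
               for mix in by_xr.values())

# Hypothetical stratum mixes for each (x, r); each mix sums to 1.
by_xr = {
    (2, 2): {"doomed": F(1, 10), "preventive": F(1, 20),
             "causative": F(1, 4), "immune": F(3, 5)},
    (1, 2): {"doomed": F(1, 5), "preventive": F(1, 10),
             "causative": F(1, 10), "immune": F(3, 5)},
    (2, 1): {"doomed": F(1, 4), "preventive": F(0),
             "causative": F(1, 4), "immune": F(1, 2)},
    (1, 1): {"doomed": F(1, 10), "preventive": F(1, 20),
             "causative": F(3, 20), "immune": F(7, 10)},
}
print("No preventive subjects:", no_preventive(by_xr))
print("Assumption 3.1 holds:  ", assumption_3_1(by_xr))
```

In this hypothetical configuration, preventive subjects exist, so the condition implied by Assumption 2.1 fails, yet Assumption 3.1 still holds in every stratum.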

#### **4.1.3 The monotone treatment selection**


The other assumption to derive narrower bounds is the following monotone treatment selection assumption (Manski & Pepper, 2000; Chiba, 2010c):

> ASSUMPTION 4.1: Monotone treatment selection (MTS) $\text{E}(Y_{X=x} \mid X = s, R = r) \ge \text{E}(Y_{X=x} \mid X = t, R = r)$ for $s \ge t$.

For (*s*, *t*) = (2, 1), the MTS means that subjects who received the test treatment tend to have larger outcome values than those who received the control within each study treatment-arm subpopulation. For example, when patients with a worse condition prefer to receive the new treatment (*X* = 2), it should be anticipated that the incidence proportion of a bad event (*Y* = 1) such as death will be higher, compared with those who receive the standard treatment (*X* = 1); this indicates that the MTS holds.

Under Assumptions 1 and 4.1, the upper bound on ACE is improved as follows:

$$\text{ACE} \le \min\{E\_{21}, E\_{22}\} - \max\{E\_{11}, E\_{12}\}.\tag{4.4}$$

Specifically, when min{*E*21, *E*22} = *E*22 and max{*E*11, *E*12} = *E*11, the upper bound is equal to the PP estimator. Thus, ACE is no more than the PP estimate when the MTS holds. Note that this is also verified from equation (3.4) because Assumption 4.1 implies that *α<sup>r</sup>* ≥ 0 and *β<sup>r</sup>* ≥ 0. Similar to the RMTR, the following reverse MTS (RMTS) assumption can be applied:

$$\begin{array}{c} \text{ASSUMPTION 4.2: Reverse monotone treatment selection (RMTS)}\\ \text{E}(Y\_{X=x} \mid X=s, R=r) \le \text{E}(Y\_{X=x} \mid X=t, R=r) \text{ for } s \ge t. \end{array}$$

In contrast to the MTS, for (*s*, *t*) = (2, 1), the RMTS means that subjects who received the test treatment tend to have smaller outcome values than those who received the control within each study treatment-arm subpopulation. The lower bound on ACE under the RMTS is ACE ≥ max{*E*21, *E*22} – min{*E*11, *E*12}, implying that ACE is not less than the PP estimate when the RMTS holds.

It is obvious that the combination of Assumptions 2.1 and 4.1 improves both the lower and upper bounds:

$$
\max\{\text{ITT}, -\text{ITT}\} \le \text{ACE} \le \min\{E\_{21}, E\_{22}\} - \max\{E\_{11}, E\_{12}\}.
$$

Likewise, under the combination of Assumptions 2.2 and 4.2, bounds on ACE are

$$
\max\{E\_{21}, E\_{22}\} - \min\{E\_{11}, E\_{12}\} \le \text{ACE} \le \min\{\text{ITT}, -\text{ITT}\}.\tag{4.5}
$$

These inequalities show that ACE lies between the ITT and PP estimates under these combinations of assumptions.
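As a minimal sketch of how the two combined bounds above could be evaluated, the following Python function takes the four conditional means *Exr* and the ITT estimate and returns the corresponding interval. The function name `combined_bounds`, its arguments, and the input values are illustrative assumptions and do not come from the source.

```python
def combined_bounds(E, itt, reverse=False):
    """Bounds on ACE for noncompliance by switching the treatment.

    E maps (x, r) to E_xr = E(Y | X = x, R = r).
    reverse=False: Assumptions 2.1 and 4.1 (MTR with MTS).
    reverse=True:  Assumptions 2.2 and 4.2 (RMTR with RMTS), inequality (4.5).
    """
    if reverse:
        lower = max(E[2, 1], E[2, 2]) - min(E[1, 1], E[1, 2])
        upper = min(itt, -itt)
    else:
        lower = max(itt, -itt)
        upper = min(E[2, 1], E[2, 2]) - max(E[1, 1], E[1, 2])
    return lower, upper


# Hypothetical inputs chosen only for illustration (not from the source).
E_mts = {(2, 2): 0.30, (2, 1): 0.35, (1, 1): 0.10, (1, 2): 0.12}
lo_mts, up_mts = combined_bounds(E_mts, itt=0.05)

E_rmts = {(2, 2): 0.10, (2, 1): 0.08, (1, 1): 0.20, (1, 2): 0.22}
lo_rmts, up_rmts = combined_bounds(E_rmts, itt=-0.03, reverse=True)

print(f"MTR + MTS:   {lo_mts:.2f} <= ACE <= {up_mts:.2f}")
print(f"RMTR + RMTS: {lo_rmts:.2f} <= ACE <= {up_rmts:.2f}")
```

The sketch only evaluates the inequalities as stated; whether the assumptions themselves are plausible has to be argued from subject-matter knowledge.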

By extending a theory developed in the context of observational studies (VanderWeele, 2008a; Chiba, 2009a), Chiba (2009b) presented another assumption that derives the same upper bound as that under the MTS (Assumption 4.1):

ASSUMPTION 5.1: Monotone confounding (MC)

*Both* E(*Y*|*X* = 2, *R* = *r*, *Z* = *z*) *and* Pr(*X* = 2|*R* = *r*, *Z* = *z*) *are non-decreasing or non-increasing in z for all r, and the components of Z are independent of each other.*

For an assumption corresponding to the RMTS (Assumption 4.2), Assumption 5.1 is changed as follows:



ASSUMPTION 5.2: Reverse monotone confounding (RMC)

*One of* E(*Y*|*X* = 2, *R* = *r*, *Z* = *z*) *and* Pr(*X* = 2|*R* = *r*, *Z* = *z*) *is non-decreasing and the other is non-increasing in z for all r, and the components of Z are independent of each other.*

Although the MTS and MC (RMTS and RMC) give the same upper (lower) bound on ACE, the relationship between them has not been clear. In Section 8.2, we demonstrate that the MC implies the MTS, but it is unclear whether the converse holds.

#### **4.2 Application**

For illustration, the assumptions and bounds presented in this section are applied to data from the Multiple Risk Factor Intervention Trial (MRFIT) (MRFIT Research Group, 1982). The MRFIT was a large field trial to test the effect of a multifactorial intervention program on mortality from coronary heart disease (CHD) in middle-aged men with sufficiently high risk levels attributed to cigarette smoking, high serum cholesterol, and high blood pressure. Intervention consisted of dietary advice on ways to reduce blood cholesterol, smoking cessation counseling, and hypertension medication. All subjects were randomly assigned to the intervention program or the control group.

For this illustration, attention is restricted to the effects of cessation of cigarette smoking. This restriction follows other studies (Mark & Robins, 1993; Matsui, 2005; Chiba, 2010a) and was applied due to the paucity of differences achieved for the other risk factors. Table 2 summarizes the incidence of subject mortality due to CHD during the 7-year follow-up period based on the assigned treatment and the actual subject smoking status 1 year after study entry. *R* represents the assigned group (*R* = 2 for the test group and *R* = 1 for the control group), *X* is the actual smoking status 1 year after entry (*X* = 2 for smoking cessation and *X* = 1 for continued smoking), and *Y* is the incidence of CHD deaths (*Y* = 1 for dead and *Y* = 0 for alive). ITT and PP analyses yielded ITT = 69/3833 – 74/3830 = −0.13% and PP = 11/991 – 70/3456 = −0.92%, respectively. IV and IV' estimates were −0.82% and −0.72%, respectively.


| Group | No. of subjects | CHD deaths | Smoking status at 1 year | No. of subjects | CHD deaths |
|---|---|---|---|---|---|
| Test | 3833 | 69 | Quit | 991 | 11 |
| | | | Not quit | 2842 | 58 |
| Control | 3830 | 74 | Quit | 374 | 4 |
| | | | Not quit | 3456 | 70 |
| Totals | 7663 | 143 | | | |

Table 2. The status of cigarette smoking and the incidence of mortality due to CHD in the MRFIT during a 7-year follow-up period.

To derive the ACE bounds, it is necessary to discuss whether the assumptions in this section hold. At first sight, it seems clear that cessation of cigarette smoking prevents death from CHD, in which case Assumption 2.2 (RMTR: *YX*=2 ≤ *YX*=1 for all subjects) would hold (i.e., no causative subject, who died when they quit smoking but lived when they continued smoking, exists). However, it is possible that such subjects do exist, because the stress of quitting smoking might lead to CHD and this stress would have been lower if the subject had continued smoking. In that case, Assumption 2.2 does not hold. Nevertheless, Assumption 3.2 would still hold, because even if a few causative subjects exist, their number would be the smallest among the four principal strata.

In general, health-conscious individuals may be less likely to die from CHD and more likely to quit smoking than individuals who are not health-conscious. Trial subjects would likely have had similar tendencies, and subjects who quit smoking would logically tend not to have died from CHD. Therefore, it is considered that Assumption 4.2 (RMTS: E(*YX*=*<sup>x</sup>*|*X* = 2, *R* = *r*) ≤ E(*YX*=*<sup>x</sup>*|*X* = 1, *R* = *r*) for *x* = 1, 2 and *r* = 1, 2) is valid. Although Assumption 1 may not hold because this trial was unblinded (the details are discussed in Section 6), we use this assumption here for illustrative purposes.

The arguments presented above demonstrate that Assumptions 3.2 and 4.2 can be assumed. Thus, from inequality (4.5), the bounds on ACE become −0.92% ≤ ACE ≤ −0.13%. This result indicates that quitting smoking would prevent death from CHD. Note that the bounds under Assumption 1 only become −11.31% ≤ ACE ≤ 72.60%, where inequalities (4.1) and (4.2) yield the same bounds. While the bounds under Assumption 1 only do not give enough information about ACE, adding Assumptions 3.2 and 4.2 greatly improves the bounds.
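The arithmetic behind these figures can be retraced with a short script. The following is a minimal sketch that takes the counts in Table 2 and evaluates the ITT estimate, the PP estimate, and the bounds in inequality (4.5); the variable and function names are illustrative assumptions.

```python
# Counts from Table 2, keyed by (x, r): x = smoking status (2 = quit, 1 = not quit),
# r = assigned group (2 = test, 1 = control).
n = {(2, 2): 991, (1, 2): 2842, (2, 1): 374, (1, 1): 3456}  # subjects
d = {(2, 2): 11, (1, 2): 58, (2, 1): 4, (1, 1): 70}         # CHD deaths

# E_xr = E(Y | X = x, R = r)
E = {key: d[key] / n[key] for key in n}


def arm_risk(r):
    """E(Y | R = r): CHD deaths over subjects in assigned group r."""
    deaths = sum(v for (x, rr), v in d.items() if rr == r)
    total = sum(v for (x, rr), v in n.items() if rr == r)
    return deaths / total


itt = arm_risk(2) - arm_risk(1)
pp = E[2, 2] - E[1, 1]

# Inequality (4.5): lower bound from the RMTS, upper bound from the RMTR
# (or the weaker Assumption 3.2).
lower = max(E[2, 1], E[2, 2]) - min(E[1, 1], E[1, 2])
upper = min(itt, -itt)

print(f"ITT = {100 * itt:.2f}%, PP = {100 * pp:.2f}%")
print(f"{100 * lower:.2f}% <= ACE <= {100 * upper:.2f}%")
```

With these counts the lower bound coincides with the PP estimate because max{*E*21, *E*22} = *E*22 and min{*E*11, *E*12} = *E*11.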

#### **5. Noncompliance by receiving no treatment**

While noncompliance by switching the treatment was discussed in Sections 3 and 4, this section discusses noncompliance in which noncompliers receive no treatment. In this type of noncompliance, subjects who are allocated to *R* = 2 take the value of *X* = 0 or 2 (and not *X* = 1) and those who are allocated to *R* = 1 take the value of *X* = 0 or 1 (and not *X* = 2). Thus, *p*0|2 + *p*2|2 = 1 and *p*0|1 + *p*1|1 = 1. The derivations of equations and inequalities in this section are similar to those in Sections 3 and 4, and can be obtained straightforwardly by replacing *x* = 1, 2 in Sections 3 and 4 with *x* = 0, 1 and *x* = 0, 2. Thus, they are omitted.

#### **5.1 Biases of estimates**

By following a similar discussion to Section 3, we show that the ITT and PP estimators generally yield biased estimates of ACE. Unfortunately, the IV estimator cannot be defined in this type of noncompliance.

To express the biases of ITT and PP estimators, we introduce the following bias factors instead of *αr* and *βr* in Section 3:

$$\gamma \equiv \mathbb{E}(Y\_{X=2} \mid X = 2,\ R = 2) - \mathbb{E}(Y\_{X=2} \mid X = 0,\ R = 2),$$

$$\delta \equiv \mathbb{E}(Y\_{X=1} \mid X = 1,\ R = 1) - \mathbb{E}(Y\_{X=1} \mid X = 0,\ R = 1).$$

Similar to *αr* and *βr*, *γ* and *δ* are also confounding effects. *γ* is interpreted as a confounding effect that would arise from comparisons of those with *X* = 2 versus those with *X* = 0 for the test treatment group. When *γ* > 0, E(*YX*=2|*X* = 2, *R* = 2) > E(*YX*=2|*X* = 0, *R* = 2), which means that the subjects who received the test treatment tend to take larger outcome values than those who received no treatment. Conversely, when *γ* < 0, E(*YX*=2|*X* = 2, *R* = 2) < E(*YX*=2|*X* = 0, *R* = 2), which means that the subjects who received the test treatment tend to take smaller outcome values than those who received no treatment. Whether subjects in the test treatment group actually receive the treatment is randomly determined when *γ* = 0. *δ* is interpreted using a similar process in the control group.

Biases of ITT and PP estimators can be explained in a similar manner to Section 3, using *γ* and *δ*. Because E(*YX*=2) and E(*YX*=1) are expressed as E(*YX*=2) = *E*22 – *γp*0|2 and E(*YX*=1) = *E*11 – *δp*0|1, the ITT estimator is given by:


ITT = ACE + {*γ* – (*E*22 – *E*02)}*p*0|2 – {*δ* – (*E*11 – *E*01)}*p*0|1.

Therefore, the ITT estimator is generally a biased estimator of ACE, and can be unbiased when *γ* = *E*22 – *E*02 and *δ* = *E*11 – *E*01, i.e., E(*YX*=*<sup>r</sup>*|*X* = 0, *R* = *r*) = E(*YX*=0|*X* = 0, *R* = *r*) for *r* = 1, 2. This equation indicates that the ITT estimate can be unbiased when no effect of the treatments exists against no treatment for all subjects (under the sharp null hypothesis: *YX*=*<sup>x</sup>* = *YX*=0 for all subjects, where *x* = 1, 2).

The PP estimator is given by:

$$\text{PP} = \text{ACE} + \gamma p\_{0 \mid 2} - \delta p\_{0 \mid 1}.$$

Thus, the PP estimate can be unbiased when *γ* = 0 and *δ* = 0, implying that whether subjects receive the assigned treatment is randomly determined (no confounder exists between *X* and *Y*).

In contrast to the case of noncompliance by switching the treatment, it may be difficult to know the signs of biases of ITT and PP estimates.
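The two bias decompositions can be checked numerically. The following minimal sketch picks hypothetical values for the observable means, the noncompliance proportions, and the bias factors *γ* and *δ* (none of these numbers come from the source), computes ACE from E(*YX*=2) = *E*22 – *γp*0|2 and E(*YX*=1) = *E*11 – *δp*0|1, and confirms that the ITT and PP estimators satisfy the stated identities.

```python
# Hypothetical inputs chosen only for illustration (not from the source).
p0_2, p0_1 = 0.3, 0.2        # Pr(X = 0 | R = 2), Pr(X = 0 | R = 1)
E22, E02 = 0.40, 0.25        # E(Y | X = 2, R = 2), E(Y | X = 0, R = 2)
E11, E01 = 0.30, 0.28        # E(Y | X = 1, R = 1), E(Y | X = 0, R = 1)
gamma, delta = 0.10, -0.05   # bias factors

# Potential-outcome means and the average causal effect.
ace = (E22 - gamma * p0_2) - (E11 - delta * p0_1)

# Estimators computed from observable quantities only.
itt = (E22 * (1 - p0_2) + E02 * p0_2) - (E11 * (1 - p0_1) + E01 * p0_1)
pp = E22 - E11

# Right-hand sides of the bias decompositions in the text.
itt_rhs = ace + (gamma - (E22 - E02)) * p0_2 - (delta - (E11 - E01)) * p0_1
pp_rhs = ace + gamma * p0_2 - delta * p0_1

print(f"ACE = {ace:.3f}, ITT = {itt:.3f}, PP = {pp:.3f}")
print(f"ITT identity holds: {abs(itt - itt_rhs) < 1e-12}")
print(f"PP identity holds:  {abs(pp - pp_rhs) < 1e-12}")
```

In this hypothetical setting the two estimators err in different directions and by different amounts, which illustrates why the signs of the biases are hard to anticipate without knowledge of *γ* and *δ*.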

#### **5.2 Bounds on average causal effect**

We extend the bounds concept introduced in Section 4.1 to the case of noncompliance by receiving no treatment.

The bounds under Assumption 1 only are as follows:

$$(E\_{22}p\_{2\mid 2} + K\_0p\_{0\mid 2}) - (E\_{11}p\_{1\mid 1} + K\_1p\_{0\mid 1}) \le \text{ACE} \le (E\_{22}p\_{2\mid 2} + K\_1p\_{0\mid 2}) - (E\_{11}p\_{1\mid 1} + K\_0p\_{0\mid 1}), \tag{5.1}$$

where [*K*0, *K*1] is a finite range of outcome *Y*. In the case of a binary outcome, this inequality is simplified to:

$$P\_{12\mid 2} + P\_{01\mid 1} - 1 \le \text{ACE} \le 1 - P\_{02\mid 2} - P\_{11\mid 1}.$$
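The following minimal sketch evaluates inequality (5.1) and checks that it agrees with the simplified binary form. It assumes that *P*yx|r denotes Pr(*Y* = *y*, *X* = *x*|*R* = *r*), which is not restated in this section; the function name and the input values are also illustrative assumptions.

```python
def bounds_assumption1(E22, p0_2, E11, p0_1, K0=0.0, K1=1.0):
    """Bounds (5.1) on ACE when noncompliers receive no treatment.

    E22, E11: E(Y | X = R = 2) and E(Y | X = R = 1).
    p0_2, p0_1: Pr(X = 0 | R = 2) and Pr(X = 0 | R = 1).
    [K0, K1]: finite range of the outcome (0 and 1 for a binary outcome).
    """
    p2_2, p1_1 = 1 - p0_2, 1 - p0_1
    lower = (E22 * p2_2 + K0 * p0_2) - (E11 * p1_1 + K1 * p0_1)
    upper = (E22 * p2_2 + K1 * p0_2) - (E11 * p1_1 + K0 * p0_1)
    return lower, upper


# Hypothetical binary-outcome inputs (not from the source).
E22, p0_2, E11, p0_1 = 0.20, 0.30, 0.25, 0.20
lower, upper = bounds_assumption1(E22, p0_2, E11, p0_1)

# Joint probabilities P_yx|r = Pr(Y = y, X = x | R = r) for the binary form.
P12_2 = E22 * (1 - p0_2)
P02_2 = (1 - E22) * (1 - p0_2)
P11_1 = E11 * (1 - p0_1)
P01_1 = (1 - E11) * (1 - p0_1)
lower_bin = P12_2 + P01_1 - 1
upper_bin = 1 - P02_2 - P11_1

print(f"(5.1):       {lower:.2f} <= ACE <= {upper:.2f}")
print(f"binary form: {lower_bin:.2f} <= ACE <= {upper_bin:.2f}")
```

For a binary outcome the width of the interval is *p*0|2 + *p*0|1, so these bounds are informative only when noncompliance is limited.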

As in Section 4.1, the MTR and MTS assumptions and their reverse assumptions can be applied to obtain narrower bounds on ACE. For example, for (*s*, *t*) = (2, 0), Assumption 2.1 is *YX*=2 ≥ *YX*=0, which means that a subject takes a larger outcome value if he/she received the test treatment than if he/she received no treatment. This holds when it is apparent that the test treatment has a positive effect compared with no treatment. A similar interpretation is given for (*s*, *t*) = (1, 0) (*YX*=1 ≥ *YX*=0), with the control in place of the test treatment.

Under Assumptions 1 and 2.1, the lower bound of E(*YX*=*<sup>x</sup>*) becomes E(*YX*=*<sup>x</sup>*) ≥ E(*Y*|*R* = *x*) for *x* = 1, 2, which is derived using *t* = 0 in Assumption 2.1. Likewise, E(*YX*=*<sup>x</sup>*) ≤ E(*Y*|*R* = *x*) under Assumptions 1 and 2.2. In contrast to Section 4.1.2, these bounds of E(*YX*=*<sup>x</sup>*) do not by themselves give a bound on ACE. However, Assumption 2.1 can derive the following bounds by combination with inequality (5.1)<sup>1</sup>:

$$\mathbb{E}(Y \mid R=2) - (E\_{11}p\_{1\mid 1} + K\_1p\_{0\mid 1}) \le \text{ACE} \le (E\_{22}p\_{2\mid 2} + K\_1p\_{0\mid 2}) - \mathbb{E}(Y \mid R=1).$$

<sup>1</sup> If (*s, t*) = (2, 1) in Assumption 2.1 is used as in Section 4.1, the lower bound on ACE is improved to 0.

Similar to Assumptions 3.1 and 3.2, in the case of a binary outcome variable, we can make weaker assumptions that derive the same bounds as those under Assumptions 2.1 and 2.2, using the principal stratification approach. In the case of noncompliance by receiving no treatment, four types of potential outcomes, based on principal stratification, are redefined as follows: doomed {*YX*=*<sup>x</sup>* = 1, *YX*=0 = 1}, which consists of subjects who always experience the event, regardless of whether they receive the assigned treatment; preventive {*YX*=*<sup>x</sup>* = 0, *YX*=0 = 1}, which consists of subjects who do not experience the event when they receive the assigned treatment but do when they receive no treatment; causative {*YX*=*<sup>x</sup>* = 1, *YX*=0 = 0}, which consists of subjects who experience the event when they receive the assigned treatment, but not when they receive no treatment; and immune {*YX*=*<sup>x</sup>* = 0, *YX*=0 = 0}, which consists of subjects who never experience the event, regardless of whether they receive the assigned treatment. In the definition, *x* = 2 for the test treatment group (*R* = 2) and *x* = 1 for the control group (*R* = 1). Similar to Section 4.1.2, Assumption 2.1 implies that no preventive subject exists, and Assumption 2.2 implies that no causative subject exists.

Under this definition of principal strata, alternative assumptions of Assumptions 2.1 and 2.2 are as follows:

#### ASSUMPTION 3.3

$$\Pr(Y\_{X=x} = 1,\ Y\_{X=0} = 0 \mid X = R = x) \ge \Pr(Y\_{X=x} = 0,\ Y\_{X=0} = 1 \mid X = R = x) \text{ for } x = 1, 2.$$

#### ASSUMPTION 3.4

$$\Pr(Y\_{X=x} = 1,\ Y\_{X=0} = 0 \mid X = R = x) \le \Pr(Y\_{X=x} = 0,\ Y\_{X=0} = 1 \mid X = R = x) \text{ for } x = 1, 2.$$

Assumption 3.3 implies that the number of causative subjects is not less than the number of preventive subjects, and Assumption 3.4 implies that the number of causative subjects is not more than the number of preventive subjects, within both assigned groups. Thus, these assumptions are weaker than Assumptions 2.1 and 2.2. Nevertheless, they can give the same bounds as those under Assumptions 2.1 and 2.2.

The MTS and RMTS (Assumptions 4.1 and 4.2) can also be applied to the case of noncompliance by receiving no treatment. For example, for (*s*, *t*) = (2, 0) and *r* = 2, Assumption 4.1 is E(*YX*=*<sup>x</sup>*|*X* = 2, *R* = 2) ≥ E(*YX*=*<sup>x</sup>*|*X* = 0, *R* = 2), which means that subjects who received the assigned test treatment (i.e., compliers) tend to have larger outcome values than those who received no treatment (i.e., non-compliers) for the test treatment group. Under Assumptions 1 and 4.1, the upper bound of E(*YX*=*<sup>x</sup>*) becomes E(*YX*=*<sup>x</sup>*) ≤ *Exx* (E(*YX*=*<sup>x</sup>*) ≥ *Exx* under Assumptions 1 and 4.2) for *x* = 1, 2, because E(*YX*=*<sup>x</sup>*) is a weighted average of *Exx* and E(*YX*=*<sup>x</sup>*|*X* = 0, *R* = *x*). Thus, the combination with inequality (5.1) derives bounds on ACE of:

$$(E\_{22}p\_{2\mid 2} + K\_0p\_{0\mid 2}) - E\_{11} \le \text{ACE} \le E\_{22} - (E\_{11}p\_{1\mid 1} + K\_0p\_{0\mid 1}).$$

When both MTR and MTS hold, the bounds on ACE are:

$$\text{E}(Y \mid R=2) - E\_{11} \le \text{ACE} \le E\_{22} - \text{E}(Y \mid R=1),$$

because E(*Y*|*R* = *x*) ≤ E(*YX*=*<sup>x</sup>*) ≤ *Exx* for *x* = 1, 2. When both RMTR and RMTS hold, these signs of inequalities for E(*YX*=*<sup>x</sup>*) are reversed.
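To see how much each assumption narrows the interval, the following minimal sketch evaluates the bounds under the MTS alone and under the MTR and MTS together, for noncompliance by receiving no treatment. The function name and all input values are illustrative assumptions, not results from the source.

```python
def narrowed_bounds(E22, E02, p0_2, E11, E01, p0_1, K0=0.0):
    """Bounds on ACE under the MTS, and under the MTR and MTS together.

    Exx = E(Y | X = R = x), E0x = E(Y | X = 0, R = x), p0_x = Pr(X = 0 | R = x).
    K0 is the lower end of the range of Y.
    """
    p2_2, p1_1 = 1 - p0_2, 1 - p0_1
    EY_R2 = E22 * p2_2 + E02 * p0_2  # E(Y | R = 2)
    EY_R1 = E11 * p1_1 + E01 * p0_1  # E(Y | R = 1)
    mts = ((E22 * p2_2 + K0 * p0_2) - E11, E22 - (E11 * p1_1 + K0 * p0_1))
    mtr_mts = (EY_R2 - E11, E22 - EY_R1)
    return mts, mtr_mts


# Hypothetical inputs chosen only for illustration (not from the source).
mts, mtr_mts = narrowed_bounds(E22=0.40, E02=0.25, p0_2=0.3,
                               E11=0.30, E01=0.28, p0_1=0.2)

print(f"MTS:       {mts[0]:.3f} <= ACE <= {mts[1]:.3f}")
print(f"MTR + MTS: {mtr_mts[0]:.3f} <= ACE <= {mtr_mts[1]:.3f}")
```

Because E(*Y*|*R* = *x*) is never smaller than *Exxpx*|*<sup>x</sup>* + *K*0*p*0|*<sup>x</sup>*, the interval under both assumptions is always contained in the interval under the MTS alone.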

Finally, we note that the MC and RMC (Assumptions 5.1 and 5.2), which derive the same bounds as those under the MTS and RMTS (Assumptions 4.1 and 4.2), are changed as follows for the case of noncompliance by receiving no treatment:

#### ASSUMPTION 5.3: Monotone confounding (MC)

*Both* E(*Y*|*X* = *R* = *x*, *Z* = *z*) *and* Pr(*X* = *x*|*R* = *x*, *Z* = *z*) *are non-decreasing or non-increasing in z for x =* 1, 2 *and all r, and the components of Z are independent of each other.*


response. Furthermore, in addition to smoking cessation counseling, the intervention consisted of dietary advice to reduce blood cholesterol and hypertension medication. These interventions might also have influenced the incidence of CHD independent of smoking status. Thus, in this section, we relax the IV assumption to the following monotone instrumental variable (MIV) assumption (Manski & Pepper, 2000, 2009):

ASSUMPTION 6.1: Monotone instrumental variable (MIV)

E(*YX*=*<sup>x</sup>*|*R* = 2) ≥ E(*YX*=*<sup>x</sup>*|*R* = 1).

The MIV assumption simply replaces the equality in the IV assumption with an inequality, and means that the values of potential outcomes for subjects assigned to *R* = 2 are overall larger than those assigned to *R* = 1. For example, consider an unblinded trial to compare a new treatment with a standard treatment, where the outcome is a measure such that a larger value is better for the subject's health. In such a trial, subjects may think that the new treatment is more effective than the standard treatment, and this thinking may give rise to better results for subjects assigned to the new treatment than those assigned to the standard treatment; this indicates that the MIV holds.

We can also consider the following reverse MIV (RMIV) assumption:

ASSUMPTION 6.2: Reverse monotone instrumental variable (RMIV)

E(*YX*=*<sup>x</sup>*|*R* = 2) ≤ E(*YX*=*<sup>x</sup>*|*R* = 1).

We discuss the bounds on ACE under Assumptions 6.1 and 6.2 instead of Assumption 1. Noncompliance by switching the treatment (as in Section 4) is discussed in Section 6.1, and noncompliance by receiving no treatment (as in Section 5) is discussed in Section 6.2. The derivations of inequalities in this section are outlined in Section 8.3.

**6.1 Noncompliance by switching the treatment**

The bounds introduced in Section 4 are extended to those under the MIV and RMIV (Assumptions 6.1 and 6.2). Under the MIV and RMIV, the bounds on ACE are:

$$(E\_{21}p\_{2\mid 1} + K\_0p\_{1\mid 1}) - (E\_{12}p\_{1\mid 2} + K\_1p\_{2\mid 2}) \le \text{ACE} \le (E\_{22}p\_{2\mid 2} + K\_1p\_{1\mid 2}) - (E\_{11}p\_{1\mid 1} + K\_0p\_{2\mid 1}), \tag{6.1}$$

$$(E\_{22}p\_{2\mid 2} + K\_0p\_{1\mid 2}) - (E\_{11}p\_{1\mid 1} + K\_1p\_{2\mid 1}) \le \text{ACE} \le (E\_{21}p\_{2\mid 1} + K\_1p\_{1\mid 1}) - (E\_{12}p\_{1\mid 2} + K\_0p\_{2\mid 2}). \tag{6.2}$$

These inequalities correspond to inequality (4.1) with either *a* or *b* used in place of max{*a*, *b*} and min{*a*, *b*}. Therefore, the MIV and RMIV assumptions yield bounds on ACE with the same or broader width in comparison with the bounds under the IV assumption.

Even under the MIV (or RMIV) assumption, but not the IV assumption, we can derive bounds on ACE with narrower widths by applying assumptions in Section 4.2 (Chiba, 2010c). Each combination of the MIV or RMIV and the MTR or RMTR derives the improved lower or upper bounds on ACE in Table 3. Likewise, each combination of the MIV or RMIV and the MTS or RMTS derives the improved lower or upper bounds on ACE in Table 4.

| Assumptions | Improved bound on ACE |
|---|---|
| MIV + MTR | ACE ≥ max{−ITT, 0} |
| RMIV + MTR | ACE ≥ max{ITT, 0} |
| MIV + RMTR | ACE ≤ min{ITT, 0} |
| RMIV + RMTR | ACE ≤ min{−ITT, 0} |

Table 3. Improved bound on ACE under the MIV or RMIV and the MTR or RMTR, where ITT ≡ E(*Y*|*R* = 2) – E(*Y*|*R* = 1).

ASSUMPTION 5.4: Reverse monotone confounding (RMC)

 *One of* E(*Y|X = R = x, Z = z*) *and* Pr(*X = x|R = x, Z = z*) *is non-decreasing and the other is nonincreasing in z for x =* 1, 2 *and all r, and the components of Z are independent of each other.*

In practice, the assumptions presented in this section may hold for only one of the test treatment and control groups. In such cases, the assumptions can be applied to that group alone. An example of this situation is given in the next sub-section.

#### **5.3 Application**

We apply the assumptions and bounds presented in Section 5.2 to the CDP trial introduced in Section 1 (Table 1). *R* represents the assigned group (*R* = 2 for the clofibrate group and *R* = 1 for the placebo group), and *X* is the compliance status (*X* = 2 for compliers in the clofibrate group, *X* = 1 for compliers in the placebo group, and *X* = 0 for non-compliers). Here, compliers and non-compliers are patients receiving more or less than 80% of the assigned treatment, respectively. *Y* is the incidence of deaths (*Y* = 1 for dead and *Y* = 0 for alive). Again, we note that ITT and PP analyses yielded ITT = 194/1065 – 523/2695 = –1.19% and PP = 106/708 – 274/1813 = –0.14%, respectively.
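The ITT and PP figures above can be reproduced directly from the quoted ratios. The following is a minimal sketch; the variable and function names are our own, and the counts are simply the numerators and denominators given in the text.

```python
# Counts taken from the ratios quoted in the text (CDP trial, Table 1).
deaths = {"clofibrate": 194, "placebo": 523}            # deaths among all assigned subjects
assigned = {"clofibrate": 1065, "placebo": 2695}        # subjects per assigned group
complier_deaths = {"clofibrate": 106, "placebo": 274}   # deaths among compliers
compliers = {"clofibrate": 708, "placebo": 1813}        # compliers per assigned group


def risk_difference(events, totals):
    """Risk in the clofibrate group minus risk in the placebo group."""
    return (events["clofibrate"] / totals["clofibrate"]
            - events["placebo"] / totals["placebo"])


itt = risk_difference(deaths, assigned)           # E(Y|R = 2) - E(Y|R = 1)
pp = risk_difference(complier_deaths, compliers)  # E22 - E11

print(f"ITT = {itt:.2%}, PP = {pp:.2%}")
```

Both estimates are small and negative, which is why the bounds discussed next are needed to judge how much noncompliance could change the picture.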

As in Section 4.3, it is necessary to discuss whether the assumptions hold. There may be a placebo effect, but receiving the placebo is not expected to increase the proportion of deaths. Thus, Assumptions 2.2 (RMTR) and 3.4 can be assumed for (*s*, *t*) = (1, 0) and *x* = 1. However, a preventive effect of clofibrate may not be present (i.e., these assumptions may not be assumed for (*s*, *t*) = (2, 0) and *x* = 2) because of side-effects. The World Health Organization (WHO) has reported that in a large randomized trial, there were 25% more deaths in the clofibrate group than in the comparable high serum cholesterol control group (WHO, 1980). Because it is not clear whether clofibrate has a positive or negative effect, we cannot assume the MTR or RMTR (and Assumption 3.3 or 3.4) for the clofibrate group.

Among the patients in this trial, health-oriented subjects might be both less likely to die and more likely to comply with the assigned treatment than subjects who are less concerned about their health. If so, the RMTS (Assumption 4.4) would hold for both assigned groups. However, this reasoning may be criticized, because some patients might not have received the treatment because of side-effects, in which case the RMTS may not hold for the clofibrate group. Nevertheless, we assume the RMTS for both assigned groups for illustrative purposes. Assumption 1 would hold because this trial was double-blinded.

The arguments presented above demonstrate that the RMTR and RMTS can be assumed for the placebo group. Therefore, the bounds of E(*YX*=1) are *E*11 ≤ E(*YX*=1) ≤ E(*Y*|*R* = 1), which yield 15.11% ≤ E(*YX*=1) ≤ 19.41%. For the clofibrate group, the RMTS is assumed and then the bounds of E(*YX*=2) are *E*22 ≤ E(*YX*=2) ≤ *E*22*p*2|2 + *K*1*p*0|2 for *K*1 = 1, which yields 14.97% ≤ E(*YX*=2) ≤ 43.47%. In conclusion, the bounds on ACE are −4.43% ≤ ACE ≤ 28.36%. Unfortunately, we cannot conclude whether clofibrate is effective. However, these bounds improve on those obtained under Assumption 1 alone (−32.94% ≤ ACE ≤ 33.31%), especially the lower bound.
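The arithmetic behind these bounds can be checked with a short script. This is only a sketch under the assumptions stated in the text (RMTR and RMTS for the placebo group, RMTS for the clofibrate group, *K*1 = 1); the variable names are ours, and the proportion of non-compliers is obtained by subtraction because every subject is either a complier or a non-complier.

```python
K1 = 1.0  # upper limit of the binary outcome Y

# Placebo group (R = 1): counts from the ratios quoted in Section 5.3.
n1, compliers1, deaths1, complier_deaths1 = 2695, 1813, 523, 274
# Clofibrate group (R = 2).
n2, compliers2, complier_deaths2 = 1065, 708, 106

E11 = complier_deaths1 / compliers1   # E(Y | X = 1, R = 1)
EY_R1 = deaths1 / n1                  # E(Y | R = 1)
E22 = complier_deaths2 / compliers2   # E(Y | X = 2, R = 2)
p2_2 = compliers2 / n2                # Pr(X = 2 | R = 2)
p0_2 = 1 - p2_2                       # Pr(X = 0 | R = 2)

# Bounds on the two potential-outcome means, as stated in the text.
y1_lower, y1_upper = E11, EY_R1
y2_lower, y2_upper = E22, E22 * p2_2 + K1 * p0_2

# ACE = E(Y_{X=2}) - E(Y_{X=1}): pair each extreme of one mean with
# the opposite extreme of the other.
ace_lower = y2_lower - y1_upper
ace_upper = y2_upper - y1_lower

print(f"{y1_lower:.2%} <= E(Y_X=1) <= {y1_upper:.2%}")
print(f"{y2_lower:.2%} <= E(Y_X=2) <= {y2_upper:.2%}")
print(f"{ace_lower:.2%} <= ACE <= {ace_upper:.2%}")
```

The wide upper bound comes almost entirely from the term *K*1*p*0|2: about a third of the clofibrate group did not comply, and nothing is assumed that would cap their outcomes under treatment below 1.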

#### **6. Monotone instrumental variable**

Sections 3-5 relied on the IV assumption (Assumption 1). As mentioned in Section 2, however, this assumption often may not hold in unblinded trials, in which subjects are aware of the assigned treatment and this knowledge may affect the potential outcomes. In the MRFIT (Section 4.3), subjects would have been aware of their assigned group because it was an unblinded trial, and thus the intervention itself might have evoked a psychological response. Furthermore, in addition to smoking cessation counseling, the intervention consisted of dietary advice to reduce blood cholesterol and hypertension medication. These interventions might also have influenced the incidence of CHD independent of smoking status. Thus, in this section, we relax the IV assumption to the following monotone instrumental variable (MIV) assumption (Manski & Pepper, 2000, 2009):

> ASSUMPTION 6.1: Monotone instrumental variable (MIV) E(*YX*=*<sup>x</sup>*|*R* = 2) ≥ E(*YX*=*<sup>x</sup>*|*R* = 1).

The MIV assumption simply replaces the equality in the IV assumption with an inequality, and means that the potential outcomes for subjects assigned to *R* = 2 are, on average, at least as large as those for subjects assigned to *R* = 1. For example, consider an unblinded trial to compare a new treatment with a standard treatment, where the outcome is a measure such that a larger value is better for the subject's health. In such a trial, subjects may believe that the new treatment is more effective than the standard treatment, and this belief may lead to better results for subjects assigned to the new treatment than for those assigned to the standard treatment; in that case, the MIV holds.

We can also consider the following reverse MIV (RMIV) assumption:

> ASSUMPTION 6.2: Reverse monotone instrumental variable (RMIV) E(*YX*=*<sup>x</sup>*|*R* = 2) ≤ E(*YX*=*<sup>x</sup>*|*R* = 1).

We discuss the bounds on ACE under Assumptions 6.1 and 6.2 instead of Assumption 1. Noncompliance by switching the treatment (as in Section 4) is discussed in Section 6.1, and noncompliance by receiving no treatment (as in Section 5) is discussed in Section 6.2. The derivations of the inequalities in this section are outlined in Section 8.3.

#### **6.1 Noncompliance by switching the treatment**

The bounds introduced in Section 4 are extended to those under the MIV and RMIV (Assumptions 6.1 and 6.2). Under the MIV and RMIV, the bounds on ACE are given by inequalities (6.1) and (6.2), respectively:

$$(E_{21}p_{2\mid 1} + K_0p_{1\mid 1}) - (E_{12}p_{1\mid 2} + K_1p_{2\mid 2}) \le \text{ACE} \le (E_{22}p_{2\mid 2} + K_1p_{1\mid 2}) - (E_{11}p_{1\mid 1} + K_0p_{2\mid 1}), \tag{6.1}$$

$$(E_{22}p_{2\mid 2} + K_0p_{1\mid 2}) - (E_{11}p_{1\mid 1} + K_1p_{2\mid 1}) \le \text{ACE} \le (E_{21}p_{2\mid 1} + K_1p_{1\mid 1}) - (E_{12}p_{1\mid 2} + K_0p_{2\mid 2}). \tag{6.2}$$

These inequalities correspond to taking either *a* or *b* alone in place of max{*a*, *b*} and min{*a*, *b*} in inequality (4.1). Therefore, the MIV and RMIV assumptions yield bounds on ACE that are the same width as or wider than the bounds under the IV assumption.
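To make inequalities (6.1) and (6.2) concrete, the sketch below evaluates them for a given set of observed quantities. The function names and the input values are hypothetical placeholders chosen for illustration; they are not data from the MRFIT or any other trial.

```python
def miv_bounds(E, p, K0, K1):
    """Inequality (6.1): bounds on ACE under the MIV.

    E[(x, r)] = E(Y | X = x, R = r), p[(x, r)] = Pr(X = x | R = r).
    """
    lower = (E[2, 1] * p[2, 1] + K0 * p[1, 1]) - (E[1, 2] * p[1, 2] + K1 * p[2, 2])
    upper = (E[2, 2] * p[2, 2] + K1 * p[1, 2]) - (E[1, 1] * p[1, 1] + K0 * p[2, 1])
    return lower, upper


def rmiv_bounds(E, p, K0, K1):
    """Inequality (6.2): bounds on ACE under the RMIV."""
    lower = (E[2, 2] * p[2, 2] + K0 * p[1, 2]) - (E[1, 1] * p[1, 1] + K1 * p[2, 1])
    upper = (E[2, 1] * p[2, 1] + K1 * p[1, 1]) - (E[1, 2] * p[1, 2] + K0 * p[2, 2])
    return lower, upper


# Hypothetical inputs for a binary outcome (K0 = 0, K1 = 1).
E = {(1, 1): 0.30, (2, 1): 0.20, (1, 2): 0.35, (2, 2): 0.15}
p = {(1, 1): 0.90, (2, 1): 0.10, (1, 2): 0.20, (2, 2): 0.80}

miv = miv_bounds(E, p, K0=0.0, K1=1.0)
rmiv = rmiv_bounds(E, p, K0=0.0, K1=1.0)
print(f"MIV:  {miv[0]:.2f} <= ACE <= {miv[1]:.2f}")
print(f"RMIV: {rmiv[0]:.2f} <= ACE <= {rmiv[1]:.2f}")
```

With these placeholder inputs, each assumption leaves one side of the interval close to the trivial limit *K*0 − *K*1 or *K*1 − *K*0, which illustrates why additional assumptions are needed to narrow the bounds.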

Even under the MIV (or RMIV) assumption rather than the IV assumption, we can derive narrower bounds on ACE by applying the assumptions in Section 4.2 (Chiba, 2010c). Each combination of the MIV or RMIV with the MTR or RMTR yields the improved lower or upper bound on ACE given in Table 3. Likewise, each combination of the MIV or RMIV with the MTS or RMTS yields the improved lower or upper bound on ACE given in Table 4.


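| Assumptions | Improved bound on ACE |
|---|---|
| MIV + MTR | ACE ≥ max{−ITT, 0} |
| RMIV + MTR | ACE ≥ max{ITT, 0} |
| MIV + RMTR | ACE ≤ min{ITT, 0} |
| RMIV + RMTR | ACE ≤ min{−ITT, 0} |

Table 3. Improved bound on ACE under the MIV or RMIV and the MTR or RMTR, where ITT ≡ E(*Y*|*R* = 2) – E(*Y*|*R* = 1).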



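| Assumptions | Improved bound on ACE |
|---|---|
| MIV + MTS | ACE ≤ *E*22 – *E*11 |
| RMIV + MTS | ACE ≤ *E*21 – *E*12 |
| MIV + RMTS | ACE ≥ *E*21 – *E*12 |
| RMIV + RMTS | ACE ≥ *E*22 – *E*11 |

Table 4. Improved bound on ACE under the MIV or RMIV and the MTS or RMTS.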

The eight lower or upper bounds in Tables 3 and 4 yield bounds that are the same as or broader than those under the IV assumption. Note that we can use Assumptions 3.1 and 3.2 instead of the MTR and RMTR (Assumptions 2.1 and 2.2), respectively, and Assumptions 5.1 and 5.2 instead of the MTS and RMTS (Assumptions 4.1 and 4.2), respectively. Combining the above bounds yields further improvement; for example, max{−ITT, 0} ≤ ACE ≤ PP under the MIV, MTR and MTS assumptions.

For illustration, we apply the bounds presented here to the MRFIT (Table 2), in which the IV assumption may not hold, as discussed above. Because the intervention consisted of dietary advice and hypertension medication as well as the therapy itself, which might have evoked a psychological response, the potential incidence of CHD for subjects assigned to the test group might have been reduced, compared with subjects assigned to the control group. This observation suggests that Assumption 6.2 (RMIV: E(*YX*=*<sup>x</sup>*|*R* = 2) ≤ E(*YX*=*<sup>x</sup>*|*R* = 1)) is reasonable. Additionally, as discussed in Section 4.3, the RMTR (or Assumption 3.2) and RMTS are reasonable assumptions in this trial. In conclusion, the RMIV, RMTR and RMTS can be assumed, and then the bounds on ACE become PP ≤ ACE ≤ min{−ITT, 0}, which yields −0.92% ≤ ACE ≤ 0%. In comparison with the IV (plus RMTR and RMTS) in Section 4.2 (−0.92% ≤ ACE ≤ −0.13%), the lower bound is the same but the upper bound is larger.
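The way one-sided bounds from Tables 3 and 4 combine can be sketched as a simple lookup. The function name `combine_bounds` and its interface are our own illustration; `cross` stands for *E*21 – *E*12. In the usage example, PP = −0.92% is the lower bound reported in the text, whereas the ITT value is an assumed placeholder: only its negative sign matters for the result.

```python
def combine_bounds(iv, response, selection, itt, pp, cross):
    """Combine one row of Table 3 with one row of Table 4.

    iv: "MIV" or "RMIV"; response: "MTR", "RMTR" or None;
    selection: "MTS", "RMTS" or None; cross = E21 - E12.
    Returns (lower, upper); None marks a side that neither row improves.
    """
    table3 = {
        ("MIV", "MTR"): ("lower", max(-itt, 0.0)),
        ("RMIV", "MTR"): ("lower", max(itt, 0.0)),
        ("MIV", "RMTR"): ("upper", min(itt, 0.0)),
        ("RMIV", "RMTR"): ("upper", min(-itt, 0.0)),
    }
    table4 = {
        ("MIV", "MTS"): ("upper", pp),
        ("RMIV", "MTS"): ("upper", cross),
        ("MIV", "RMTS"): ("lower", cross),
        ("RMIV", "RMTS"): ("lower", pp),
    }
    lowers, uppers = [], []
    for table, key in ((table3, (iv, response)), (table4, (iv, selection))):
        if key in table:
            side, value = table[key]
            (lowers if side == "lower" else uppers).append(value)
    # When both rows bound the same side, the tighter one applies.
    lower = max(lowers) if lowers else None
    upper = min(uppers) if uppers else None
    return lower, upper


# RMIV + RMTR + RMTS, as assumed for the MRFIT in the text.
# itt is a placeholder with the sign implied by min{-ITT, 0} = 0;
# cross is not used by this combination.
lower, upper = combine_bounds("RMIV", "RMTR", "RMTS", itt=-0.0013, pp=-0.0092, cross=0.0)
print(f"{lower:.2%} <= ACE <= {upper:.2%}")
```

Returning `None` for an unimproved side keeps the sketch honest: in that case the corresponding bound from inequality (6.1) or (6.2) still applies.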

#### **6.2 Noncompliance by receiving no treatment**

The bounds introduced in Section 5 are extended to those under the MIV and RMIV (Assumptions 6.1 and 6.2). Under these assumptions, the bounds on ACE are:

$$K_0 - K_1 \le \text{ACE} \le (E_{22}p_{2\mid 2} + K_1p_{0\mid 2}) - (E_{11}p_{1\mid 1} + K_0p_{0\mid 1}), \tag{6.3}$$

$$(E_{22}p_{2\mid 2} + K_0p_{0\mid 2}) - (E_{11}p_{1\mid 1} + K_1p_{0\mid 1}) \le \text{ACE} \le K_1 - K_0, \tag{6.4}$$

respectively. The upper bound in inequality (6.3) and the lower bound in inequality (6.4) are each equal to the corresponding bound in inequality (5.1). Unfortunately, the lower bound in inequality (6.3) and the upper bound in inequality (6.4) give no information, because they are simply the smallest and largest values that ACE can take when *Y* lies between *K*0 and *K*1.
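As a numerical check, inequalities (6.3) and (6.4) can be evaluated with the CDP counts quoted in Section 5.3. This is a sketch with our own variable names; the proportions of non-compliers are obtained by subtraction. The informative sides reproduce the bounds −32.94% ≤ ACE ≤ 33.31% reported in Section 5.3 under Assumption 1 alone.

```python
K0, K1 = 0.0, 1.0  # limits of the binary outcome Y

# Counts implied by the ratios quoted in Section 5.3.
n2, compliers2, complier_deaths2 = 1065, 708, 106    # clofibrate group (R = 2)
n1, compliers1, complier_deaths1 = 2695, 1813, 274   # placebo group (R = 1)

E22 = complier_deaths2 / compliers2
E11 = complier_deaths1 / compliers1
p2_2, p1_1 = compliers2 / n2, compliers1 / n1
p0_2, p0_1 = 1 - p2_2, 1 - p1_1

# Inequality (6.3), under the MIV.
bounds_63 = (K0 - K1,
             (E22 * p2_2 + K1 * p0_2) - (E11 * p1_1 + K0 * p0_1))
# Inequality (6.4), under the RMIV.
bounds_64 = ((E22 * p2_2 + K0 * p0_2) - (E11 * p1_1 + K1 * p0_1),
             K1 - K0)

print(f"(6.3): {bounds_63[0]:.2%} <= ACE <= {bounds_63[1]:.2%}")
print(f"(6.4): {bounds_64[0]:.2%} <= ACE <= {bounds_64[1]:.2%}")
```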

As discussed in the above sub-section, adding the MTR (or RMTR) and the MTS (or RMTS) can improve the bounds on ACE. Table 5 summarizes the bounds under the MIV or RMIV and the MTR or RMTR, and Table 6 summarizes those under the MIV or RMIV and the MTS or RMTS. The bounds in Tables 5 and 6 include *K*0 or *K*1, the lower and upper limits of the finite range of *Y*. In particular, in Table 6, one of the lower and upper bounds is not improved even when the MTS or RMTS is added. Thus, the bounds may not be greatly improved. However, further combinations of these assumptions can remove *K*0 and *K*1 from one of the lower and upper bounds. Such bounds are summarized in Table 7.



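| Assumptions | Bounds on ACE |
|---|---|
| MIV + MTR | E(*Y*\|*R* = 1) − (*E*22*p*2\|2 + *K*1*p*0\|2) ≤ ACE ≤ (*E*22*p*2\|2 + *K*1*p*0\|2) − E(*Y*\|*R* = 1) |
| RMIV + MTR | E(*Y*\|*R* = 2) − (*E*11*p*1\|1 + *K*1*p*0\|1) ≤ ACE ≤ *K*1 − (*E*02*p*0\|2 + *K*0*p*2\|2) |
| MIV + RMTR | *K*0 − (*E*02*p*0\|2 + *K*1*p*2\|2) ≤ ACE ≤ E(*Y*\|*R* = 2) − (*E*11*p*1\|1 + *K*0*p*0\|1) |
| RMIV + RMTR | (*E*22*p*2\|2 + *K*0*p*0\|2) − E(*Y*\|*R* = 1) ≤ ACE ≤ E(*Y*\|*R* = 1) − (*E*22*p*2\|2 + *K*0*p*0\|2) |

Table 5. Bounds on ACE under the MIV or RMIV and the MTR or RMTR<sup>2</sup>.

<sup>2</sup> If (*s*, *t*) = (2, 1) in the MTR and RMTR (Assumptions 2.1 and 2.2) is used, the lower bound is 0 under the MTR and the upper bound is 0 under the RMTR.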


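| Assumptions | Bounds on ACE |
|---|---|
| MIV + MTS | *K*0 − *K*1 ≤ ACE ≤ *E*22 − (*E*11*p*1\|1 + *K*0*p*0\|1) |
| RMIV + MTS | (*E*22*p*2\|2 + *K*0*p*0\|2) − *E*11 ≤ ACE ≤ *K*1 − *K*0 |
| MIV + RMTS | *K*0 − *K*1 ≤ ACE ≤ (*E*22*p*2\|2 + *K*1*p*0\|2) − *E*11 |
| RMIV + RMTS | *E*22 − (*E*11*p*1\|1 + *K*1*p*0\|1) ≤ ACE ≤ *K*1 − *K*0 |

Table 6. Bounds on ACE under the MIV or RMIV and the MTS or RMTS.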


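| Assumptions | Bounds on ACE |
|---|---|
| MIV + MTR + MTS | E(*Y*\|*R* = 1) − (*E*22*p*2\|2 + *K*1*p*0\|2) ≤ ACE ≤ *E*22 − E(*Y*\|*R* = 1) |
| RMIV + MTR + MTS | E(*Y*\|*R* = 2) − *E*11 ≤ ACE ≤ *K*1 − (*E*02*p*0\|2 + *K*0*p*2\|2) |
| MIV + RMTR + RMTS | *K*0 − (*E*02*p*0\|2 + *K*1*p*2\|2) ≤ ACE ≤ E(*Y*\|*R* = 2) − *E*11 |
| RMIV + RMTR + RMTS | *E*22 − E(*Y*\|*R* = 1) ≤ ACE ≤ E(*Y*\|*R* = 1) − (*E*22*p*2\|2 + *K*0*p*0\|2) |

Table 7. Bounds on ACE under some combinations of assumptions<sup>3</sup>.

<sup>3</sup> If (*s*, *t*) = (2, 1) in the MTR and RMTR (Assumptions 2.1 and 2.2) is additionally used, a candidate of the lower bound is 0 under the MTR and that of the upper bound is 0 under the RMTR.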

For illustration, we apply the bounds presented here to the CDP trial (Table 1). Although the IV (Assumption 1) would hold in this trial because it was double-blinded, we here relax this assumption to the MIV and RMIV (Assumptions 6.1 and 6.2), and derive bounds on ACE under each assumption. As discussed in Section 5.3, we assume the RMTS for the clofibrate group and the RMTR and RMTS for the placebo group. Then, under the MIV, the bounds of E(*YX*=2) and E(*YX*=1) are *K*0 ≤ E(*YX*=2) ≤ *E*22*p*2|2 + *K*1*p*0|2 and *E*11 ≤ E(*YX*=1) ≤ *E*02*p*0|2 + *K*1*p*2|2, respectively, where *K*0 = 0 and *K*1 = 1 because *Y* is binary. These bounds yield bounds on ACE of −74.74% ≤ ACE ≤ 28.36%. Likewise, under the RMIV, the bounds on ACE become −4.43% ≤ ACE ≤ 90.05%, because *E*22 ≤ E(*YX*=2) ≤ *K*1 and *E*22*p*2|2 + *K*0*p*0|2 ≤ E(*YX*=1) ≤ E(*Y*|*R* = 1). Unfortunately, these bounds are very wide, and thus they do not provide enough information about the treatment effect of clofibrate.
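The figures in this paragraph can be reproduced from the CDP counts quoted in Section 5.3. The sketch below uses our own variable names and takes the bounds on E(*YX*=2) and E(*YX*=1) exactly as stated above; the number of deaths among clofibrate non-compliers (194 − 106 = 88 of 1065 − 708 = 357) is obtained by subtraction.

```python
K0, K1 = 0.0, 1.0  # limits of the binary outcome Y

# Clofibrate group (R = 2) and placebo group (R = 1).
n2, compliers2, deaths2, complier_deaths2 = 1065, 708, 194, 106
n1, compliers1, deaths1, complier_deaths1 = 2695, 1813, 523, 274

E22 = complier_deaths2 / compliers2                       # E(Y | X = 2, R = 2)
E11 = complier_deaths1 / compliers1                       # E(Y | X = 1, R = 1)
E02 = (deaths2 - complier_deaths2) / (n2 - compliers2)    # E(Y | X = 0, R = 2)
p2_2 = compliers2 / n2                                    # Pr(X = 2 | R = 2)
p0_2 = 1 - p2_2                                           # Pr(X = 0 | R = 2)
EY_R1 = deaths1 / n1                                      # E(Y | R = 1)


def ace_bounds(y2, y1):
    """Bounds on ACE from (lower, upper) bounds on E(Y_{X=2}) and E(Y_{X=1})."""
    return y2[0] - y1[1], y2[1] - y1[0]


# Under the MIV (bounds as stated in the text).
ace_miv = ace_bounds(y2=(K0, E22 * p2_2 + K1 * p0_2),
                     y1=(E11, E02 * p0_2 + K1 * p2_2))
# Under the RMIV (bounds as stated in the text).
ace_rmiv = ace_bounds(y2=(E22, K1),
                      y1=(E22 * p2_2 + K0 * p0_2, EY_R1))

print(f"MIV:  {ace_miv[0]:.2%} <= ACE <= {ace_miv[1]:.2%}")
print(f"RMIV: {ace_rmiv[0]:.2%} <= ACE <= {ace_rmiv[1]:.2%}")
```

Each interval keeps one informative side from Section 5.3 (28.36% under the MIV, −4.43% under the RMIV), while the other side drifts toward the trivial limit, which is why neither interval is useful on its own.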
