Other Statistical Techniques

#### **Chapter 7**

## Network Meta-Analysis Using R for Diabetes Data

*Nilgün Yildiz*

#### **Abstract**

The objective of a meta-analysis is usually to estimate the overall treatment effect and to make inferences about the difference between the effects of two treatments. Meta-analysis is a quantitative method commonly used to combine the results of multiple studies in the medical and social sciences. There are three common types of meta-analysis: pairwise, multivariate, and network meta-analysis. In general, network meta-analysis (NMA) offers the advantage of enabling the combined assessment of more than two treatments. Statistical approaches to NMA are broadly classified into frequentist and Bayesian frameworks. Because part of an NMA rests on indirect, multiple comparisons, and because reports of network meta-analysis are becoming more common, it is essential to introduce the approach to readers and to provide guidance on how to interpret the results. In this chapter, the terms used in NMA are defined, relevant statistical concepts are summarized, and the NMA analytic process based on the frequentist and Bayesian frameworks is illustrated using R and an example network of diabetes treatments. The aim of the chapter is to present the basic concepts and analyses of network meta-analysis using diabetes data and to compare the treatments involved.

**Keywords:** Network meta-analysis, fixed effect model, random-effects model, forest plot, network graph, direct evidence plot

#### **1. Introduction**

Meta-analysis is used to synthesize the results of more than one study, and the overall effect size is considered valid only when certain assumptions are satisfied [1]. An increasing number of alternative medical treatments has given rise to the need for comparative effectiveness research [2, 3]. Because randomized controlled trials that compare all of the available treatment options are generally infeasible, other methodological approaches are needed. A meta-analysis integrated into a systematic review is generally regarded as a useful statistical tool, since it combines data from many different studies into an overall estimate of the treatment effect. Standard meta-analysis has an important limitation, however: only two interventions can be compared at a time. When several treatment options are available, a series of individual meta-analyses provides only partial information, because each one answers a question about a single pair of treatments. Each meta-analysis is just one constituent of the whole picture, which makes optimal clinical decisions difficult.

There is an increasing need for a method that summarizes evidence across many interventions [4]. Network meta-analysis (also called multiple-treatments meta-analysis or mixed-treatment comparison) was developed to assess the relative effectiveness of a number of interventions and to synthesize evidence from a set of randomized trials [5–7]. The method builds on the analysis of direct evidence (from studies that directly randomize the treatments of interest) and indirect evidence (from studies that compare the treatments of interest with a common comparator) [8]. Network meta-analysis is becoming increasingly popular, and its benefits have been reported in applications and methodological articles [2, 9, 10]. **Figure 1** shows the number of network meta-analysis (NMA) studies that have been published.

Although network meta-analysis shares many underlying assumptions with pairwise meta-analysis, it is less widely accepted than pairwise meta-analysis and is criticized more often [11].

The assumptions that NMA requires about similarity, transitivity, and consistency [12–17] are methodologically, logically, and statistically stricter [18, 19], and it should be examined whether each of them is satisfied [15, 20, 21].

For NMA, there are methods for calculating the contribution of the direct (and indirect) evidence of each comparison to its own NMA estimate, but how to define the contribution of each study to another treatment effect estimate is more ambiguous. Several proposals have been made in the literature, each based on a different approach, but many of them have limitations and their results often contradict one another [20, 22–24]. The proportions of direct and indirect evidence have been investigated before. One of these approaches is the "back-calculation" method [21] introduced by Dias; others have been proposed within a Bayesian framework [25]. There is also one proposed within a frequentist context [13]. In NMA based on the inverse-variance method, the NMA estimates are linear combinations of the treatment effect estimates from the primary studies, with coefficients given by the rows of the hat matrix. The direct evidence proportion of a study or a comparison is easily obtained from the diagonal elements of the respective hat matrix [13]. Dias and others proposed "node splitting" as an alternative. Node splitting estimates the indirect evidence for a comparison by modeling out all studies that provide direct information for this comparison [25]. This method was later extended [26] and called "side splitting" by others [9]. The term "side" has been interpreted differently in the literature: White [9] interpreted it as an edge in the network graph, while others read SIDE as an abbreviation of "Separating Indirect and Direct Evidence" [27, 28]. Noma and others proposed another way of quantifying the indirect evidence, which factorizes the total likelihood into separate component likelihoods [14]. Yet none of these authors attempted to define or estimate the contribution of each study to a given comparison in the network.

**Figure 1.** *Number of network meta-analysis publications (search PubMed until January 2020).*

*Network Meta-Analysis Using R for Diabetes Data DOI: http://dx.doi.org/10.5772/intechopen.101788*

There are six basic steps that every NMA should follow, regardless of the analytic model chosen:

1. Understand network geometry,
2. Understand key concepts and assumptions,
3. Conduct analysis and present results,
4. Examine model assumptions through local and global tests,
5. Create a hierarchy of competing interventions (ranking),
6. Conduct heterogeneity and sensitivity analyses.

The network plot is fundamental to an NMA because it helps visualize the available studies and the flow of evidence across the multiple comparisons. In such a plot, each treatment or comparator identified in the review is represented by a node, and direct evidence comparing two interventions (i.e., studies that directly compared them) is represented by an edge connecting the respective nodes. The network plot of our example is presented in **Figure 2**.

**Figure 2.** *A graph of the network generated by using the netgraph function for the diabetes data.*

**Figure 2** shows a network of treatments for type 2 diabetes. The lines between the treatment nodes show which comparisons have been made in randomized trials. The absence of a line between two nodes means that there are no studies (that is, no direct evidence) comparing the two drugs. A network meta-analysis analyzes the data from all of these randomized trials simultaneously. It makes it possible to estimate the relative effectiveness of two treatments even if no study compares them. For example, no study has compared rosi and acar, but an indirect comparison can be made between them through a common comparator (placebo). Denoting rosi, acar, and placebo as treatments A, B, and C, respectively, the indirect comparison (AB) is obtained by subtracting the meta-analytic estimate of all studies of acar versus placebo (BC) from the estimate of all studies of rosi versus placebo (AC): $\text{AB}_{\text{indirect}} = \text{AC}_{\text{direct}} - \text{BC}_{\text{direct}}$. If direct evidence is also available (such as metf vs. acar in **Figure 2**), the network meta-analysis combines the direct and indirect estimates, and the mixed effect size is calculated as the weighted average of the direct evidence (studies comparing metf and acar directly) and the indirect evidence (for example, studies comparing metf and acar via placebo). The network formed by studies of metf versus acar, metf versus placebo, and acar versus placebo is often called a loop of evidence. Indirect estimates provide information on comparisons for which there are no trials. They can also improve the precision of the direct estimate by narrowing the CIs relative to the direct evidence alone [9].
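The indirect comparison and the weighted average of direct and indirect evidence described above can be sketched numerically in base R. This is a minimal illustration under stated assumptions: all numbers are hypothetical (they are not estimates from the diabetes data), the variable names are invented for this sketch, and the two pooled estimates are assumed to come from independent sets of trials, so that their variances add.

```r
# Hypothetical pooled mean differences versus placebo (illustrative values only)
d_AC <- -1.20; se_AC <- 0.15   # treatment A vs. common comparator C
d_BC <- -0.80; se_BC <- 0.20   # treatment B vs. common comparator C

# Indirect estimate of A vs. B: difference of the two direct estimates
d_AB_ind  <- d_AC - d_BC
se_AB_ind <- sqrt(se_AC^2 + se_BC^2)   # variances of independent estimates add
ci_AB_ind <- d_AB_ind + c(-1, 1) * qnorm(0.975) * se_AB_ind

# Hypothetical direct estimate of A vs. B, for a comparison that also has trials
d_AB_dir <- -0.30; se_AB_dir <- 0.25

# Mixed estimate: inverse-variance weighted average of direct and indirect evidence
w         <- 1 / c(se_AB_dir, se_AB_ind)^2
d_AB_mix  <- sum(w * c(d_AB_dir, d_AB_ind)) / sum(w)
se_AB_mix <- sqrt(1 / sum(w))

print(round(c(indirect = d_AB_ind, se_indirect = se_AB_ind,
              mixed = d_AB_mix, se_mixed = se_AB_mix), 3))
```

With these inputs the indirect estimate is -0.40 with a standard error of 0.25, and the mixed estimate has a smaller standard error than either the direct or the indirect estimate alone, which is the gain in precision mentioned in the text.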

A network meta-analysis uses all of the direct and indirect evidence. Empirical studies have concluded that it can produce more precise estimates of the intervention effects than a single direct or indirect estimate [2, 29]. Moreover, network meta-analysis can yield estimates for comparisons between pairs of interventions that have never been evaluated within individual randomized trials. Comparing all interventions of interest simultaneously in the same analysis also makes it possible to estimate their relative ranking for a given outcome. The purpose of this chapter is to show how a network meta-analysis can be carried out with R, and to show researchers interested in the subject that the analysis can be done flexibly and easily with R.

This chapter is organized as follows. In Section 2, we present the key concepts and the basic methodology for NMA, as identified in our literature search. In Section 3, the diabetes treatment data are used as an example. The last section presents conclusions about our research and the results found by network meta-analysis of the diabetes data using R.

#### **2. Conceptual issues and underlying assumptions in network meta-analysis**

There may be several alternatives for the treatment of the same health condition. What makes NMA special is that, by synthesizing direct and indirect estimates of their relative effects, it allows the best treatment to be selected. Head-to-head studies can be conducted to compare two treatments A and B directly (AB studies). An indirect estimate can also be obtained from studies in which these two treatments are compared with a common comparator treatment C, namely AC and BC studies [9]. If both direct and indirect estimates are available, they can be combined to estimate a mixed-treatment effect, as shown in the left panel of **Figure 3**. In practice, most health conditions have numerous interventions that have been compared in various randomized trials and that form a network of evidence. For a comparison of treatments within such a network, there may be a direct estimate and many different indirect estimates obtained through different comparators, as illustrated in the right panel of **Figure 3**.


**Figure 3.** *Each circle represents an intervention, and lines represent direct comparisons.*

Using NMA, all these different pieces of information can be combined to produce an internally consistent overall estimate of the relative effects of all treatments. Researchers still dispute how valid it is to use indirect treatment comparisons (indirect evidence) when making decisions. There are strong arguments against using such evidence, especially when direct treatment comparisons (direct evidence) are available [11, 30–32]. One focus of criticism is the nature of the evidence provided by NMA. Although patients in a randomized clinical trial (RCT) are randomly assigned to each of the treatments compared, the treatments themselves are not randomized across the included trials.

Thus, indirect comparisons are non-randomized comparisons, and they provide observational rather than randomized evidence. Consequently, indirect treatment comparisons may be more susceptible to biased treatment effect estimates, due, for example, to confounding (e.g., when randomized AB and AC studies are systematically different from BC studies [2]) and selection bias (e.g., when the selection of the comparator in a study is based on the relative treatment effect [33]).

#### **2.1 Indirect comparisons**

Consider trial 1, a two-arm trial of the comparison "B–A", and trial 2, a two-arm trial of the comparison "C–B". If the estimated effect sizes in these trials are $\hat{\delta}_1^{AB}$ in trial 1 and $\hat{\delta}_2^{BC}$ in trial 2, then an indirect comparison of "C–A" may be obtained as $\hat{\delta}_{\text{indirect}}^{AC} = \hat{\delta}_1^{AB} + \hat{\delta}_2^{BC}$.

An indirect comparison maintains the benefits of randomization within each trial and allows for differences across the trials (e.g., in baseline risk), provided that these differences affect only the prognosis of the participants and not their response to treatment (in whichever metric is chosen as the measure of effect size). However, the indirect comparison assumes that treatment B is the same in both trials, so that its effects cancel when "B–A" and "C–B" are added together. Without further information, it is not possible to test whether an indirect comparison truly reflects the difference between A and C. A third trial of "C–A" (yielding result $\hat{\delta}_3^{AC}$) would allow the indirect comparison to be compared with a direct comparison. The network of these three trials is consistent only if the underlying treatment effects are related as follows:

$$
\delta_3^{AC} = \delta_1^{AB} + \delta_2^{BC} \tag{1}
$$

Here $\delta_1^{AB}$, $\delta_2^{BC}$, and $\delta_3^{AC}$ represent the true effects underlying the three studies. In practice, Eq. (1) is unlikely to hold for a particular set of three trials such as the ones described above. The reason can be discussed either in terms of heterogeneity (because, within each treatment comparison, an individual study may not fully represent all studies of that comparison) or in terms of inconsistency (because, across treatment comparisons, there are important differences in the types of studies contributing to the comparisons). These two concepts are described in more detail in the following sections [34].

#### **2.2 Heterogeneity**

Heterogeneity in meta-analysis has been widely investigated. It refers to the situation where multiple studies of the same research question have different underlying values of the effect measure being estimated. In the network meta-analysis setting, heterogeneity is understood by keeping the treatment comparison constant while changing the study index. In particular, heterogeneity exists for comparison "B–A" if $\delta_i^{AB} \neq \delta_j^{AB}$ for some pair of studies i and j. It has been claimed that heterogeneity is an inevitable part of a meta-analysis [35], which indicates that two trials of the same pairwise comparison are unlikely to have equal underlying treatment effects. Hence, in the context of Eq. (1), the equality is unlikely to hold, since the particular instance of "C–A" examined in trial 3 will probably not represent all instances of "C–A" comparisons (and the same holds for trials 1 and 2 and their respective treatment comparisons). A random-effects model is a common way of allowing for heterogeneity. It assumes that the underlying effects in multiple studies of the same comparison arise from a common distribution, usually a normal distribution; namely,

$$
\delta_i^{JK} \sim \mathcal{N}\left(\delta^{JK}, \tau_{JK}^2\right) \tag{2}
$$

for pairwise comparison JK (taking values AB, AC, or BC in the running example) [34].
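To make the random-effects assumption in Eq. (2) concrete, the sketch below pools a handful of hypothetical study estimates for a single comparison in base R. It uses the DerSimonian-Laird moment estimator of the between-study variance $\tau^2$, which is one common choice and is not necessarily the estimator used elsewhere in this chapter. The effect sizes, standard errors, and variable names are assumptions made purely for illustration.

```r
# Hypothetical mean differences and standard errors from five studies of the
# same pairwise comparison (illustrative values, not taken from the chapter)
te <- c(-1.10, -0.60, -1.40, -0.90, -0.30)
se <- c( 0.20,  0.25,  0.30,  0.15,  0.35)

# Fixed-effect (common-effect) pooling with inverse-variance weights
w_fixed  <- 1 / se^2
te_fixed <- sum(w_fixed * te) / sum(w_fixed)
se_fixed <- sqrt(1 / sum(w_fixed))

# Cochran's Q and the DerSimonian-Laird estimate of tau^2 (truncated at zero)
Q    <- sum(w_fixed * (te - te_fixed)^2)
df   <- length(te) - 1
C    <- sum(w_fixed) - sum(w_fixed^2) / sum(w_fixed)
tau2 <- max(0, (Q - df) / C)

# Random-effects pooling: each study variance is inflated by tau^2
w_random  <- 1 / (se^2 + tau2)
te_random <- sum(w_random * te) / sum(w_random)
se_random <- sqrt(1 / sum(w_random))

print(round(c(fixed = te_fixed, se_fixed = se_fixed, tau2 = tau2,
              random = te_random, se_random = se_random), 3))
```

Whenever the estimated $\tau^2$ is positive, the random-effects standard error is larger than the fixed-effect one, reflecting the additional between-study variation.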

#### **2.3 Consistency**

Consistency is the statistical manifestation of transitivity [12]. Checking the network for consistency is an additional way of making implicit inferences about the plausibility of the transitivity assumption. Consistency means statistical agreement between the observed direct and (possibly many) indirect sources of evidence. Consider a simple network that contains only treatments A, B, and C.

A consistency equation is generally used to express the desired relationship between the direct and indirect sources of evidence for a single comparison:

$$
\delta^{AC} = \delta^{AB} + \delta^{BC} \tag{3}
$$

where $\delta^{JK}$ represents the mean effect size across all studies of comparison JK. (Under a fixed-effect meta-analysis model, which assumes the absence of heterogeneity, $\delta^{JK}$ represents a fixed (common) treatment effect for comparison JK.) Evidence that satisfies the consistency equation is said to show consistency. This is depicted in **Figure 4(a)** as a triangle of three (non-touching) solid edges in a network with only two-arm trials. Each edge represents one or more two-arm trials that compare the two treatments identified at either end of the edge. All three edges are drawn in the same line style (a solid line) to describe the situation where there is no contradiction (inconsistency) between them, that is, where Eq. (3) is valid [34].

**Figure 4.** *Graphical representation of consistency, loop inconsistency and design inconsistency.*
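Whether a single loop of two-arm evidence agrees with the consistency equation can be examined by comparing the direct estimate of one comparison with the indirect estimate built from the other two. The base R sketch below does this for hypothetical pooled estimates; the numbers and variable names are assumptions for illustration only, and the simple z-test treats the three sources of evidence as independent, which would not hold if a multi-arm trial contributed to more than one edge.

```r
# Hypothetical pooled estimates for the three edges of a loop (illustrative only)
d_AB     <- -0.50; se_AB     <- 0.12   # direct evidence for "B-A"
d_BC     <- -0.40; se_BC     <- 0.16   # direct evidence for "C-B"
d_AC_dir <- -0.70; se_AC_dir <- 0.15   # direct evidence for "C-A"

# Indirect estimate of "C-A" implied by the consistency equation
d_AC_ind  <- d_AB + d_BC
se_AC_ind <- sqrt(se_AB^2 + se_BC^2)

# Inconsistency: disagreement between direct and indirect evidence
incons    <- d_AC_dir - d_AC_ind
se_incons <- sqrt(se_AC_dir^2 + se_AC_ind^2)
z         <- incons / se_incons
p_value   <- 2 * pnorm(-abs(z))

print(round(c(direct = d_AC_dir, indirect = d_AC_ind,
              difference = incons, z = z, p = p_value), 3))
```

In this made-up example the difference between direct and indirect evidence is 0.20 with z = 0.8, so the data would not contradict the consistency equation. A non-significant test does not demonstrate consistency, however, since such tests usually have low power.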

#### **2.4 Loop inconsistency**

When studies of different treatment comparisons differ substantially in ways that affect their effect sizes, the consistency Eq. (3) might not be valid; the effect sizes then do not "add up" around the loop in the figure. This is called loop inconsistency and is shown by drawing the edges in different line styles (**Figure 4(b)**). Loop inconsistency can arise only when different comparisons are made in at least three separate sets of studies (e.g., studies of "B–A", "C–A" and "C–B"). Equivalently, it can occur only when both direct and indirect estimates of an effect size are available (e.g., when "C–B" is measured both directly and indirectly through "A") [34].

#### **2.5 Multi-arm trials**

A network meta-analysis generally includes some studies with more than two treatment arms. In fact, about a quarter of randomized trials involve more than two arms [36], so it is important to select appropriate methods for dealing with them.

When an evidence network contains multi-arm trials, the definition of loop inconsistency becomes more complicated. Loop inconsistency cannot occur within a multi-arm trial. As a result, consistency can arise in a network structurally (because all studies include all treatments), through observation (when the assumptions about equality of direct and various indirect comparisons hold across studies), or through a combination of the two.

In addition, loop inconsistency can no longer be properly defined using Eq. (3), since the average effect sizes, $\delta^{JK}$, refer to pairwise comparisons drawn from a combination of possibly inconsistent loops (e.g., from two-arm trials) and inherently consistent loops (i.e., from multi-arm trials). In our drawings, multi-arm trials are shown using a closed (merged) polygon (**Figure 4(d)**) [34].

#### **2.6 Transitivity**

The purpose of an NMA is to improve decision-making when choosing between alternative treatments for a specific health condition and target population. Hence, the quantities to be estimated in an NMA are the mean relative treatment effects among the competing treatments, as they are expected to apply in the target population. If the studies in the dataset yield unbiased estimates and constitute a representative sample of the population addressed, then the estimates produced by an NMA model for these parameters will be unbiased and consistent. NMA adopts the same set of assumptions as pairwise meta-analysis [37], plus an additional assumption that can be difficult to assess [38], called transitivity [39] (also called similarity [40, 41] or exchangeability [42]). Transitivity means that information about the comparison between treatments A and B can be obtained through another treatment C, using the comparisons A to C and B to C. This assumption cannot be tested statistically, but its validity can be evaluated conceptually and epidemiologically [21].

The transitivity assumption means that direct evidence from AC and BC studies can be combined to learn (indirectly) about the AB comparison. This is questionable, however, if there are important differences in the distribution of effect modifiers (variables or characteristics that alter the observed relative effects, e.g., the mean age of participants or the treatment dose) across the AC and BC trials that inform the indirect comparison [24, 39]. An effect modifier may vary across studies of the same comparison (e.g., the mean age of participants may differ across AC trials), but if its distribution is similar across comparisons (AC and BC), the assumption of transitivity may still hold [21]. The plausibility of the transitivity assumption can therefore be assessed by reviewing the collection of studies for important differences in the distribution of effect modifiers. If the studies are similar, the assumption of transitivity may be realistic, provided that there are no unknown modifiers of the relative treatment effect [43]. Clearly, such an assessment may not be possible when the effect modifiers are not reported or when the number of studies per treatment comparison is low [12]. If important differences are identified and sufficient data are available, the transitivity of the network can be improved by using a network meta-regression. Transitivity also requires, for example, that the common comparator treatment C be similar in the AC and BC studies in terms of dose, mode of administration, duration, and so on.
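A simple way to start such a review is to tabulate a potential effect modifier by comparison. The base R sketch below summarizes the mean age of participants in hypothetical AC and BC trials; the data frame, its column names, and all values are invented for illustration, and a table like this supports a conceptual judgment rather than a statistical test of transitivity.

```r
# Hypothetical study-level data: one row per trial (illustrative values only)
studies <- data.frame(
  study      = paste0("S", 1:6),
  comparison = rep(c("AC", "BC"), each = 3),
  mean_age   = c(58, 61, 60, 59, 62, 60),
  stringsAsFactors = FALSE
)

# Summarize the distribution of the effect modifier within each comparison
modifier_summary <- do.call(rbind, lapply(
  split(studies$mean_age, studies$comparison),
  function(x) c(n = length(x), mean = mean(x), min = min(x), max = max(x))
))

print(round(modifier_summary, 1))
```

Similar summaries across the AC and BC rows would be compatible with transitivity for this modifier; clearly different distributions would call for closer inspection or a network meta-regression.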

In an NMA of studies comparing fluoride treatments for the prevention of dental caries, the definition of placebo differed between fluoride toothpaste studies and fluoride rinse studies [44], which casts doubt on the plausibility of the transitivity assumption and thus challenges the reliability of the NMA results. In another example, Julious and Wang [45] showed how the use of placebo as an intermediate comparator can distort the results of indirect comparisons because the placebo response of the population changes over the years; for instance, the indirect estimate for A versus B might be biased when the studies that compare treatment A with placebo are older than the studies that compare B with placebo. Other ways of formulating the transitivity assumption are to suppose that the true relative effect of A versus B is the same (in the fixed-effects model) or may vary across studies (in the random-effects model) regardless of the treatments compared in each study [42, 46]; that the "missing" treatments in each trial are missing at random [5]; or, equivalently, that the choice of treatment comparisons in trials is not related, directly or indirectly, to the relative efficacy of the interventions. Finally, an alternative way of stating this assumption is that the included patients could have been randomly allocated to any of the treatments in the network [21].

Statistical consistency, however, does not mean that the transitivity assumption necessarily holds. The absence of statistical inconsistency is not evidence that the transitivity assumption is valid; as discussed above, transitivity is essentially an assumption that cannot be tested. Therefore, an NMA should be preceded by a conceptual and theoretical evaluation of the transitivity assumption, in addition to statistical tests for inconsistency [12], and the studies included in an NMA should always be reviewed for important differences in patients, interventions, outcomes, study design, methodological characteristics, and reporting biases [2, 9, 14, 32, 43].

#### **2.7 Design inconsistency**

The "design" of a study here means the set of treatments compared within the study, which differs from traditional interpretations of the term. Design inconsistency then refers to differences in effect sizes among studies that include different sets of treatments. Allowing for this variation implicitly assumes that different designs (i.e., different sets of included treatments) can act as a proxy for one or more important effect modifiers [47]. Design inconsistency is depicted in **Figure 4(e)**, in which different line styles represent possible contradictions between study designs. The AC effect size in the three-arm trial, depicted with a solid line, differs from the AC effect size in the two-arm trial, depicted with a dashed line. Design inconsistency can be seen as a special case of heterogeneity, since the study design corresponds to a study-level covariate that may change effect sizes, as in a standard meta-regression analysis. Note that in a network of only two-arm studies, the concept of design inconsistency adds no insight beyond that provided by loop inconsistency. In the presence of a multi-arm trial, loop inconsistency among the two-arm trials implies design inconsistency (**Figure 4(f)**). The reason is that the multi-arm trial must be internally consistent, so its effect sizes must differ from those of at least one of the two-arm trials, which is our definition of design inconsistency. What design inconsistency implies for loop inconsistency is less clear. Design inconsistency with one three-arm trial and two two-arm trials is shown in **Figure 4(g)**. A loop could be created by extracting the pairwise BC comparison from the three-arm trial and comparing it with the two-arm trials. But this overlooks the consistent loop within the three-arm trial, so it is unclear whether this network should be described as exhibiting loop inconsistency. Likewise, in **Figure 4(h)** the two-arm trials are consistent among themselves, but their effect sizes differ from those of the multi-arm trial. Does this show design inconsistency without loop inconsistency? [34].

#### **2.8 Similarity**

To compare the clinical trials used for the analysis, it must be assumed that the methodology used in the studies is similar [12, 44]. Similarity is assessed qualitatively for each of the selected articles from a methodological point of view; it is not a hypothesis that can be tested statistically. Similarity is examined using the population, intervention, comparison, and outcome (PICO) technique [17], which is based on four items: the clinical characteristics of the study subjects, the treatment interventions, the comparison treatments, and the outcome measures. When the similarity assumption is not satisfied, the other two assumptions are also negatively affected [24], and heterogeneity must be checked as well [18, 21].

#### *2.8.1 Network diagrams*

A network diagram is one way of graphically depicting the structure of a network of interventions [12]. Such a graph consists of nodes that represent the interventions in the network and lines that show the available direct comparisons between pairs of interventions. An example of a network diagram with four interventions is given in **Figure 5**. In this example, distinct lines forming a closed triangular loop have been added to show the presence of a three-arm study. Note that presenting multi-arm studies in this way may produce complex and unreadable network diagrams; in that case, a tabular format may be preferred for depicting multi-arm studies (**Figure 5**).

**Figure 5.** *Example of network diagram with four competing interventions and information on the presence of multi-arm randomized trials.*

#### **3. Illustrative example**

The aim of the network meta-analysis in diabetes was to estimate the relative effects on HbA1c change of adding different oral glucose-lowering agents to a baseline sulfonylurea therapy in patients with type 2 diabetes. A systematic literature search was carried out for all relevant articles published from January 1993 to June 2009 in Medline and Embase. The search strategy was restricted to "randomized controlled trials", "sulfonylurea or sulphonylurea" and "humans". This initial search was confirmed by combining each of the Medical Subject Headings key words "chlorpropamide", "glibenclamide", "glyburide", "gliclazide", "glimepiride", "glipizide", "gliquidone", "tolbutamide" on the one hand and 'RCT' on the other hand. No language restriction was applied. R was used to analyze the data (**Figure 6**).

An original dataset provided by Senn [48] is used in our first network meta-analysis. The dataset contains effect size data from randomized controlled trials that compare different medications for diabetes. The effect size for each comparison is the mean difference (MD) in diabetic patients' HbA1c value at post-test. This value reflects the concentration of glucose in the blood, which diabetic medication aims to lower. The data contain 28 rows, which represent the treatment comparisons, and seven columns. The first column, TE, contains the effect size of each comparison, and seTE contains the respective standard error. In other applications, effect size data that have already been calculated for each comparison might not be available.


**Figure 6.** *Diabetes example and view the data.*

The two treatments that are compared are given in treat1.long, treat2.long, treat1, and treat2. The variables treat1 and treat2 simply contain shortened versions of the original treatment names, so they are redundant.
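The data view can be reproduced along the following lines. This is a sketch that assumes the dataset is available as Senn2013 in the installed version of the netmeta package; the dataset name and its availability may differ between package versions, so the calls below should be checked against the installed documentation.

```r
# Assumption: netmeta is installed and ships the Senn2013 dataset
library(netmeta)
data(Senn2013)

# Number of rows (treatment comparisons) and columns
dim(Senn2013)

# Column names: effect size, standard error, treatment labels, study label
names(Senn2013)

# First few comparisons
head(Senn2013)

# Shortened treatment names that appear in either arm
treatments <- sort(unique(c(as.character(Senn2013$treat1),
                            as.character(Senn2013$treat2))))
treatments
```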

We can now fit our initial network meta-analysis model using the netmeta function and then look at the results of this first model, for now assuming a fixed-effects model.
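A minimal sketch of the model fit is given below. It assumes the Senn2013 data and the column names described above, uses the mean difference as the summary measure, and takes placebo ("plac") as the reference treatment. The object name m.netmeta is an assumption of this sketch. The argument that switches between fixed-effect and random-effects results has been renamed across netmeta versions, so it is deliberately left at its default here; the printed output should be checked to see which model results are shown.

```r
# Assumption: netmeta is installed and ships the Senn2013 dataset
library(netmeta)
data(Senn2013)

# Fit the frequentist network meta-analysis on the contrast-level data
m.netmeta <- netmeta(TE = TE, seTE = seTE,
                     treat1 = treat1, treat2 = treat2,
                     studlab = studlab,
                     data = Senn2013,
                     sm = "MD",                  # effect sizes are mean differences
                     reference.group = "plac")   # compare every treatment with placebo

# Print the fitted model; read off the fixed-effect (common-effect) results
print(m.netmeta)
```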


As we have created our network meta-analysis model, we can go ahead and draw our network graph (**Figure 2**).

Several types of information are conveyed by this network graph.


There is also one multi-arm trial in our network, represented by the triangle shown in blue.
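A network graph of this kind can be drawn with netmeta's `netgraph` function. The following is a sketch under the same assumptions as before (the `Senn2013` data from netmeta, a recent netmeta release, and the illustrative object name `m.fixed`); the styling arguments are one possible choice, not necessarily the one used for the figure.

```r
# Sketch: draw the network graph for the fitted model.
library(netmeta)
data(Senn2013)

m.fixed <- netmeta(TE, seTE, treat1, treat2, studlab,
                   data = Senn2013, sm = "MD",
                   common = TRUE, random = FALSE,
                   reference.group = "plac")

netgraph(m.fixed,
         plastic = FALSE,                  # flat edges instead of a 3D look
         thickness = "number.of.studies",  # edge width reflects the number of studies
         multiarm = TRUE,                  # shade the area spanned by multi-arm studies
         col.multiarm = "blue")
```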

As a next step, we can turn to the direct and indirect evidence in our network by looking at the proportion of direct and indirect evidence contributing to each comparison. A function named direct.evidence.plot has been prepared for this purpose.
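The quantities behind such a plot can also be obtained numerically. The sketch below assumes netmeta's `netmeasures` function, which returns the direct evidence proportion, mean path length and minimal parallelism per comparison, and it calls `direct.evidence.plot` only if the dmetar package (assumed here to be the package that provides it) is installed. Object names are illustrative.

```r
# Sketch: direct evidence proportion, mean path length and minimal parallelism.
library(netmeta)
data(Senn2013)

m.fixed <- netmeta(TE, seTE, treat1, treat2, studlab,
                   data = Senn2013, sm = "MD",
                   common = TRUE, random = FALSE,
                   reference.group = "plac")

nm <- netmeasures(m.fixed)

round(nm$proportion, 3)  # share of direct evidence in each network estimate
round(nm$meanpath, 2)    # mean path length of each comparison
round(nm$minpar, 2)      # minimal parallelism of each comparison

# Optional plot, only attempted when dmetar is available.
if (requireNamespace("dmetar", quietly = TRUE)) {
  dmetar::direct.evidence.plot(m.fixed)
}
```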

As can be seen in **Figure 7**, many estimates in our network model had to be inferred from indirect evidence only. The plot also provides two additional metrics: the Minimal Parallelism and the Mean Path Length of each comparison. König et al. [49] note that lower values of Minimal Parallelism and a Mean Path Length > 2 mean that the results for a specific comparison should be interpreted with care.

**Figure 7.** *Direct evidence proportion for each network estimate.*

Next, we can look at our network's estimates for all possible treatment combinations. To do this, we can use the result matrices stored in our netmeta results object under the fixed-effects model. A few preprocessing steps make the matrix easier to read. First, the matrix is extracted from the results object and its entries are rounded to three digits.

Because one triangle of the matrix carries only redundant information, the lower triangle can be replaced with empty values.
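These preprocessing steps could be sketched as follows. The element name of the fixed-effect estimate matrix depends on the netmeta version (`TE.common` in recent releases, `TE.fixed` in older ones), so the code tries both; `m.fixed` and `result.matrix` are illustrative names.

```r
# Sketch: extract, round and tidy the matrix of network estimates.
library(netmeta)
data(Senn2013)

m.fixed <- netmeta(TE, seTE, treat1, treat2, studlab,
                   data = Senn2013, sm = "MD",
                   common = TRUE, random = FALSE,
                   reference.group = "plac")

te <- m.fixed$TE.common                      # recent netmeta releases
if (is.null(te)) te <- m.fixed$TE.fixed      # name used by older releases

result.matrix <- round(te, 3)                # round to three digits

# Blank out the redundant lower triangle (kept as NA so the matrix stays numeric).
result.matrix[lower.tri(result.matrix, diag = FALSE)] <- NA

print(result.matrix, na.print = ".")
```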


The netleague() function offers a convenient way of exporting all estimated effect sizes. It generates a matrix similar to the one above. In this matrix, however, the upper triangle shows only the pooled effect sizes of the direct comparisons available in our network, like the ones we would obtain if a conventional meta-analysis had been conducted for each comparison. As direct evidence is not available for all comparisons, some fields in the upper triangle are empty. The lower triangle contains the network meta-analysis effect sizes for each comparison. The biggest advantage of this function is that effect size estimates and confidence intervals are shown together in each cell; we only need to tell the function what the brackets for the confidence intervals should look like and how many digits our estimates should have after the decimal point.
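A league table of this kind might be produced as in the sketch below, which assumes the `Senn2013` data from netmeta and the illustrative object names `m.fixed` and `league`. The `bracket` and `digits` arguments control the confidence interval brackets and the number of decimal places.

```r
# Sketch: league table with estimates and confidence intervals in each cell.
library(netmeta)
data(Senn2013)

m.fixed <- netmeta(TE, seTE, treat1, treat2, studlab,
                   data = Senn2013, sm = "MD",
                   common = TRUE, random = FALSE,
                   reference.group = "plac")

league <- netleague(m.fixed,
                    bracket = "(",   # style of the confidence interval brackets
                    digits = 2)      # digits after the decimal point

league   # upper triangle: direct estimates; lower triangle: network estimates
```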

In a network meta-analysis, the most interesting question is: which intervention works best? The netrank() function implemented in netmeta orders the treatments from most to least beneficial. It is built on a frequentist treatment ranking method that uses P-scores. A P-score measures the certainty that one treatment is better than another treatment, averaged over all competing treatments. It has been shown that the P-score is equivalent to the SUCRA score [50]. The function needs our netmeta object as input. In addition, the small.values parameter should be specified; it defines whether smaller effect sizes in a comparison indicate a beneficial ("good") or harmful ("bad") effect. We now look at the output for our example:
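A ranking call along these lines could look as follows. This is a sketch assuming the `Senn2013` data from netmeta; because a lower HbA1c value is the desired outcome, small values are treated as beneficial. Recent netmeta releases also accept "desirable" and "undesirable" for `small.values`; `m.fixed` and `rk` are illustrative names.

```r
# Sketch: frequentist treatment ranking with P-scores.
library(netmeta)
data(Senn2013)

m.fixed <- netmeta(TE, seTE, treat1, treat2, studlab,
                   data = Senn2013, sm = "MD",
                   common = TRUE, random = FALSE,
                   reference.group = "plac")

# Smaller mean differences (larger HbA1c reduction) count as beneficial.
rk <- netrank(m.fixed, small.values = "good")

rk   # treatments ordered by P-score
```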



As can be seen, the Rosiglitazone treatment has the highest P-score, which indicates that this treatment may be particularly helpful. In contrast, the P-score of Placebo is zero, supporting our intuition that placebo is unlikely to be the best treatment decision. It should be noted, however, that a treatment should never be automatically concluded to be the best just because it has the highest score [51]. A good way to visualize the uncertainty in our network is to generate network forest plots with the "weakest" treatment as the comparison. The forest function can be used for this, and the reference group for the forest plot can be specified with the reference.group argument (**Figures 8** and **9**).
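Forest plots with placebo as the reference might be generated as sketched below. The code assumes the `Senn2013` data from netmeta, where placebo is labelled `plac`, and simply refits the model with the random-effects switch for the second plot; `m.fixed` and `m.random` are illustrative names.

```r
# Sketch: network forest plots with placebo as the reference group.
library(netmeta)
data(Senn2013)

m.fixed <- netmeta(TE, seTE, treat1, treat2, studlab,
                   data = Senn2013, sm = "MD",
                   common = TRUE, random = FALSE,
                   reference.group = "plac")

m.random <- netmeta(TE, seTE, treat1, treat2, studlab,
                    data = Senn2013, sm = "MD",
                    common = FALSE, random = TRUE,
                    reference.group = "plac")

forest(m.fixed,  reference.group = "plac")   # fixed-effect model
forest(m.random, reference.group = "plac")   # random-effects model
```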

Now it can be seen that the results are more ambiguous than they seemed before: several high-performing treatments have overlapping confidence intervals. This means we cannot make a firm judgment about which treatment is actually the best; rather, we see that a number of treatments are more effective than placebo.

#### **3.1 Decomposition of heterogeneity statistics**

It is possible to decompose the Q total statistic (of the "whole network") into a Q statistic to assess heterogeneity between studies having the same design ("within designs") and a Q statistic to assess design inconsistency ("between designs"). The subsets of treatments that are compared with each other in a study are used to define designs.

**Figure 8.** *Forest plot for fixed effect model with placebo as reference.*

**Figure 9.**

*Forest plot for random-effects model with placebo as reference.*

For this analysis, the fixed-effect model has been used and it is seen that there is considerable heterogeneity/inconsistency within as well as between designs. The total within-design heterogeneity can be further decomposed into the contribution from each design.
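The decomposition described here is available through netmeta's `decomp.design` function. The sketch below assumes the `Senn2013` data from netmeta and the illustrative object names `m.fixed` and `dd`.

```r
# Sketch: decompose Q into within-design and between-design parts.
library(netmeta)
data(Senn2013)

m.fixed <- netmeta(TE, seTE, treat1, treat2, studlab,
                   data = Senn2013, sm = "MD",
                   common = TRUE, random = FALSE,
                   reference.group = "plac")

dd <- decomp.design(m.fixed)

dd$Q.decomp       # total, within-design and between-design Q statistics
dd$Q.het.design   # within-design heterogeneity, one row per design
```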


As can be seen, the network meta-analysis includes 26 studies, which use 15 different designs. Because only five designs are represented by more than one study, the design-specific Q statistics of the remaining designs are equal to zero and have no degrees of freedom. Except for design metf:rosi (*p* = 0.67), heterogeneity between the contributing studies is higher than would be expected for the other four designs; in the case of metf:plac, substantially so (*p* < 0.0001). In a substantive application, the sources of this heterogeneity could be identified and the analysis updated appropriately.


We now introduce the net heat plot, proposed by Krahn, König, and Binder [49]. This is a graphical presentation that shows two types of information in a single plot:

1. For each network estimate, the contribution of each design to this estimate, and

2. For each network estimate, the extent of inconsistency due to each design.

The net heat plot is very useful for evaluating the inconsistency in our network model and what contributes to it (**Figure 10**).

The function produces a quadratic matrix in which each element in a row can be compared to all elements in the columns. Note that rows and columns do not refer to all treatment comparisons in our network, but to specific designs. Thus, we also have rows and columns for the multi-arm study, whose design compares "Plac", "Metf" and "Acar". Treatment comparisons with only one type of evidence (i.e., only direct or only indirect evidence) are not included in this chart, as we are dealing with inconsistency between direct and indirect evidence. The net heat plot has two important properties:

1. Gray boxes. The gray boxes for each design comparison show how important one treatment comparison is for estimating another treatment comparison. The larger the box, the more important the comparison. This can be analyzed by going through the rows of the plot one after another and checking, for each row, in which columns the gray boxes are largest. The boxes are commonly large where the row comparison and the column comparison intersect, which means that direct evidence was used. For instance, there is a large gray box where the "Plac vs. Rosi" row and the "Plac vs. Rosi" column intersect [52].

2. Colored backgrounds. The colored backgrounds, which range from blue to red, indicate the inconsistency of the comparison in a row that can be attributed to the design in a column. Inconsistent fields are shown in red in the upper-left corner. For instance, the entry in column "Metf vs. Sulf" is shown in red in the row for "Rosi vs. Sulf". This indicates that the evidence that "Metf vs. Sulf" contributes to the "Rosi vs. Sulf" estimate is not consistent with the other evidence.

**Figure 10.** *Net heat plot of the Senn data example based on a fixed-effect model.*

**Figure 11.** *Net heat plot of the Senn data example from a random-effects model.*

Recall that these results are based on the fixed-effects model that we initially used for our network analysis. From what we have seen so far, we can conclude that the fixed-effects model is not justified, because there is too much unexpected heterogeneity. We can check how the net heat plot changes when a random-effects model is assumed by changing the random argument of the netheat function to TRUE. This results in a considerable reduction of inconsistency in our network (**Figure 11**).
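The two net heat plots could be produced as in the following sketch, which assumes the `Senn2013` data from netmeta and the illustrative object name `m.fixed`. The only difference between the two calls is the `random` argument.

```r
# Sketch: net heat plots under the fixed-effect and random-effects models.
library(netmeta)
data(Senn2013)

m.fixed <- netmeta(TE, seTE, treat1, treat2, studlab,
                   data = Senn2013, sm = "MD",
                   common = TRUE, random = FALSE,
                   reference.group = "plac")

netheat(m.fixed)                  # fixed-effect model
netheat(m.fixed, random = TRUE)   # random-effects model
```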

#### *3.1.1 Net splitting*

Net splitting, also known as node splitting, is another method for checking consistency in our network. With this method, our network estimates are split into the contributions of direct and indirect evidence, which allows us to check for inconsistency in specific comparisons in our network. To generate a net split and compare the results, we can use the netsplit function.
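A net split might be generated and plotted as sketched below, again assuming the `Senn2013` data from netmeta; `m.fixed` and `ns` are illustrative names.

```r
# Sketch: split network estimates into direct and indirect evidence.
library(netmeta)
data(Senn2013)

m.fixed <- netmeta(TE, seTE, treat1, treat2, studlab,
                   data = Senn2013, sm = "MD",
                   common = TRUE, random = FALSE,
                   reference.group = "plac")

ns <- netsplit(m.fixed)

ns           # direct, indirect and network estimates with p-values for their difference
forest(ns)   # forest plot of the split estimates
```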



Here, the important information is found in the p-value column. Any value with *p* < 0.05 in this column indicates a significant discrepancy (inconsistency) between the direct and indirect estimates. In the output, we see that there are indeed a few comparisons showing significant discrepancies between direct and indirect evidence when the fixed-effects model is used. Net split results can be visualized with a forest plot showing all comparisons for which both direct and indirect evidence are present (**Figure 12**).

#### **4. Conclusions**

Network meta-analysis is a potentially powerful tool for estimating and comparing treatment effects in a particular area using all the available evidence. This approach has been illustrated with an example from diabetes [48], which shows how to graph the network and explore a range of analyses. In the results of our first model (fixed-effect model), the Q value of DeFronzo1995 is the highest, with *Q* = 30.89. In the network model, the effects of all treatments are displayed in comparison to the placebo condition, which is why no effect is shown for placebo. Heterogeneity/inconsistency in our network model is high, with *I*<sup>2</sup> = 84%. The heterogeneity between treatment designs reflects the actual inconsistency in our network and is highly significant (*p* = 0.0021). Looking at the network graph in **Figure 2**, we see that Rosiglitazone has been compared to Placebo in many trials. The only multi-arm trial in our network is that of Willms 2003. The Rosiglitazone treatment has the highest P-score. Because it can be misleading to conclude that a treatment is best just because it has the highest score, it is necessary to look at network forest plots with the "weakest" treatment as reference.

#### *Computational Statistics and Applications*


#### **Figure 12.**

*Net split plot of the Senn data example from a fixed-effect model.*

Looking at the network forest plot, we see that there are several high-performing treatments with overlapping confidence intervals. As we could not make a definitive decision from this plot, we then looked at the net heat plot.

Two important aspects of network meta-analysis are the extent to which information on a given treatment comparison is obtained from indirect evidence and the extent of heterogeneity. The net heat plot communicates information about both, and the software allows heterogeneity to be decomposed within and between designs. Clinically relevant heterogeneity is worth exploring further. In **Figure 10**, a particularly large gray box is seen where the "Plac vs. Rosi" row and the "Plac vs. Rosi" column intersect. Using the random-effects model in **Figure 11**, we see that the inconsistency is considerably reduced.

Since it is not possible to conduct covariate adjustment at present with the software, one approach is to conduct study-specific (ideally individual participant data) analyses with appropriate covariate adjustment before the software presented here is used to perform network meta-analysis.

### **Author details**

Nilgün Yildiz Marmara University, Istanbul, Turkey

\*Address all correspondence to: ncelebi@marmara.edu.tr

© 2022 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

## **References**

[1] Shim SR, Yoon BY, Shin IS, Bae JM. Network meta-analysis: Application and practice using Stata. Korean Society of Epidemiology. 2017;**39**:e2017047. DOI: 10.4178/epih.e2017047

[2] Caldwell DM, Ades AE, Higgins JP. Simultaneous comparison of multiple treatments: Combining direct and indirect evidence. BMJ. 2005;**331**:897-900

[3] Li T, Vedula SS, Scherer R, Dickersin K. What comparative effectiveness research is needed? A framework for using guidelines and systematic reviews to identify evidence gaps and research priorities. Annals of Internal Medicine. 2012;**156**:367-377

[4] Mitka M. US government kicks off program for comparative effectiveness research. Journal of the American Medical Association. 2010;**304**:2230-2231

[5] Lu G, Ades AE. Assessing evidence inconsistency in mixed treatment comparisons. Journal of the American Statistical Association. 2006;**101**:447-459

[6] Salanti G, Higgins JP, Ades AE, Ioannidis JP. Evaluation of networks of randomized trials. Statistical Methods in Medical Research. 2008;**17**:279-301

[7] Higgins JP, Whitehead A. Borrowing strength from external trials in a meta-analysis. Statistics in Medicine. 1996;**15**:2733-2749

[8] Mills EJ, Ioannidis JP, Thorlund K, Schünemann HJ, Puhan MA, Guyatt GH. How to use an article reporting a multiple treatment comparison meta-analysis. Journal of the American Medical Association. 2012;**308**:1246-1253

[9] Bucher HC, Guyatt GH, Griffith LE, Walter SD. The results of direct and indirect treatment comparisons in meta-analysis of randomized controlled trials. Journal of Clinical Epidemiology. 1997;**50**:683-691

[10] Mills EJ, Ghement I, O'Regan C, Thorlund K. Estimating the power of indirect comparisons: A simulation study. PLoS One. 2011;**6**(1):e16237

[11] Ioannidis JP. Indirect comparisons: The mesh and mess of clinical trials. Lancet. 2006;**368**:1470-1472

[12] Cipriani A, Higgins JP, Geddes JR, Salanti G. Conceptual and technical challenges in network meta-analysis. Annals of Internal Medicine. 2013;**159**:130-137

[13] Tonin FS, Rotta I, Mendes AM, Pontarolo R. Network meta-analysis: A technique to gather evidence from direct and indirect comparisons. Pharmacy Practice (Granada). 2017;**15**:943

[14] Hoaglin DC, Hawkins N, Jansen JP, Scott DA, Itzler R, Cappelleri JC, et al. Conducting indirect-treatment-comparison and network meta-analysis studies: Report of the ISPOR Task Force on indirect treatment comparisons good research practices: Part 2. Value in Health. 2011;**14**:429-437

[15] Li T, Puhan MA, Vedula SS, Singh S, Dickersin K, Ad Hoc Network Meta-analysis Methods Meeting Working Group. Network meta-analysis-highly attractive but more methodological research is needed. BMC Medicine. 2011;**9**:79

[16] Mills EJ, Bansback N, Ghement I, Thorlund K, Kelly S, Puhan MA, et al. Multiple treatment comparison meta-analyses: A step forward into complexity. Clinical Epidemiology. 2011;**3**:193-202

[17] Reken S, Sturtz S, Kiefer C, Böhler YB, Wieseler B. Assumptions of mixed treatment comparisons in health technology assessments: Challenges and possible steps for practical application. PLoS One. 2016;**11**:e0160712

[18] Veroniki AA, Vasiliadis HS, Higgins JP, Salanti G. Evaluation of inconsistency in networks of interventions. International Journal of Epidemiology. 2013;**42**:332-345

[19] Bhatnagar N, Lakshmi PV, Jeyashree K. Multiple treatment and indirect treatment comparisons: An overview of network meta-analysis. Perspectives in Clinical Research. 2014;**5**:154-158

[20] Mills EJ, Thorlund K, Ioannidis JP. Demystifying trial networks and network meta-analysis. BMJ. 2013;**346**:f2914

[21] Salanti G. Indirect and mixed-treatment comparison, network, or multiple-treatments meta-analysis: Many names, many benefits, many concerns for the next generation evidence synthesis tool. Research Synthesis Methods. 2012;**3**:80-97

[22] Lu G, Ades AE. Combination of direct and indirect evidence in mixed treatment comparisons. Statistics in Medicine. 2004;**23**:3105-3124

[23] Jansen JP, Fleurence R, Devine B, Itzler R, Barrett A, Hawkins N, et al. Interpreting indirect treatment comparisons and network meta-analysis for health-care decision making: Report of the ISPOR Task Force on indirect treatment comparisons good research practices: Part 1. Value in Health. 2011;**14**:417-428

[24] Jansen JP, Naci H. Is network meta-analysis as valid as standard pairwise meta-analysis? It all depends on the distribution of effect modifiers. BMC Medicine. 2013;**11**:159

[25] Dakin HA, Welton NJ, Ades AE, Collins S, Orme M, Kelly S. Mixed treatment comparison of repeated measurements of a continuous endpoint: An example using topical treatments for primary open-angle glaucoma and ocular hypertension. Statistics in Medicine. 2011;**30**:2511-2535

[26] Schmitz S, Adams R, Walsh CD, Barry M, FitzGerald O. A mixed treatment comparison of the efficacy of anti-TNF agents in rheumatoid arthritis for methotrexate non-responders demonstrates differences between treatments: A Bayesian approach. Annals of the Rheumatic Diseases. 2012;**71**:225-230

[27] Jones B, Roger J, Lane PW, Lawton A, Fletcher C, Cappelleri JC, et al. Statistical approaches for conducting network meta-analysis in drug development. Pharmaceutical Statistics. 2011;**10**:523-531

[28] White IR. Network meta-analysis. The Stata Journal. 2015;**15**:951-985

[29] Cooper NJ, Peters J, Lai MC, et al. How valuable are multiple treatment comparison methods in evidence-based health-care evaluation? Value in Health. 2011;**14**(2):371-380

[30] Edwards SJ, Clarke MJ, Wordsworth S, Borrill J. Indirect comparisons of treatments based on systematic reviews of randomised controlled trials. International Journal of Clinical Practice. 2009;**63**:841-854. DOI: 10.1111/j.1742-1241.2009.02072

[31] Gartlehner G, Moore CG. Direct versus indirect comparisons: A summary of the evidence. The International Journal of Technology Assessment in Health Care. 2008;**24**:170-177. DOI: 10.1017/S0266462308080240

[32] Efthimiou O, Debray TPA, van Valkenhoef G, Trelle S, Panayidou K, Moons KGM, et al. GetReal in network meta-analysis: A review of the methodology. Research Synthesis Methods. 2016;**7**:236-263. DOI: 10.1002/jrsm.1195

[33] Salanti G, Kavvoura FK, Ioannidis JP. Exploring the geometry of treatment networks. Annals of Internal Medicine. 2008;**148**:544-553

[34] Higgins JPT, Jackson D, Barrett JK, Lu G, Ades AE, White IR. Consistency and inconsistency in network meta-analysis: Concepts and models for multi-arm studies. Research Synthesis Methods. 2012;**3**:98-110

[35] Higgins JPT. Commentary: Heterogeneity in meta-analysis should be expected and appropriately quantified. International Journal of Epidemiology. 2008;**37**:1158-1160

[36] Chan AW, Altman DG. Epidemiology and reporting of randomised trials published in PubMed journals. The Lancet. 2005;**365**:1159-1162

[37] Dias S, Welton NJ, Caldwell DM, Ades AE. Checking consistency in mixed treatment comparison meta-analysis. Statistics in Medicine. 2010;**29**:932-944. DOI: 10.1002/sim.3767

[38] Song F, Loke YK, Walsh T, Glenny AM, Eastwood AJ, Altman DG. Methodological problems in the use of indirect comparisons for evaluating healthcare interventions: Survey of published systematic reviews. BMJ. 2009;**338**:b1147

[39] Baker SG, Kramer BS. The transitive fallacy for randomized trials: If A bests B and B bests C in separate trials, is A better than C? BMC Medical Research Methodology. 2002;**2**:13

[40] Donegan S, Williamson P, Gamble C, Tudur SC. Indirect comparisons: A review of reporting and methodological quality. PLoS One. 2010;**5**:e11054. DOI: 10.1371

[41] Song F, Altman DG, Glenny AM, Deeks JJ. Validity of indirect comparison for estimating efficacy of competing interventions: Empirical evidence from published meta-analyses. BMJ. 2003;**326**:472. DOI: 10.1136/bmj.326.7387.472

[42] Dias S, Welton NJ, Sutton AJ, Caldwell DM, Lu G, Ades AE. Evidence synthesis for decision making 4: Inconsistency in networks of evidence based on randomized controlled trials. Medical Decision Making. 2013;**33**:641-656. DOI: 10.1177/0272989X12455847

[43] Donegan S, Williamson P, D'Alessandro U, Tudur SC. Assessing key assumptions of network meta-analysis: A review of methods. Research Synthesis Methods. 2013;**4**:291-323. DOI: 10.1002/jrsm.1085

[44] Salanti G, Marinho V, Higgins JP. A case study of multiple-treatments meta-analysis demonstrates that covariates should be considered. Journal of Clinical Epidemiology. 2009;**62**:857-864. DOI: 10.1016/j.jclinepi.2008.10.001

[45] Julious SA, Wang SJ. How biased are indirect comparisons, particularly when comparisons are made over time in controlled trials. Drug Information Journal. 2008;**42**:625

[46] Lu G, Ades A. Modeling between-trial variance structure in mixed treatment comparisons. Biostatistics. 2009;**10**:792-805. DOI: 10.1093/biostatistics/kxp032

[47] Lumley T. Network meta-analysis for indirect treatment comparisons. Statistics in Medicine. 2002;**21**:2313-2324. DOI: 10.1002/sim.1201

[48] Senn S, Gavini F, Magrez D, Scheen A. Issues in performing a network meta-analysis. Statistical Methods in Medical Research. 2013;**22**(2):169-189. DOI: 10.1177/0962280211432220


[49] König J, Krahn U, Binder H. Visualizing the flow of evidence in network meta-analysis and characterizing mixed treatment comparisons. Statistics in Medicine. 2013;**32**(30):5414-5429

[50] Rücker G, Schwarzer G. Ranking treatments in frequentist network meta-analysis works without resampling methods. BMC Medical Research Methodology. 2015;**15**(1):58

[51] Mbuagbaw L, Rochwerg B, Jaeschke R, Heels-Ansdell D, Alhazzani W, Thabane L, et al. Approaches to interpreting and choosing the best treatments in network meta-analyses. Systematic Reviews. 2017;**6**(1):79

[52] Schwarzer G, Carpenter JR, Rücker G. Meta-Analysis with R. Switzerland: Springer International Publishing; 2015

## **Chapter 8** Variance Balanced Design

*D.K. Ghosh*

### **Abstract**

In this chapter, binary, ternary and n-ary variance balanced designs are constructed using balanced incomplete block, resolvable balanced incomplete block, semi-regular group divisible, factorial and fractional factorial designs. The constructed variance balanced designs have v, (v + 1), (v + 2) and (v + r) treatments. The methods of construction are supported by suitable examples. It is found that almost all of the variance balanced designs have high efficiency factors.

**Keywords:** incidence matrix, C – Matrix, resolvable balanced incomplete block designs, eigen values, balanced and group divisible designs

### **1. Introduction**

In the literature, balanced incomplete block designs are either variance balanced (VB), efficiency balanced (EB) or pairwise balanced. Raghavarao ([1], Theorem 4.5.2) discussed that, among the class of connected designs, the balanced designs are the most efficient. A design is said to be variance balanced if the variance of the estimate of every possible elementary treatment contrast is the same, i.e., if t<sub>i</sub> denotes the estimate of the ith treatment effect, then Var(t<sub>i</sub> − t<sub>j</sub>) is constant for all i ≠ j.

Chakrabarti [2] introduced the useful concept of the C-matrix of a design. It is known that balanced incomplete block designs are the most efficient, but they do not exist for all parametric specifications, and they are equi-replicated and have equal block sizes. In some situations, balanced block designs with unequal replications or unequal block sizes or both are needed. Variance balanced designs can have both equal and unequal numbers of replications and block sizes. The importance of variance balanced designs in the context of experimental material is well known, as they yield optimal designs apart from ensuring simplicity in the analysis. Many practical situations demand designs with varying block sizes (Pearce [3]) or resolvable VB designs with unequal replications (Mukerjee and Kageyama [4]). Rao [5], Hedayat and Federer [6], Raghavarao [7] and Puri and Nigam [8] defined a design to be variance balanced if every normalized estimable linear function of the treatment effects can be estimated with the same precision. They also discussed the necessary and sufficient conditions for the existence of such designs. John [9], Jones et al. [10], Kageyama [11, 12], Kageyama et al. [13], Pal and Pal [14], Roy [15], Sinha [16, 17], Sinha and Jones [18] and Tyagi [19] gave further methods for constructing block designs with unequal treatment replications and unequal block sizes. Khatri [20], along with a method of construction of VB designs, gave a formula to measure the overall A-efficiency of variance balanced designs. Das and Ghosh [21] gave methods of construction of variance balanced designs with augmented blocks and treatments. Mukerjee and Kageyama [22] introduced resolvable variance balanced designs. A technique for constructing variance balanced designs, based on the unionizing block principle of Hedayat and Federer [6], was described by Calvin [23]. Calvin and Sinha [24] extended this technique to produce designs with more than two distinct block sizes that permit fewer replications. Agarwal and Kumar [25] gave a method of construction of variance balanced designs that is associated with group divisible (GD) designs. Rao [5] observed that, if the information matrix C of a block design satisfies

$$\mathbf{C} = \theta \left[ \mathbf{I}\_{v} - \frac{1}{v} \mathbf{E}\_{vv} \right]$$

where θ is the non-zero eigenvalue of the C matrix, I<sub>v</sub> is the identity matrix of order v, and E<sub>vv</sub> is the v × v matrix with all elements equal to unity, then such a design is called a variance balanced design. Since a balanced incomplete block design (BIBD) satisfies this property, it is a particular case of a variance balanced design.

Das and Ghosh [21] defined generalized efficiency balanced (GEB) designs, which include both VB and EB designs. Ghosh [26], Ghosh and Karmaker [27], Ghosh and Divecha [28], Ghosh, Divecha, and Kageyama [29], Ghosh et al. [30], and Ghosh et al. [31, 32] obtained several methods for the construction of VB designs. Ghosh and Joshi [30] constructed VB designs through GD designs. Ghosh and Joshi [33] also constructed VB designs through triangular designs. Kageyama [10] recommended the use of non-binary VB designs when binary VB designs are not available for given values of the parameters. Ghosh and Ahuja [34] constructed VB designs using fractional factorial designs. Agarwal and Kumar [35, 36] developed some methods of constructing ternary VB designs with (v + s) treatments (s ≥ 1), having blocks of unequal sizes, through block designs with v treatments. Ghosh, Kageyama and Joshi [37] developed ternary VB designs using BIB and GD designs. Ghosh et al. [37] further obtained more VB designs using Latin square type PBIB designs. Ghosh [38] studied the robustness of variance balanced designs against the loss of k treatments and one block. Ghosh et al. [39] discussed the construction of VB designs using factorial designs. Hedayat and Stufken [40] established a relation between pairwise balanced and variance balanced designs. Jones [41] discussed the property of incomplete block designs. Gupta and Jones [42] constructed equi-replicated VB designs.

#### **2. Method of construction**

Methods of construction of variance balanced designs with equal/unequal replication sizes and equal/unequal block sizes are carried out in this chapter. Section 3 discusses the construction of variance balanced designs using a Hadamard matrix, while construction using semi-regular group divisible designs is discussed in Section 4. Section 5 discusses variance balanced designs constructed by augmenting n more blocks. Construction of variance balanced designs with (v + 1) treatments using unreduced balanced incomplete block designs is shown in Section 6. Section 7 discusses the construction of variance balanced designs using 2<sup>n</sup> symmetrical factorial experiments. Variance balanced designs are also constructed using the incidence matrix, as shown in Section 8.

#### **3. Variance balanced design using Hadamard matrix**

**Theorem – 3.1:** An equi-replicated variance balanced design with parameters v = n – 1, b = n, r = n/2, k = {n – 1, n/2 – 1} and C<sub>im</sub> = (3n – 4)/[(n – 1)(n – 2)] can always be constructed from a Hadamard matrix of size n by deleting its first row and then considering rows as treatments and columns as blocks.

**Proof:** Consider a Hadamard matrix of size n in normalized form (first row and first column consisting of +1). Delete its first row. The size of this matrix becomes (n – 1) × n. We replace –1 by 0 and call the resulting matrix N. This matrix contains (n – 1) rows and n columns, where the element "1" occurs n/2 times in each row, (n – 1) times in the first column, and (n/2 – 1) times in each of the remaining columns. Consider the matrix N as the incidence matrix of a variance balanced design, where rows are treatments and columns are blocks, so that v = n – 1, b = n, r = n/2 and k = {n – 1, n/2 – 1}.

For a variance balanced design,

$$C\_{im} = \sum\_{j=1}^{b} \frac{n\_{ij}\, n\_{mj}}{n\_{.j}}, \qquad i \neq m = 1, \dots, v.$$

C<sub>im</sub> is computed as

$$C\_{1m} = \frac{1}{n-1} + \frac{1}{\frac{n}{2}-1} = \frac{3n-4}{(n-1)(n-2)}.$$

We can verify that C<sub>im</sub> gives the same constant value for each pair of treatments. Now, a block design is said to be a variance balanced design if its C matrix satisfies C = θ (I<sub>v</sub> – E<sub>vv</sub>/v), where θ is the non-zero eigenvalue of the C matrix with multiplicity (v – 1), and where

$$\mathbf{C} = \operatorname{diag}(r\_1, r\_2, \dots, r\_v) - \mathbf{N} \mathbf{K}^{-1} \mathbf{N}'.$$

$$\mathbf{C} = \begin{bmatrix} \frac{n}{2} & 0 & \dots & 0 \\ 0 & \frac{n}{2} & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & \frac{n}{2} \end{bmatrix} - \frac{1}{(n-1)(n-2)} \begin{bmatrix} n(n-2) & 3n-4 & \dots & 3n-4 \\ 3n-4 & n(n-2) & \dots & 3n-4 \\ \vdots & \vdots & \ddots & \vdots \\ 3n-4 & 3n-4 & \dots & n(n-2) \end{bmatrix}$$

After simplification C reduces to

$$\mathbf{C} = \frac{n(n-2)(n-3) + 6n - 8}{2(n-1)(n-2)} \left[ \mathbf{I}\_{\mathbf{v}} - \frac{\mathbf{E}\_{\mathbf{v}\mathbf{v}}}{\mathbf{v}} \right] \tag{1}$$

where θ = [n(n – 2)(n – 3) + 6n – 8]/[2(n – 1)(n – 2)] denotes the non-zero eigenvalue of the C matrix with multiplicity (n – 2).

Eq. (1) satisfies the condition for a variance balanced design. Hence, this is an equi-replicated variance balanced design with two unequal block sizes.

#### **3.1 Efficiency factor of a variance balanced design**

The efficiency factor of a variance balanced design is defined as

$$\mathbf{E} = \frac{\mathbf{Var}\left(\hat{t}\_i - \hat{t}\_m\right)\_{RBD}}{\mathbf{Var}\left(\hat{t}\_i - \hat{t}\_m\right)\_{VB}}$$

where

$$\mathrm{Var}\left(\hat{t}_i - \hat{t}_m\right)_{RBD} = (2/r)\,\sigma^2 = \frac{2}{n/2}\,\sigma^2$$

and

$$\operatorname{Var}\left(\widehat{t}\_{i} - \widehat{t}\_{m}\right)\_{VB} = (2/\theta)\sigma^{2} = \frac{2}{\frac{n(n-2)(n-3) + 6n - 8}{2(n-1)(n-2)}}\sigma^{2}$$

$$\mathbf{E} = \frac{n(n-2)(n-3) + 6n - 8}{n(n-1)(n-2)}$$
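The closed form for E follows by cancelling 2σ<sup>2</sup> in the ratio of the two variances. Written out with the symbols already defined (r = n/2 and θ from Eq. (1)), and evaluated at n = 8 as a check:

$$\mathbf{E} = \frac{(2/r)\,\sigma^2}{(2/\theta)\,\sigma^2} = \frac{\theta}{r} = \frac{2\theta}{n}, \qquad n = 8:\quad \mathbf{E} = \frac{280}{8 \cdot 7 \cdot 6} = \frac{5}{6}.$$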

**Example–3.1** Construct a variance balanced design from a Hadamard matrix of size 8.

Using Theorem – 3.1, we construct a variance balanced design from a Hadamard matrix of size 8 as follows:

*Hadamard matrix of size 8 and incidence matrix N of the variance balanced design (matrices not reproduced).*


N gives the incidence matrix of an equi-replicated variance balanced design with unequal block sizes and parameters v = 7, b = 8, r = 4, k = {7, 3}, Cim = 10/21 and information matrix,


After simplification, C reduces to

$$\mathbf{C} = \frac{280}{84} \left[ \mathbf{I}_{7} - \frac{\mathbf{E}_{77}}{7} \right] = \theta \left[ \mathbf{I}_{7} - \frac{\mathbf{E}_{77}}{7} \right] \tag{2}$$

where θ = 10/3 is the non-zero eigenvalue of the C matrix with multiplicity 6. Hence, it is a variance balanced design, with

$$\hat{t}_i = (1/\theta)\,Q_i = (3/10)\,Q_i,$$

$$\mathrm{Var}\left(\hat{t}_i - \hat{t}_m\right)_{VB} = (2/\theta)\,\sigma^2 = (6/10)\,\sigma^2, \qquad \mathrm{Var}\left(\hat{t}_i - \hat{t}_m\right)_{RBD} = (2/r)\,\sigma^2 = \frac{2}{4}\,\sigma^2$$

and efficiency factor E = 5/6 ≈ 0.83. This shows that the efficiency factor is high.
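The numbers in Example 3.1 can be checked numerically. The following is a minimal sketch, not the author's code: it assumes that the Hadamard matrix of size 8 is the normalized Sylvester matrix (the chapter does not say which one is used), and the helper names `sylvester_hadamard` and `information_matrix` are hypothetical. Exact fractions are used so that the comparison with θ = 10/3 involves no rounding.

```python
from fractions import Fraction


def sylvester_hadamard(n):
    """Normalized Hadamard matrix of order n (n a power of 2), Sylvester construction."""
    H = [[1]]
    while len(H) < n:
        H = [row + row for row in H] + [row + [-x for x in row] for row in H]
    return H


def information_matrix(N):
    """C = diag(r_1..r_v) - N K^{-1} N' for a 0/1 incidence matrix N (rows = treatments)."""
    v, b = len(N), len(N[0])
    k = [sum(N[i][j] for i in range(v)) for j in range(b)]  # block sizes
    r = [sum(row) for row in N]                              # replications
    C = [[Fraction(0)] * v for _ in range(v)]
    for i in range(v):
        for m in range(v):
            conc = sum(Fraction(N[i][j] * N[m][j], k[j]) for j in range(b))
            C[i][m] = (r[i] if i == m else 0) - conc
    return C, r, k


n = 8
H = sylvester_hadamard(n)
# Theorem 3.1: delete the first row and replace -1 by 0.
N = [[1 if x == 1 else 0 for x in row] for row in H[1:]]
C, r, k = information_matrix(N)

v = n - 1
theta = Fraction(10, 3)  # value stated in Example 3.1
# Entry-wise check of C = theta * (I - E/v).
balanced = all(
    C[i][m] == theta * ((1 if i == m else 0) - Fraction(1, v))
    for i in range(v) for m in range(v)
)
print("r =", sorted(set(r)), "k =", sorted(set(k)))
print("C_im =", -C[0][1], "balanced:", balanced)
```

The off-diagonal entries of C are –Cim, so `-C[0][1]` is the quantity compared with 10/21 in the example.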

#### **4. Variance balanced design through semi regular group divisible designs**

In this section, we discuss the construction of a variance balanced design by adding to the blocks of a semi-regular group divisible design its groups, taken as additional blocks, provided the following conditions are satisfied: (i) the block size satisfies k = λ<sub>2</sub>, (ii) λ<sub>1</sub> = 0 and (iii) the groups are treated as blocks.

**Theorem – 4.1** Let the parameters of a semi-regular group divisible design be v, b, r, k, λ<sub>1</sub> = 0, λ<sub>2</sub>, m and n, where k = λ<sub>2</sub>. By adding to the b blocks of this design its groups as blocks, an equi-replicated variance balanced design with unequal block sizes is constructed, with parameters v1 = v, b1 = b + mn, r1 = r + n, k1 = {k, n} and Cim = λ<sub>2</sub>/k or, equivalently, Cim = λ<sub>1</sub>/k + n/n.

**Proof:** Consider a semi-regular group divisible design with parameters v, b, r, k, λ<sub>1</sub> = 0, λ<sub>2</sub> = k, m and n, where m denotes the number of groups and n the number of treatments per group. Denote by N the incidence matrix of the resulting design. Consider each group as one block. There are m groups and hence m more blocks. Add these m blocks, each repeated n times, to the b blocks of the semi-regular group divisible design. Hence, v1 = v, b1 = b + mn, r1 = r + n and k1 = {k, n}. We can check

$$C_{im} = \sum_{j=1}^{b_1} \frac{n_{ij}\, n_{mj}}{n_{.j}}, \qquad i \neq m = 1, \dots, v,$$

for each pair of treatments as follows. For those pairs of treatments that occur together λ<sub>2</sub> times, Cim = λ<sub>2</sub>/k, and since λ<sub>2</sub> = k, Cim = 1. For those pairs of treatments for which λ<sub>1</sub> = 0, Cim = λ<sub>1</sub>/k + n/n = 1. For a variance balanced design, Cim should be the same for each pair of treatments, hence λ<sub>1</sub>/k + n/n = λ<sub>2</sub>/k. This implies that, using this method, we can construct a variance balanced design from those semi-regular group divisible designs in which (λ<sub>2</sub> – λ<sub>1</sub>) = k holds true.

Again, a block design is said to be variance balanced if its C matrix satisfies C = θ (Iv – Evv/v), where θ is the non-zero eigenvalue of the C matrix with multiplicity (v – 1), and C = diag (r1, r2, … , rv) – N K<sup>–1</sup> N′.

$$\mathbf{C} = \begin{bmatrix} r+n & 0 & \dots & 0 \\ 0 & r+n & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & r+n \end{bmatrix} - \begin{bmatrix} \frac{r+k}{k} & 1 & \dots & 1 \\ 1 & \frac{r+k}{k} & \dots & 1 \\ \vdots & \vdots & \ddots & \vdots \\ 1 & 1 & \dots & \frac{r+k}{k} \end{bmatrix}$$

Diagonal elements = [k (r + n) – (r + k)]/k, and off-diagonal elements = – k/k = –1. After simplification, C reduces to

$$\mathbf{C} = \frac{\mathbf{k}(\mathbf{r} + \mathbf{n}) - r}{k} \left[ \mathbf{I}\_{\mathbf{v}} - \frac{\mathbf{E}\_{\mathbf{vv}}}{\mathbf{v}} \right] \tag{3}$$

where θ = [k(r + n) – r]/k denotes the non-zero eigenvalue of the C matrix with multiplicity (v – 1).

Eq. (3) satisfies the condition for a variance balanced design. Hence, this is an equi-replicated variance balanced design with two unequal block sizes and parameters v1 = v, b1 = b + mn, r1 = r + n, k1 = {k, n}.

#### **4.1 Efficiency factor**

The efficiency factor of a variance balanced design is defined as

$$\mathbf{E} = \frac{\mathrm{Var}\left(\hat{t}_{i} - \hat{t}_{m}\right)_{\mathrm{RBD}}}{\mathrm{Var}\left(\hat{t}_{i} - \hat{t}_{m}\right)_{\mathrm{VB}}},$$

where

$$\mathrm{Var}\left(\hat{t}_{i} - \hat{t}_{m}\right)_{\mathrm{RBD}} = (2/r)\,\sigma^{2} = \frac{2}{r+n}\,\sigma^{2}, \qquad \mathrm{Var}\left(\hat{t}_{i} - \hat{t}_{m}\right)_{\mathrm{VB}} = (2/\theta)\,\sigma^{2} = \frac{2k}{k(r+n)-r}\,\sigma^{2},$$

and

$$\mathbf{E} = \frac{k(r+n)-r}{k(r+n)}.$$

**Example – 4.1** Construct a variance balanced design with parameters v1 = 6, b1 = 18, r1 = 8, k1 = {3, 2} from the semi-regular group divisible design SR – 20, having parameters v = 6, b = 12, r = 6, k = 3, λ<sub>1</sub> = 0, λ<sub>2</sub> = 3, m = 3 and n = 2.

The three groups, each with 2 treatments, are (1 4), (2 5), (3 6).

The blocks of the semi-regular group divisible design SR – 20 are (1 2 3), (2 4 6), (3 4 5), (1 5 6), (1 2 3), (2 4 6), (3 4 5), (1 5 6), (1 2 6), (1 3 5), (2 3 4), (4 5 6).

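Using Theorem – 4.1, the incidence matrix of the variance balanced design is given as

$$\mathbf{N}_1 = \left[\begin{array}{cccccccccccccccccc}
1 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 1 & 1 & 0 & 0 & 1 & 0 & 0 & 1 & 0 & 0 \\
1 & 1 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 0 & 1 & 0 & 0 & 1 & 0 & 0 & 1 & 0 \\
1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 & 0 & 1 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & 1 \\
0 & 1 & 1 & 0 & 0 & 1 & 1 & 0 & 0 & 0 & 1 & 1 & 1 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 1 & 1 & 0 & 0 & 1 & 1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 & 0 & 1 & 0 \\
0 & 1 & 0 & 1 & 0 & 1 & 0 & 1 & 1 & 0 & 0 & 1 & 0 & 0 & 1 & 0 & 0 & 1
\end{array}\right]$$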

N1 gives the incidence matrix of an equi-replicated variance balanced design with unequal block sizes and parameters v1 = 6, b1 = 18, r1 = 8, k1 = {3, 2}, Cim = 1 and information matrix,

$$\mathbf{C} = \begin{bmatrix} 8 & 0 & \dots & 0 \\ 0 & 8 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & 8 \end{bmatrix} - \begin{bmatrix} 18 & 6 & \dots & 6 \\ 6 & 18 & \dots & 6 \\ \vdots & \vdots & \ddots & \vdots \\ 6 & 6 & \dots & 18 \end{bmatrix} / 6$$

After simplification, C reduces to

$$\mathbf{C} = 6 \left[ \mathbf{I}_{6} - \frac{\mathbf{E}_{66}}{6} \right] = \theta \left[ \mathbf{I}_{6} - \frac{\mathbf{E}_{66}}{6} \right]$$

where θ = 6 is the non-zero eigenvalue of the C matrix with multiplicity 5. Hence, it is a variance balanced design with

$$\hat{t}_i = (1/\theta)\,Q_i = (1/6)\,Q_i, \qquad \mathrm{Var}\left(\hat{t}_i - \hat{t}_m\right)_{VB} = (2/\theta)\,\sigma^2 = (2/6)\,\sigma^2, \qquad \mathrm{Var}\left(\hat{t}_i - \hat{t}_m\right)_{RBD} = (2/r)\,\sigma^2 = (2/8)\,\sigma^2,$$

and efficiency factor E = 3/4. This shows that the efficiency factor is high.
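The balance condition of Theorem 4.1 can also be checked directly from the block lists of Example 4.1, without forming the C matrix. The sketch below is illustrative only; the variable names are hypothetical, and the ordering of the added group blocks (all three groups, then all three again) is an assumption that matches the columns of N1 above. It computes Cim for every pair of treatments as the sum of 1/(block size) over the blocks containing both, then derives θ = v·Cim and E = θ/r1.

```python
from fractions import Fraction
from itertools import combinations

# Blocks of SR-20 as listed in Example 4.1 (k = 3).
sr_blocks = [
    (1, 2, 3), (2, 4, 6), (3, 4, 5), (1, 5, 6),
    (1, 2, 3), (2, 4, 6), (3, 4, 5), (1, 5, 6),
    (1, 2, 6), (1, 3, 5), (2, 3, 4), (4, 5, 6),
]
groups = [(1, 4), (2, 5), (3, 6)]
n = 2  # treatments per group; each group is added as a block n times

blocks = sr_blocks + groups * n
treatments = range(1, 7)

# Replication of each treatment in the enlarged design.
r1 = {t: sum(t in blk for blk in blocks) for t in treatments}

# C_im = sum over blocks containing both i and m of 1 / (block size).
c = {
    (i, m): sum(Fraction(1, len(blk)) for blk in blocks if i in blk and m in blk)
    for i, m in combinations(treatments, 2)
}

theta = len(treatments) * c[(1, 2)]  # theta = v * C_im when all C_im are equal
efficiency = theta / r1[1]           # E = theta / r1

print(len(blocks), set(r1.values()), set(c.values()), theta, efficiency)
```

Pairs from the same group pick up their whole weight from the two repeated group blocks of size 2, while pairs from different groups pick it up from the three original blocks of size 3, which is exactly the argument in the proof.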

#### **5. Variance balanced design through augmenting n (≥ 1) blocks**

In this section, variance balanced designs are obtained from a balanced incomplete block design by augmenting one or more blocks, such that each augmented block contains each of the v treatments. The resulting design is an equi-replicated variance balanced design with unequal block sizes.

**Theorem – 5.1** Let N be the incidence matrix of a balanced incomplete block design with parameters v, b, r, k and λ, and let n blocks, each containing all v treatments, be added to the blocks of this design. The incidence matrix N1 defined as

$$\mathbf{N}_1 = \left[ \mathbf{N}_{v \times b} \;\; \underbrace{\mathbf{1}_{v \times 1} \; \cdots \; \mathbf{1}_{v \times 1}}_{n \text{ columns}} \right]$$

gives a variance balanced design with parameters v1 = v, b1 = b + n, r1 = r + n, k1 = {k, v}.

**Proof:** Consider a balanced incomplete block design with parameters v, b, r, k and λ, whose incidence matrix is denoted by N. Next, n more blocks are augmented; hence, for the resulting design, v1 = v, b1 = b + n, r1 = (r + n), k1 = {k, v} and

$$C_{im} = \frac{\lambda}{k} + \frac{n}{v} = \frac{\lambda v + nk}{vk}.$$

Again, a block design is said to be variance balanced if its C matrix satisfies C = θ (Iv – Evv/v), where θ is the non-zero eigenvalue of the C matrix with multiplicity (v – 1), and C = diag (r1, r2, … , rv) – N K<sup>–1</sup> N′.


$$\mathbf{C} = \begin{bmatrix} r+n & \mathbf{0} & \dots & \mathbf{0} \\ \mathbf{0} & r+n & \dots & \mathbf{0} \\ \vdots & \vdots & \vdots & \vdots \\ \mathbf{0} & \mathbf{0} & \dots & r+n \end{bmatrix} - \begin{bmatrix} \mathbf{k}(\mathbf{b}+\mathbf{n}) & \lambda \mathbf{v}+\mathbf{n} \mathbf{k} & \dots & \lambda \mathbf{v}+\mathbf{n} \mathbf{k} \\ \lambda \mathbf{v}+\mathbf{n} \mathbf{k} & \mathbf{k}(\mathbf{b}+\mathbf{n}) & \dots & \lambda \mathbf{v}+\mathbf{n} \mathbf{k} \\ \vdots & \vdots & \cdots & \vdots \\ \lambda \mathbf{v}+\mathbf{n} \mathbf{k} & \lambda \mathbf{v}+\mathbf{n} \mathbf{k} & \dots & \mathbf{k}(\mathbf{b}+\mathbf{n}) \end{bmatrix} / \mathbf{v}\mathbf{k}$$

Diagonal elements are [vk(r + n) – k(b + n)]/vk, and off-diagonal elements are –(λv + nk)/vk. After simplification, C reduces to

$$\mathbf{C} = \begin{bmatrix} vk(r+n) - k(b+n) & -(\lambda v + nk) & \dots & -(\lambda v + nk) \\ -(\lambda v + nk) & vk(r+n) - k(b+n) & \dots & -(\lambda v + nk) \\ \vdots & \vdots & \ddots & \vdots \\ -(\lambda v + nk) & -(\lambda v + nk) & \dots & vk(r+n) - k(b+n) \end{bmatrix} / vk$$

$$\text{Finally, } \mathbf{C} = \frac{\lambda v + nk}{k} \left[ \mathbf{I}_{v} - \frac{\mathbf{E}_{vv}}{v} \right] \tag{4}$$

where θ = (λv + nk)/k is the non-zero eigenvalue of the C matrix with multiplicity (v – 1). Eq. (4) satisfies the condition for a variance balanced design. Hence, this is an equi-replicated variance balanced design with two unequal block sizes and parameters v1 = v, b1 = b + n, r1 = (r + n), k1 = {k, v}.

#### **5.1 Efficiency factor**

The efficiency factor of a variance balanced design is defined as

$$\mathbf{E} = \frac{\mathbf{Var}\left(\hat{t}\_{i} - \hat{t}\_{m}\right)\_{\mathrm{RBD}}}{\mathrm{Var}\left(\hat{t}\_{i} - \hat{t}\_{m}\right)\_{\mathrm{VB}}}, \text{where, } \mathrm{Var}\left(\hat{t}\_{i} - \hat{t}\_{m}\right)\_{\mathrm{RBD}} = \left(2/\mathbf{r}\right)\sigma^{2} = \frac{2}{r+n}\sigma^{2},$$

$$\mathrm{Var}\left(\hat{t}_{i} - \hat{t}_{m}\right)_{\mathrm{VB}} = \left(2/\theta\right)\sigma^{2} = \frac{2}{\frac{\lambda v + nk}{k}}\,\sigma^{2} = \frac{2k}{\lambda v + nk}\,\sigma^{2} \text{ and } \mathbf{E} = \frac{\lambda v + nk}{k(r+n)}.$$

**Example – 5.1** Construct a variance balanced design with parameters v1 = 9, b1 = 15, r1 = 7, k1 = {3, 9} from a balanced incomplete block design having parameters v = 9, b = 12, r = 4, k = 3, λ = 1.

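The blocks of the balanced incomplete block design are (1 2 3), (4 5 6), (7 8 9), (1 4 7), (2 5 8), (3 6 9), (1 6 8), (2 4 9), (3 5 7), (1 5 9), (2 6 7), (3 4 8). Let n = 3. Using Theorem – 5.1, the incidence matrix of the variance balanced design is given as

$$\mathbf{N}_1 = \left[\begin{array}{ccccccccccccccc}
1 & 0 & 0 & 1 & 0 & 0 & 1 & 0 & 0 & 1 & 0 & 0 & 1 & 1 & 1 \\
1 & 0 & 0 & 0 & 1 & 0 & 0 & 1 & 0 & 0 & 1 & 0 & 1 & 1 & 1 \\
1 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 1 & 0 & 0 & 1 & 1 & 1 & 1 \\
0 & 1 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 & 1 & 1 & 1 \\
0 & 1 & 0 & 0 & 1 & 0 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 1 & 1 \\
0 & 1 & 0 & 0 & 0 & 1 & 1 & 0 & 0 & 0 & 1 & 0 & 1 & 1 & 1 \\
0 & 0 & 1 & 1 & 0 & 0 & 0 & 0 & 1 & 0 & 1 & 0 & 1 & 1 & 1 \\
0 & 0 & 1 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & 0 & 1 & 1 & 1 & 1 \\
0 & 0 & 1 & 0 & 0 & 1 & 0 & 1 & 0 & 1 & 0 & 0 & 1 & 1 & 1
\end{array}\right]$$

N1 gives the incidence matrix of an equi-replicated variance balanced design with unequal block sizes and parameters v1 = 9, b1 = 15, r1 = 7, k1 = {3, 9}, Cim = 2/3 and information matrix,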

$$\mathbf{C} = \begin{bmatrix} 7 & 0 & \dots & 0 \\ 0 & 7 & \dots & 0 \\ \vdots & \vdots & \vdots & \vdots \\ 0 & 0 & \dots & 7 \end{bmatrix} - \begin{bmatrix} 45 & 18 & \dots & 18 \\ 18 & 45 & \dots & 18 \\ \vdots & \vdots & \dots & \vdots \\ 18 & 18 & \dots & 45 \end{bmatrix} / 27$$
 
$$\text{After simplification, } \mathbf{C} \text{ reduces to } \mathbf{C} = \begin{bmatrix} 144 & -18 & \dots & -18 \\ -18 & 144 & \dots & -18 \\ \vdots & \vdots & \dots & \vdots \\ -18 & -18 & \dots & 144 \end{bmatrix} / 27$$

Finally,

$$\mathbf{C} = \frac{162}{27} \left[ \mathbf{I}_{9} - \frac{\mathbf{E}_{99}}{9} \right] = \frac{18}{3} \left[ \mathbf{I}_{9} - \frac{\mathbf{E}_{99}}{9} \right] = \theta \left[ \mathbf{I}_{9} - \frac{\mathbf{E}_{99}}{9} \right] \tag{5}$$

where θ = 6 is the non-zero eigenvalue of the C matrix with multiplicity 8. Hence, it is a variance balanced design with

$$\hat{t}_i = (1/\theta)\,Q_i = (1/6)\,Q_i, \qquad \mathrm{Var}\left(\hat{t}_i - \hat{t}_m\right)_{VB} = (2/\theta)\,\sigma^2 = (2/6)\,\sigma^2, \qquad \mathrm{Var}\left(\hat{t}_i - \hat{t}_m\right)_{RBD} = (2/r)\,\sigma^2 = (2/7)\,\sigma^2,$$

and efficiency factor E = 6/7. This shows that the efficiency factor is high.
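Because Theorem 5.1 leaves the number n of augmented blocks free, it can help to tabulate the closed-form quantities for several values of n. The sketch below is an illustration under the formulas of this section, Cim = λ/k + n/v, θ = (λv + nk)/k and E = θ/(r + n); the function name `augmented_bibd` is hypothetical.

```python
from fractions import Fraction


def augmented_bibd(v, b, r, k, lam, n):
    """Parameters of the design obtained by adding n complete blocks to a BIBD."""
    c_im = Fraction(lam, k) + Fraction(n, v)  # equals (lam*v + n*k) / (v*k)
    theta = v * c_im                          # equals (lam*v + n*k) / k
    return {
        "b1": b + n,
        "r1": r + n,
        "C_im": c_im,
        "theta": theta,
        "E": theta / (r + n),
    }


# BIBD of Example 5.1 (v = 9, b = 12, r = 4, k = 3, lambda = 1) with n = 1, 2, 3.
for n in (1, 2, 3):
    print(n, augmented_bibd(9, 12, 4, 3, 1, n))
```

For n = 3 the function returns the values of Example 5.1 (Cim = 2/3, θ = 6, E = 6/7).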

#### **6. Variance balanced design with (v + 1) treatments**

Variance balanced design with (v + 1) treatments is constructed by reinforcing one treatment in each block of a balanced incomplete block design.

#### **6.1 Variance balanced designs with (v + 1) treatments from a series of balanced incomplete block designs with parameters v, b = <sup>v</sup>C<sub>2</sub>, r = <sup>v–1</sup>C<sub>2–1</sub>, k = 2 and λ = 1**

In this section, a method for constructing a variance balanced design with (v + 1) treatments is discussed. Such a design can always be constructed from a balanced incomplete block design by reinforcing one treatment and augmenting n blocks. Let the parameters of the balanced incomplete block design be v, b = <sup>v</sup>C<sub>2</sub>, r = <sup>v–1</sup>C<sub>2–1</sub>, k = 2 and λ = 1, provided v(r – 1) is divisible by (k + 1) = 3. This is shown in Theorem – 6.1.

**Theorem – 6.1:** Let the parameters of an unreduced balanced incomplete block design be v, b = <sup>v</sup>C<sub>2</sub>, r = <sup>v–1</sup>C<sub>2–1</sub>, k = 2 and λ = 1, and let its incidence matrix be denoted by N. Let the design be reinforced by one treatment in each of the b blocks and augmented with n blocks, such that each augmented block contains each of the v treatments, where n = 1, 2, …. The incidence matrix N1 defined by

$$\mathbf{N}\_1 = \begin{bmatrix} \mathbf{N}\_{\mathbf{v} \times \mathbf{b}} & E\_{v \times n} \\ \mathbf{1}\_{1 \times b} & \mathbf{0}\_{1 \times n} \end{bmatrix}.$$

gives the incidence matrix of a variance balanced design with parameters v1 = (v + 1), b1 = b + n, r1 = {r + n, b}, k1 = {3, v}, where E<sub>v×n</sub> is a matrix of v rows and n columns with all elements equal to 1, **1** is a vector of one row and b columns, **0** is a vector of one row and n columns, provided n = v(r – 1)/3 is an integer.

**Proof:** Let us consider an unreduced balanced incomplete block design with parameters v, b = <sup>v</sup>C<sub>2</sub>, r = <sup>v–1</sup>C<sub>2–1</sub>, k = 2 and λ = 1, provided v(r – 1) is divisible by (k + 1) = 3. This design is reinforced by one more treatment and augmented by n blocks such that the (v + 1)th treatment appears in each of the b blocks and each of the n additional blocks contains each of the v treatments once and only once; hence v1 = v + 1, b1 = b + n, r1 = {r + n, b}, k1 = {3, v} become the parameters of the resulting variance balanced design. Let us check the Cim (i ≠ m = 1 to v) value for each pair of treatments. The Cim value for any pair of treatments among the v treatments is computed as

$$\mathbf{C}\_{\text{im}} = \frac{\mathbf{1}}{\mathbf{3}} + \frac{\mathbf{n}}{\mathbf{v}} \tag{6}$$

Again,

$$\mathbf{C}\_{\rm im} = \frac{\mathbf{r}}{3}, \mathbf{i} = \mathbf{1}, \dots, \mathbf{v}, \text{and } \mathbf{m} = \mathbf{v} + \mathbf{1} \tag{7}$$

For a variance balanced design, Cim must be the same for each pair of treatments; hence, from (6) and (7), 1/3 + n/v = r/3, or n/v = (r – 1)/3, hence n = v(r – 1)/3.

Again, a block design is said to be variance balanced if its C matrix satisfies C = θ (Iv – Evv/v), where θ is the non-zero eigenvalue of the C matrix with multiplicity (v – 1), and C = diag (r1, r2, … , rv) – N K<sup>–1</sup> N′.

$$\mathbf{C} = \begin{bmatrix} r+n & 0 & \dots & 0 \\ 0 & r+n & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & b \end{bmatrix} - \begin{bmatrix} vr+3n & v+3n & \dots & v+3n \\ v+3n & vr+3n & \dots & v+3n \\ \vdots & \vdots & \ddots & \vdots \\ v+3n & v+3n & \dots & bv \end{bmatrix} / 3v$$

Diagonal elements are (i) [3v(r + n) – (vr + 3n)]/3v and (ii) 2bv/3v, and off-diagonal elements are –(v + 3n)/3v. After simplification, C reduces to

$$\mathbf{C} = \begin{bmatrix} 3v(r+n)-(vr+3n) & -(v+3n) & \dots & -(v+3n) \\ -(v+3n) & 3v(r+n)-(vr+3n) & \dots & -(v+3n) \\ \vdots & \vdots & \ddots & \vdots \\ -(v+3n) & -(v+3n) & \dots & 2bv \end{bmatrix} / 3v$$

For a variance balanced design, all the diagonal elements must be the same; hence, [3v(r + n) – (vr + 3n)] = 2bv. This shows that either diagonal element can be used. In this section, we use [3v(r + n) – (vr + 3n)] as the diagonal element.

Finally,

$$\mathbf{C} = \frac{3n + 2r + 1}{3} \left[ \mathbf{I}_{v+1} - \frac{\mathbf{E}_{(v+1)(v+1)}}{v+1} \right] \tag{8}$$

where θ = (3n + 2r + 1)/3 is the non-zero eigenvalue of the C matrix with multiplicity v. Eq. (8) satisfies the condition for a variance balanced design. Hence, this is a variance balanced design with unequal replications and unequal block sizes, with parameters v1 = v + 1, b1 = b + n, r1 = {(r + n), b}, k1 = {3, v}.

#### **6.2 Efficiency factor**

Since the resulting variance balanced design has two unequal replications, there are two efficiency factors, defined as follows.

$$\mathbf{E}_1 = \frac{\mathrm{Var}\left(\hat{t}_i - \hat{t}_m\right)_{RBD1}}{\mathrm{Var}\left(\hat{t}_i - \hat{t}_m\right)_{VB}}, \qquad \mathrm{Var}\left(\hat{t}_i - \hat{t}_m\right)_{RBD1} = (2/r)\,\sigma^2 = \frac{2}{r+n}\,\sigma^2,$$

where t<sub>i</sub> and t<sub>m</sub> are any two treatments among the v treatments, that is, i ≠ m = 1 to v, and

$$\mathrm{Var}\left(\hat{t}_i - \hat{t}_m\right)_{RBD2} = \left(\frac{1}{r+n} + \frac{1}{b}\right)\sigma^2,$$

where i = 1 to v and m = (v + 1).

$$\mathrm{Var}\left(\hat{t}_{i} - \hat{t}_{m}\right)_{VB} = (2/\theta)\,\sigma^{2} = \frac{2}{\frac{3n+2r+1}{3}}\,\sigma^{2} = \frac{6}{3n+2r+1}\,\sigma^{2}.$$

$$\mathbf{E}\_{1} = \frac{(3\mathbf{n} + 2\mathbf{r} + 1)}{3(\mathbf{r} + \mathbf{n})}. \text{ Again,} \\ \mathbf{E}\_{2} = \frac{\mathrm{Var}\left(\hat{t}\_{i} - \hat{t}\_{m}^{\cdot}\right)\_{RBD2}}{\mathrm{Var}\left(\hat{t}\_{i} - \hat{t}\_{m}^{\cdot}\right)\_{VB}} = \frac{(b+r+n)(3n+2r+1)}{6b(r+n)}.$$

**Example – 6.1:** Construct a variance balanced design with parameters v1 = 6, b1 = 15, r1 = {9, 10}, k1 = {3, 5} from a balanced incomplete block design having parameters v = 5, b = 10, r = 4, k = 2, λ = 1.

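The blocks of the balanced incomplete block design are

(1 2), (1 3), (1 4), (1 5), (2 3), (2 4), (2 5), (3 4), (3 5), (4 5). Let n = v(r – 1)/3 = 5. Hence, five blocks are augmented.

Using Theorem – 6.1, the incidence matrix of the variance balanced design is given as

$$\mathbf{N}_1 = \left[\begin{array}{ccccccccccccccc}
1 & 1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 & 1 & 1 \\
1 & 0 & 0 & 0 & 1 & 1 & 1 & 0 & 0 & 0 & 1 & 1 & 1 & 1 & 1 \\
0 & 1 & 0 & 0 & 1 & 0 & 0 & 1 & 1 & 0 & 1 & 1 & 1 & 1 & 1 \\
0 & 0 & 1 & 0 & 0 & 1 & 0 & 1 & 0 & 1 & 1 & 1 & 1 & 1 & 1 \\
0 & 0 & 0 & 1 & 0 & 0 & 1 & 0 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\
1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 0 & 0 & 0 & 0 & 0
\end{array}\right]$$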

N1 gives the incidence matrix of a variance balanced design with unequal replications and unequal block sizes, with parameters v1 = 6, b1 = 15, r1 = {9, 10}, k1 = {3, 5}, Cim = 20/15 and information matrix,

$$\mathbf{C} = \begin{bmatrix} 9 & 0 & 0 & 0 & 0 & 0 \\ 0 & 9 & 0 & 0 & 0 & 0 \\ 0 & 0 & 9 & 0 & 0 & 0 \\ 0 & 0 & 0 & 9 & 0 & 0 \\ 0 & 0 & 0 & 0 & 9 & 0 \\ 0 & 0 & 0 & 0 & 0 & 10 \end{bmatrix} - \begin{bmatrix} 35 & 20 & 20 & 20 & 20 & 20 \\ 20 & 35 & 20 & 20 & 20 & 20 \\ 20 & 20 & 35 & 20 & 20 & 20 \\ 20 & 20 & 20 & 35 & 20 & 20 \\ 20 & 20 & 20 & 20 & 35 & 20 \\ 20 & 20 & 20 & 20 & 20 & 50 \end{bmatrix} / 15$$

After simplification, C reduces to

$$\mathbf{C} = \begin{bmatrix} 100 & -20 & -20 & -20 & -20 & -20 \\ -20 & 100 & -20 & -20 & -20 & -20 \\ -20 & -20 & 100 & -20 & -20 & -20 \\ -20 & -20 & -20 & 100 & -20 & -20 \\ -20 & -20 & -20 & -20 & 100 & -20 \\ -20 & -20 & -20 & -20 & -20 & 100 \end{bmatrix} / 15$$
 
$$\text{Finally, } \mathbf{C} = \frac{120}{15} \left[ \mathbf{I}_6 - \frac{\mathbf{E}_{66}}{6} \right] = 8 \left[ \mathbf{I}_6 - \frac{\mathbf{E}_{66}}{6} \right] = \theta \left[ \mathbf{I}_6 - \frac{\mathbf{E}_{66}}{6} \right] \tag{9}$$

where θ = 8 is the non-zero eigenvalue of the C matrix with multiplicity 5. Hence, it is a variance balanced design with

$$\hat{t}_i = (1/\theta)\,Q_i = (1/8)\,Q_i, \qquad \mathrm{Var}\left(\hat{t}_i - \hat{t}_m\right)_{VB} = (2/\theta)\,\sigma^2 = (2/8)\,\sigma^2, \qquad \mathrm{Var}\left(\hat{t}_i - \hat{t}_m\right)_{RBD1} = (2/r)\,\sigma^2 = (2/9)\,\sigma^2$$

and

$$\mathrm{Var}\left(\hat{t}_i - \hat{t}_m\right)_{RBD2} = \left(\frac{1}{r+n} + \frac{1}{b}\right)\sigma^2 = \left(\frac{1}{9} + \frac{1}{10}\right)\sigma^2 = (19/90)\,\sigma^2,$$

with efficiency factors E1 = 8/9 and E2 = 38/45. This shows that the efficiency factors are high.
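The incidence matrix of Example 6.1 can be generated rather than written out by hand. The following sketch assumes, as in the example, that the blocks of the unreduced design are listed in lexicographic order; the function name `reinforced_design` is hypothetical. It builds the rows of N1 (v original treatments, then the reinforced treatment v + 1) and reports the replications and block sizes.

```python
from itertools import combinations


def reinforced_design(v, k, n):
    """Rows of N1: all k-subsets of v treatments as blocks, one extra treatment
    added to each of them, plus n complete blocks on the original v treatments."""
    base = list(combinations(range(1, v + 1), k))  # unreduced BIBD blocks
    rows = []
    for t in range(1, v + 1):
        rows.append([int(t in blk) for blk in base] + [1] * n)
    rows.append([1] * len(base) + [0] * n)         # treatment v + 1
    return rows


N1 = reinforced_design(v=5, k=2, n=5)
for row in N1:
    print("".join(map(str, row)))

replications = [sum(row) for row in N1]
block_sizes = [sum(col) for col in zip(*N1)]
print(replications, sorted(set(block_sizes)))
```

The block size k is kept as a parameter so that the same function also covers designs with k > 2, provided n is chosen so that the balance condition holds.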

#### **6.3 Variance balanced designs with (v + 1) treatments from a series of balanced incomplete block designs with parameters v, b = <sup>v</sup>C<sub>k</sub>, r = <sup>v–1</sup>C<sub>k–1</sub>, k and λ = <sup>v–2</sup>C<sub>k–2</sub>**

In Section 6.1, the method of construction of a variance balanced design with (v + 1) treatments is discussed for block size k = 2. In this section, we extend the method to any value of k, again constructing the design from a balanced incomplete block design by reinforcing one treatment and augmenting n blocks. Let the parameters of the balanced incomplete block design be v, b = <sup>v</sup>C<sub>k</sub>, r = <sup>v–1</sup>C<sub>k–1</sub>, k and λ = <sup>v–2</sup>C<sub>k–2</sub>, provided v(r – λ) is divisible by (k + 1). This is shown in Theorem – 6.2.

**Theorem – 6.2** Let the parameters of a balanced incomplete block design be v, b = <sup>v</sup>C<sub>k</sub>, r = <sup>v–1</sup>C<sub>k–1</sub>, k and λ = <sup>v–2</sup>C<sub>k–2</sub>, and let its incidence matrix be denoted by N. Let the design be reinforced by one treatment in each of the b blocks and augmented with n blocks, such that each augmented block contains each of the v treatments, where n = 1, 2, …. The incidence matrix N1 defined by

$$\mathbf{N}\_1 = \begin{bmatrix} \mathbf{N}\_{\mathbf{v} \times \mathbf{b}} & E\_{v \times \mathbf{n}} \\ \mathbf{1}\_{1 \times b} & \mathbf{0}\_{1 \times \mathbf{n}} \end{bmatrix}$$

gives the incidence matrix of a variance balanced design with parameters v1 = (v + 1), b1 = b + n, r1 = {r + n, b}, k1 = {(k + 1), v}, where E<sub>v×n</sub> is a matrix of v rows and n columns with all elements equal to 1, **1** is a vector of one row and b columns, **0** is a vector of one row and n columns, provided n = v(r – λ)/(k + 1) is an integer.

**Proof**: Let us consider a series of balanced incomplete block designs with parameters v, b = <sup>v</sup>C<sub>k</sub>, r = <sup>v–1</sup>C<sub>k–1</sub>, k and λ = <sup>v–2</sup>C<sub>k–2</sub>, provided v(r – λ) is divisible by (k + 1). This design is reinforced by one more treatment and augmented by n blocks such that the (v + 1)th treatment appears in each of the b blocks and each of the n additional blocks contains each of the v treatments once and only once; hence v1 = v + 1, b1 = b + n, r1 = {r + n, b}, k1 = {(k + 1), v} are the parameters of the resulting variance balanced design. Let us check the Cim (i ≠ m = 1 to v) value for each pair of treatments. The Cim value for any pair of treatments among the v treatments is computed as

$$\mathbf{C}_{\rm im} = \frac{\lambda}{(k + 1)} + \frac{n}{v} \tag{10}$$

$$\text{Again, } \mathbf{C}_{\rm im} = \frac{r}{(k + 1)}, \quad i = 1, \dots, v \text{ and } m = v + 1 \tag{11}$$

For a variance balanced design, Cim must be the same for each pair of treatments; hence, from (10) and (11), λ/(k + 1) + n/v = r/(k + 1), or n/v = (r – λ)/(k + 1), hence n = v(r – λ)/(k + 1).

Again, a block design is said to be variance balanced if its C matrix satisfies C = θ (Iv – Evv/v), where θ is the non-zero eigenvalue of the C matrix with multiplicity (v – 1), and C = diag (r1, r2, … , rv) – N K<sup>–1</sup> N′.

$$\mathbf{C} = \begin{bmatrix} r+n & 0 & \dots & 0 \\ 0 & r+n & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & b \end{bmatrix} - \begin{bmatrix} vr+n(k+1) & \lambda v+n(k+1) & \dots & \lambda v+n(k+1) \\ \lambda v+n(k+1) & vr+n(k+1) & \dots & \lambda v+n(k+1) \\ \vdots & \vdots & \ddots & \vdots \\ \lambda v+n(k+1) & \lambda v+n(k+1) & \dots & bv \end{bmatrix} / v(k+1)$$

Diagonal elements are (i) [v(k + 1)(r + n) – (vr + n(k + 1))]/v(k + 1) and (ii) kbv/v(k + 1), and off-diagonal elements are –(λv + n(k + 1))/v(k + 1). After simplification, C reduces to

$$\mathbf{C} = \begin{bmatrix} vk(n+r) + n(v-k-1) & -(\lambda v + n(k+1)) & \dots & -(\lambda v + n(k+1)) \\ -(\lambda v + n(k+1)) & vk(n+r) + n(v-k-1) & \dots & -(\lambda v + n(k+1)) \\ \vdots & \vdots & \ddots & \vdots \\ -(\lambda v + n(k+1)) & -(\lambda v + n(k+1)) & \dots & kbv \end{bmatrix} / v(k+1)$$

For a variance balanced design, all the diagonal elements must be the same; hence, vk(n + r) + n(v – k – 1) = kbv. This shows that either diagonal element can be used. In this section, we use vk(n + r) + n(v – k – 1) as the diagonal element.

$$\text{Finally, } \mathbf{C} = \frac{n(k + 1) + kr + \lambda}{(k+1)} \left[ \mathbf{I}_{v+1} - \frac{\mathbf{E}_{(v+1)(v+1)}}{v+1} \right] \tag{12}$$

where θ = [n(k + 1) + kr + λ]/(k + 1) is the non-zero eigenvalue of the C matrix with multiplicity v. Eq. (12) satisfies the condition for a variance balanced design. Hence, this is a variance balanced design with unequal replications and unequal block sizes, with parameters v1 = v + 1, b1 = b + n, r1 = {(r + n), b}, k1 = {(k + 1), v}.

#### *6.3.1 Efficiency factor of this variance balanced design*

Since the resulting variance balanced design is a two unequal replicated design and hence, there are two efficiency factors. The efficiency factor of a variance balanced design is defined as.

$$\mathbf{E}\_1 = \frac{\text{Var}\left(\widehat{t\_i} - \widehat{t\_m}\right)\_{\text{RBD1}}}{\text{Var}\left(\widehat{t\_i} - \widehat{t\_m}\right)\_{\text{VB}}}, \text{Var}\left(\widehat{t\_i} - \widehat{t\_m}\right)\_{\text{RBD1}} = \text{(2/r)}\ \sigma^2 = \frac{2}{r+n}\sigma^2, \text{ where } \mathbf{t}\_i \text{ and } \mathbf{t}\_m \text{ are any}$$

$$\mathbf{1}\_{\text{A}} \quad \mathbf{1}\_{\text{A}} \quad \mathbf{2}\_{\text{A}} \quad \mathbf{2}\_{\text{A}} \quad \mathbf{3}\_{\text{A}} \quad \mathbf{4}\_{\text{A}} \quad \mathbf{5}\_{\text{A}} \quad \mathbf{7}\_{\text{A}} \quad \mathbf{8}\_{\text{B}} \quad \mathbf{8}\_{\text{C}} \quad \mathbf{8}\_{\text{D}} \quad \mathbf{9}\_{\text{C}} = \mathbf{0}, \quad \mathbf{1}\_{\text{A}} \quad \mathbf{1}\_{\text{A}} \quad \mathbf{1}\_{\text{A}} \quad \mathbf{1}\_{\text{A}} \quad \mathbf{1}\_{\text{A}} \quad \mathbf{1}\_{\text{A}} \quad \mathbf{1}\_{\text{A}} \quad \mathbf{1}\_{\text{A}} \quad \mathbf{1}\_{\text{A}} \quad \mathbf{1}\_{\text{A}} \quad \mathbf{1}\_{\text{A}} \quad \mathbf{1}\_{\text{A}} \quad \mathbf{1}\_{\text{A}} \quad \mathbf{1}\_{\text{A}} \quad \mathbf{1}\_{\text{A}} \quad \mathbf{1}\_{\text{A}} \quad \mathbf{1}\_{\text{A}} \quad \mathbf{1}\_{\text{A}} \quad \mathbf{1}\_{\text{A}} \quad \mathbf{1}\_{\text{A}} \quad \mathbf{1}\_$$

two treatments among v treatments, i 6¼ m = 1 to v.Var *t* b*<sup>i</sup>* � *t*b *m* � � *RBD*<sup>2</sup> = ( <sup>1</sup> ð Þ *<sup>r</sup>*þ*<sup>n</sup>* <sup>+</sup> <sup>1</sup> *<sup>b</sup>*) <sup>σ</sup><sup>2</sup> , where, i 6¼ m = 1 to v and m = (v + 1).

$$\text{Var}(\hat{t}\_i - \hat{t}\_{\text{m}})\_{VB} = (2/\theta)\sigma^2 = \frac{2}{\frac{[\mathbf{n}(\mathbf{k}+\mathbf{1}) + \mathbf{k}\mathbf{r} + \lambda]}{(k+1)}}\sigma^2 = \frac{2(\mathbf{k}+\mathbf{1})}{[\mathbf{n}(\mathbf{k}+\mathbf{1}) + \mathbf{k}\mathbf{r} + \lambda]}\sigma^2.$$

$$\mathbf{E}\_1 = \frac{[\mathbf{n}(\mathbf{k}+\mathbf{1}) + \mathbf{k}\mathbf{r} + \lambda]}{(\mathbf{r}+\mathbf{n})(\mathbf{k}+\mathbf{1})} \text{ and } \mathbf{E}\_2 = \frac{[\mathbf{n}(\mathbf{k}+\mathbf{1}) + \mathbf{k}\mathbf{r} + \lambda](\mathbf{b}+\mathbf{r}+\mathbf{n})}{2\mathbf{b}(\mathbf{r}+\mathbf{n})(\mathbf{k}+\mathbf{1})}.$$

**Example – 6.2:** Construct a variance balanced design with parameters v1 = 7, b1 = 29, r1 = {19, 20}, k1 = {4, 6} from a balanced incomplete block design having parameters v = 6, b = 20, r = 10, k = 3, λ = 4.

The blocks of the balanced incomplete block design are:

(1 2 3), (1 2 4), (1 2 5), (1 2 6), (1 3 4), (1 3 5), (1 3 6), (1 4 5), (1 4 6), (1 5 6), (2 3 4), (2 3 5), (2 3 6), (2 4 5), (2 4 6), (2 5 6), (3 4 5), (3 4 6), (3 5 6), (4 5 6).

Let n = v(r − λ)/(k + 1) = 9. Hence, nine blocks are augmented. Using Theorem – 6.2, the incidence matrix of the variance balanced design is given as

$$\mathbf{N}\_1 = \begin{bmatrix} \mathbf{N} & \mathbf{E}\_{6 \times 9} \\ \mathbf{1}\_{1 \times 20} & \mathbf{0}\_{1 \times 9} \end{bmatrix}$$

where N is the 6 × 20 incidence matrix of the balanced incomplete block design listed above, E<sub>6×9</sub> is a matrix of ones corresponding to the nine augmented blocks, and the last row corresponds to the seventh (reinforced) treatment.

N1 gives the incidence matrix of an unequal replicated variance balanced design with unequal block sizes and parameters v1 = 7, b1 = 29, r1 = {19, 20}, k1 = {4, 6}; Cim = 10/4 and information matrix,

$$\mathbf{C} = \begin{bmatrix} 19 & 0 & \dots & 0 \\ 0 & 19 & \dots & 0 \\ \vdots & \vdots & \vdots & \vdots \\ 0 & 0 & \dots & 20 \end{bmatrix} - \begin{bmatrix} 48 & 30 & \dots & 30 \\ 30 & 48 & \dots & 30 \\ \vdots & \vdots & \dots & \vdots \\ 30 & 30 & \dots & 60 \end{bmatrix} / 12$$

$$\text{After simplification, C reduces to } \mathbf{C} = \begin{bmatrix} 180 & -30 & \dots & -30 \\ -30 & 180 & \dots & -30 \\ \vdots & \vdots & \ddots & \vdots \\ -30 & -30 & \dots & 180 \end{bmatrix} / 12$$

$$\text{Finally, } \mathbf{C} = \frac{210}{12} \left[ \mathbf{I}\_{7} - \frac{\mathbf{E}\_{77}}{7} \right] = \frac{35}{2} \left[ \mathbf{I}\_{7} - \frac{\mathbf{E}\_{77}}{7} \right] = \theta \left[ \mathbf{I}\_{7} - \frac{\mathbf{E}\_{77}}{7} \right] \tag{13}$$

where θ = 35/2 is the non-zero eigenvalue of the C matrix with multiplicity 6. Hence, it is a variance balanced design with t̂<sub>i</sub> = (1/θ) Q<sub>i</sub> = (2/35) Q<sub>i</sub>, Var(t̂<sub>i</sub> − t̂<sub>m</sub>)<sub>VB</sub> = (2/θ)σ<sup>2</sup> = (4/35)σ<sup>2</sup>, Var(t̂<sub>i</sub> − t̂<sub>m</sub>)<sub>RBD1</sub> = (2/(r + n))σ<sup>2</sup> = (2/19)σ<sup>2</sup> and Var(t̂<sub>i</sub> − t̂<sub>m</sub>)<sub>RBD2</sub> = (1/(r + n) + 1/b)σ<sup>2</sup> = (1/19 + 1/20)σ<sup>2</sup> = (39/380)σ<sup>2</sup>, with efficiency factors E1 = 35/38 and E2 = 273/304. Both efficiency factors are therefore high (about 0.92 and 0.90 respectively).
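The figures in this example can be checked numerically. The following is a minimal Python sketch rather than the author's own code: it assumes the layout implied by the stated parameters (the seventh treatment is added to each of the 20 blocks of the balanced incomplete block design, and n = 9 complete blocks of treatments 1 to 6 are appended), and the helper name `c_matrix` is hypothetical. It computes C = R − N K⁻¹ N′ with exact fractions.

```python
from fractions import Fraction
from itertools import combinations


def c_matrix(blocks, v):
    """Return C = R - N K^{-1} N' for a binary block design on treatments 1..v."""
    C = [[Fraction(0)] * v for _ in range(v)]
    for block in blocks:
        k = len(block)
        for i in block:
            C[i - 1][i - 1] += 1                    # replication count of treatment i
            for m in block:
                C[i - 1][m - 1] -= Fraction(1, k)   # concurrence weighted by 1/k_j
    return C


# BIBD with v = 6, b = 20, r = 10, k = 3, lambda = 4: all triples from {1, ..., 6}
v, r, k, lam = 6, 10, 3, 4
bibd = list(combinations(range(1, v + 1), k))
b = len(bibd)
n = v * (r - lam) // (k + 1)                        # number of augmented blocks

# Assumed layout: treatment 7 joins every BIBD block; n complete blocks are appended.
blocks = [blk + (v + 1,) for blk in bibd] + [tuple(range(1, v + 1))] * n

C = c_matrix(blocks, v + 1)
theta = C[0][0] - C[0][1]                           # diagonal minus off-diagonal entry
balanced = all(
    C[i][m] == (theta - theta / (v + 1) if i == m else -theta / (v + 1))
    for i in range(v + 1)
    for m in range(v + 1)
)
E1 = theta / (r + n)
E2 = theta * (b + r + n) / (2 * b * (r + n))
print(n, theta, balanced, E1, E2)                   # 9 35/2 True 35/38 273/304
```

Exact `Fraction` arithmetic is used so that the equality test for variance balance is not affected by floating-point rounding.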

Nonexistence of variance balanced design by reinforcing (v + t) treatments. A variance balanced design cannot be constructed from a balanced incomplete block design by reinforcing (v + t) treatments, t = 1, 2, …

This is because, for a variance balanced design, Cim must be the same for each pair of treatments. In this case, for t = 1, Cim = λ/(k + 1) for i ≠ m = 1 to v, and Ci(v + 1) = r/(k + 1). The condition of a variance balanced design requires r/(k + 1) = λ/(k + 1), which is not possible since (r − λ) > 0. If two treatments are reinforced, then r/(k + 1) = λ/(k + 1) = b/(k + 1) must hold true, but b ≠ r ≠ λ.

## **7. Variance balanced design using 2<sup>n</sup> symmetrical factorial experiments**

This section discusses the construction of variance balanced designs using a 2<sup>n</sup> symmetrical factorial experiment.

**Theorem: 7.1** Consider a 2<sup>n</sup> factorial experiment. By deleting the control treatment and all the main effects, an equi-replicated variance balanced design with unequal block sizes is obtained, with parameters v = n, b = 2<sup>n</sup> − n − 1, r = 2<sup>n−1</sup> − 1, k = {2, 3, 4, …, n}.

**Proof:** Consider the 2<sup>n</sup> treatment combinations of a 2<sup>n</sup> factorial experiment. Delete the control treatment and all main effects, and consider each remaining treatment combination as one block. This gives (2<sup>n</sup> − 1 − n) blocks with v (= n) treatments, since the factors are considered as treatments. Consider the resulting matrix as the incidence matrix of a block design; all its elements are either 0 or 1, so the design is binary.

Since each treatment is repeated (2<sup>n−1</sup> − 1) times, the design is equi-replicated with unequal block sizes k = {2, 3, …, n}. Let N denote the incidence matrix of this block design.

The block design is variance balanced if its C matrix satisfies C = θ[I<sub>v</sub> − (1/v)E<sub>vv</sub>], where θ is the non-zero eigenvalue of the C matrix.

$$\mathbf{C} = \begin{bmatrix} 2^{n-1} - \mathbf{1} & \mathbf{0} & \dots & \mathbf{0} \\ \mathbf{0} & 2^{n-1} - \mathbf{1} & \dots & \mathbf{0} \\ \mathbf{0} & \mathbf{0} & \dots & 2^{n-1} - \mathbf{1} \end{bmatrix} - \begin{bmatrix} \mathbf{Y} & \mathbf{X} & \dots & \mathbf{X} \\ \mathbf{X} & \mathbf{Y} & \dots & \mathbf{X} \\ \vdots & \vdots & \vdots & \vdots \\ \mathbf{X} & \mathbf{X} & \dots & \mathbf{Y} \end{bmatrix}$$

where

$$\mathbf{Y} = \frac{\binom{n-1}{1}}{2} + \frac{\binom{n-1}{2}}{3} + \frac{\binom{n-1}{3}}{4} + \dots + \frac{1}{n}, \qquad \mathbf{X} = \frac{1}{2} + \frac{\binom{n-2}{1}}{3} + \frac{\binom{n-2}{2}}{4} + \dots + \frac{1}{n},$$

and r = 2<sup>n−1</sup> − 1. After simplification, C reduces to

$$\mathbf{C} = \begin{bmatrix} 2^{n-1} - \mathbf{1} - \mathbf{Y} & -\mathbf{X} & \cdots & -\mathbf{X} \\\\ -\mathbf{X} & 2^{n-1} - \mathbf{1} - \mathbf{Y} & \cdots & -\mathbf{X} \\\\ \vdots & \vdots & \vdots & \vdots \\\\ -\mathbf{X} & -\mathbf{X} & \cdots & 2^{n-1} - \mathbf{1} - \mathbf{Y} \end{bmatrix}$$

Finally,

$$\mathbf{C} = \left(\mathbf{2^{n-1}}\mathbf{-1} - \mathbf{Y} + \mathbf{X}\right) \left[\mathbf{I}\_{\mathbf{v}} - \frac{\mathbf{E}\_{\mathbf{v}\mathbf{v}}}{\mathbf{v}}\right] \tag{14}$$

where θ = (2<sup>n−1</sup> − 1 − Y + X) is the non-zero eigenvalue of the C matrix with multiplicity (v − 1). Eq. (14) satisfies the condition of a variance balanced design. Hence, this is an equi-replicated variance balanced design with unequal block sizes and parameters v = n, b = 2<sup>n</sup> − n − 1, r = 2<sup>n−1</sup> − 1, k = {2, 3, 4, …, n}. Its efficiency factor is E = (2<sup>n−1</sup> − 1 − Y + X)/(2<sup>n−1</sup> − 1).

**Example: 7.1.** Construct a variance balanced design with parameters v = 4, b = 11, r = 7, k = {2, 3, 4}.

Using Theorem 7.1, incidence matrix of the variance balanced design is given by

$$\mathbf{N} = \begin{bmatrix} 0 & 0 & 0 & 0 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 0 & 1 & 1 & 1 & 0 & 0 & 0 & 1 & 1 & 1 & 1 \\ 1 & 0 & 1 & 1 & 0 & 1 & 1 & 0 & 0 & 1 & 1 \\ 1 & 1 & 0 & 1 & 1 & 0 & 1 & 0 & 1 & 0 & 1 \end{bmatrix}$$

and the information matrix is

$$\mathbf{C} = \begin{bmatrix} 7 & 0 & 0 & 0 \\ 0 & 7 & 0 & 0 \\ 0 & 0 & 7 & 0 \\ 0 & 0 & 0 & 7 \end{bmatrix} - \begin{bmatrix} 33 & 17 & 17 & 17 \\ 17 & 33 & 17 & 17 \\ 17 & 17 & 33 & 17 \\ 17 & 17 & 17 & 33 \end{bmatrix} / 12$$

After simplification, C reduces to

$$\mathbf{C} = \begin{bmatrix} 51 & -17 & -17 & -17 \\ -17 & 51 & -17 & -17 \\ -17 & -17 & 51 & -17 \\ -17 & -17 & -17 & 51 \end{bmatrix} / 12$$

Finally,

$$\mathbf{C} = \frac{68}{12} \left[ \mathbf{I}\_4 - \frac{\mathbf{E}\_{44}}{4} \right] = \frac{17}{3} \left[ \mathbf{I}\_4 - \frac{\mathbf{E}\_{44}}{4} \right] = \theta \left[ \mathbf{I}\_4 - \frac{\mathbf{E}\_{44}}{4} \right] \tag{15}$$

where θ = 17/3 is the non-zero eigenvalue of the C matrix with multiplicity 3. Hence, it is a variance balanced design with t̂<sub>i</sub> = (1/θ) Q<sub>i</sub> = (3/17) Q<sub>i</sub>, Var(t̂<sub>i</sub> − t̂<sub>m</sub>)<sub>VB</sub> = (2/θ)σ<sup>2</sup> = (6/17)σ<sup>2</sup>, Var(t̂<sub>i</sub> − t̂<sub>m</sub>)<sub>RBD</sub> = (2/r)σ<sup>2</sup> = (2/7)σ<sup>2</sup> and efficiency factor E = 17/21.
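The construction of Theorem 7.1 is easy to reproduce for any n. The sketch below is illustrative only (the function names are hypothetical): it takes every treatment combination with at least two factors present as a block, and computes r, Y, X, θ and E directly from the blocks. By symmetry it is enough to look at factor 1 and at the pair of factors (1, 2).

```python
from fractions import Fraction
from itertools import combinations


def factorial_blocks(n):
    """All treatment combinations of a 2^n factorial with at least two factors present."""
    factors = range(1, n + 1)
    return [blk for size in range(2, n + 1) for blk in combinations(factors, size)]


def theta_and_efficiency(n):
    """Return (b, r, Y, X, theta, E) for the design obtained from a 2^n factorial."""
    blocks = factorial_blocks(n)
    with_1 = [blk for blk in blocks if 1 in blk]
    r = len(with_1)                                            # replication of factor 1
    Y = sum(Fraction(1, len(blk)) for blk in with_1)           # diagonal of N K^{-1} N'
    X = sum(Fraction(1, len(blk)) for blk in with_1 if 2 in blk)  # off-diagonal entry
    theta = r - Y + X
    return len(blocks), r, Y, X, theta, theta / r


b, r, Y, X, theta, E = theta_and_efficiency(4)
print(b, r, Y, X, theta, E)   # 11 7 11/4 17/12 17/3 17/21
```

For n = 4 this gives Y = 33/12 and X = 17/12, in agreement with the matrices of Example 7.1.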

This result is due to Ghosh, Sinojia and Ghosh (2018).

#### **8. Construction of variance balanced designs using some incidence matrix**

**Theorem 8.1**: Let I<sub>n</sub> denote the identity matrix of order n, j<sub>n</sub> a column vector of ones, 0<sub>1×n/2</sub> a row vector having all elements zero and E<sub>n×n/2</sub> an n × n/2 matrix of ones. The matrix N defined as

$$\mathbf{N} = \begin{bmatrix} \mathbf{j}\_n' & \mathbf{0}\_{1 \times n/2} \\ \mathbf{I}\_n & \mathbf{E}\_{n \times n/2} \end{bmatrix}$$

gives the incidence matrix of a variance balanced design with parameters v = n + 1, b = n + n/2, r = {n, 1 + n/2} and k = {2, n}, where n is even.

**Proof:** The proof is obvious.

**Example 8.1.** Let n = 6. So, the incidence matrix N using Theorem 8.1 is given by

$$\mathbf{N} = \begin{bmatrix} 1 & 1 & 1 & 1 & 1 & 1 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 \\ 0 & 1 & 0 & 0 & 0 & 0 & 1 & 1 & 1 \\ 0 & 0 & 1 & 0 & 0 & 0 & 1 & 1 & 1 \\ 0 & 0 & 0 & 1 & 0 & 0 & 1 & 1 & 1 \\ 0 & 0 & 0 & 0 & 1 & 0 & 1 & 1 & 1 \\ 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 & 1 \end{bmatrix}$$

and the information matrix is

$$\mathbf{C} = \begin{bmatrix} 6 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 4 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 4 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 4 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 4 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 4 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 4 \end{bmatrix} - \begin{bmatrix} 6 & 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & 2 & 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 2 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 2 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 2 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 & 2 & 1 \\ 1 & 1 & 1 & 1 & 1 & 1 & 2 \end{bmatrix} / 2$$

After simplification, C reduces to

$$\mathbf{C} = \begin{bmatrix} 6 & -1 & -1 & -1 & -1 & -1 & -1 \\ -1 & 6 & -1 & -1 & -1 & -1 & -1 \\ -1 & -1 & 6 & -1 & -1 & -1 & -1 \\ -1 & -1 & -1 & 6 & -1 & -1 & -1 \\ -1 & -1 & -1 & -1 & 6 & -1 & -1 \\ -1 & -1 & -1 & -1 & -1 & 6 & -1 \\ -1 & -1 & -1 & -1 & -1 & -1 & 6 \end{bmatrix} / 2$$

Finally,

$$\mathbf{C} = \frac{7}{2} \left[ \mathbf{I}\_{7} - \frac{\mathbf{E}\_{77}}{7} \right] = \theta \left[ \mathbf{I}\_{7} - \frac{\mathbf{E}\_{77}}{7} \right] \tag{16}$$

where θ = 7/2 is the non-zero eigenvalue of the C matrix with multiplicity 6. Hence, it is a variance balanced design with parameters v = 7, b = 9, r = {6, 4}, k = {2, 6}, and t̂<sub>i</sub> = (1/θ) Q<sub>i</sub> = (2/7) Q<sub>i</sub>; Var(t̂<sub>i</sub> − t̂<sub>m</sub>)<sub>VB</sub> = (2/θ)σ<sup>2</sup> = (4/7)σ<sup>2</sup>, Var(t̂<sub>i</sub> − t̂<sub>m</sub>)<sub>RBD1</sub> = (2/r)σ<sup>2</sup> = (2/4)σ<sup>2</sup>, Var(t̂<sub>i</sub> − t̂<sub>m</sub>)<sub>RBD2</sub> = (1/6 + 1/4)σ<sup>2</sup> = (5/12)σ<sup>2</sup>, with efficiency factors E1 = 7/8 and E2 = 35/48.
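Since the proof of Theorem 8.1 is omitted, a numerical check may be useful. The following Python sketch is an illustration under the assumption that the blocks are the n pairs {1, i + 1}, i = 1 to n, together with n/2 copies of the block {2, …, n + 1}, which is the layout of the matrix N in Example 8.1; the function names are hypothetical.

```python
from fractions import Fraction


def theorem_8_1_blocks(n):
    """Blocks of the assumed layout: n pairs {1, i+1} and n/2 copies of {2, ..., n+1}."""
    if n % 2:
        raise ValueError("n must be even")
    pairs = [(1, i + 1) for i in range(1, n + 1)]
    full = [tuple(range(2, n + 2))] * (n // 2)
    return pairs + full


def c_matrix(blocks, v):
    """Return C = R - N K^{-1} N' for a binary block design on treatments 1..v."""
    C = [[Fraction(0)] * v for _ in range(v)]
    for block in blocks:
        k = len(block)
        for i in block:
            C[i - 1][i - 1] += 1                    # replication count of treatment i
            for m in block:
                C[i - 1][m - 1] -= Fraction(1, k)   # concurrence weighted by 1/k_j
    return C


n = 6
v = n + 1
C = c_matrix(theorem_8_1_blocks(n), v)
theta = C[0][0] - C[0][1]
balanced = all(
    C[i][m] == (theta - theta / v if i == m else -theta / v)
    for i in range(v)
    for m in range(v)
)
E1 = Fraction(2, 4) / (2 / theta)                        # Var_RBD1 / Var_VB
E2 = (Fraction(1, 6) + Fraction(1, 4)) / (2 / theta)     # Var_RBD2 / Var_VB
print(theta, balanced, E1, E2)                           # 7/2 True 7/8 35/48
```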

This result is due to Ghosh, Sinojia and Ghosh (2018).

#### **9. Conclusions**

In this chapter, we have constructed variance balanced designs using balanced incomplete block, group divisible, resolvable semi-regular group divisible, symmetrical factorial and fractional factorial designs. It is observed that the efficiency factor of almost all of these variance balanced designs is high. The variance balanced designs constructed in Sections 3 to 6 are new and extended methods, while Sections 7 and 8 review the work of Ghosh, Sinojia and Ghosh (2018).

## **Acknowledgements**

The author is very grateful to the University Grants Commission, New Delhi, for providing the opportunity to work as a UGC BSR Faculty Fellow. The author is also thankful to the referees for suggesting important ideas that improved the present chapter.

### **Author details**

D.K. Ghosh Department of Statistics, Saurashtra University, Rajkot, Gujarat, India

\*Address all correspondence to: ghosh\_dkg@yahoo.com

© 2022 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

### **References**

[1] Raghavarao D. Constructions and Combinatorial Problems in Design of Experiments. New York: Wiley; 1971

[2] Chakrabarti MC. On the C-matrix in design of experiments. Journal of the Indian Statistical Association. 1963;**1**:8-23

[3] Pearce SC. Experimenting with blocks of natural size. Biometrics. 1964;**20**(4):699-706. DOI: 10.2307/2528123

[4] Kageyama S. Reduction of associate classes for block designs and related combinatorial arrangements. Hiroshima Mathematical Journal. 1974;**4**:527-618

[5] Rao VR. A note on balanced designs. The Annals of Mathematical Statistics. 1958;**29**(1):290-294. DOI: 10.1214/aoms/1177706729

[6] Hedayat A, Federer WT. Pairwise and variance balanced incomplete block design. Annals of the Institute of Statistical Mathematics. 1974;**26**(1):331-338. DOI: 10.1007/BF02479828

[7] Raghavarao D. On balanced unequal block designs. Biometrika. 1962;**49**(3/4):561-562. DOI: 10.1093/biomet/49.3-4.561

[8] Puri PD, Nigam AK. Balanced block designs. Communications in Statistics – Theory and Methods. 1977;**6**(12):1171-1179. DOI: 10.1080/03610927708827560

[9] John PWM. Balanced designs with unequal number of replications. Annals of Mathematical Statistics. 1964;**35**:897-899. DOI: 10.1214/aoms/1177703597

[10] Jones B, Sinha K, Kageyama S. Further equireplicated variance balanced designs with unequal block sizes. Utilitas Mathematica. 1987;**32**:5-10

[11] Kageyama S. Construction of balanced block designs. Utilitas Mathematica. 1976;**9**:209-229

[12] Kageyama S. Existence of variance-balanced binary designs with fewer experimental units. Statistics & Probability Letters. 1988;**7**(1):27-28. DOI: 10.1016/0167-7152(88)90083-1

[13] Kulshreshtha AC, Dey A, Saha GM. Balanced designs with unequal replications and unequal block sizes. Annals of Mathematical Statistics. 1972;**43**:1342-1345

[14] Pal S, Pal S. Nonproper variance balanced designs and optimality. Communications in Statistics - Theory and Methods. 1988;**17**:1685-1695. DOI: 10.1080/03610928808829706

[15] Roy J. On the efficiency factor of block designs. Sankhya. 1958;**19**:181-188

[16] Sinha K. A note on equireplicated balanced block designs from BIB designs. Journal of the Indian Society of Agricultural Statistics. 1988;**42**:150-153

[17] Sinha K. Some new equireplicated balanced block designs. Statistics & Probability Letters. 1989;**8**:89. DOI: 10.1016/0167-7152(89)90089-8

[18] Sinha K, Jones B. Further equireplicated balanced block designs with unequal block sizes. Statistics & Probability Letters. 1988;**6**:229-330

[19] Tyagi BN. On a class of variance balanced block designs. Journal of Statistical Planning and Inference. 1979;**3**:333-336. DOI: 10.1016/0378-3758(79)90029-6

[20] Khatri CG. A note on variance balanced designs. Journal of Statistical Planning and Inference. 1982;**6**(2):173-177. DOI: 10.1016/0378-3758(82)90086-6

[21] Das MN, Ghosh DK. Balancing incomplete block designs. Sankhyā: The Indian Journal of Statistics, Series B. 1985;**47**(1):67-77

[22] Mukerjee R, Kageyama S. On resolvable and affine resolvable variance-balanced designs. Biometrika. 1985;**72**(1):165-172. DOI: 10.1093/biomet/72.1.165

[23] Calvin JA. A new class of variance balanced designs. Journal of Statistical Planning and Inference. 1986;**14**(2–3):251-254. DOI: 10.1016/0378-3758(86)90162-X

[24] Calvin JA, Sinha K. A method of constructing variance balanced designs. Journal of Statistical Planning and Inference. 1989;**23**(1):127-131. DOI: 10.1016/0378-3758(89)90045-1

[25] Agarwal GG, Kumar S. On a class of variance balanced designs associated with GD designs. Calcutta Statistics Association Bulletin. 1984;**33**(3–4):187-190

[26] Ghosh DK. A series of generalized efficiency balanced designs. Gujarat Statistical Review. 1988;**15**(1):33-38

[27] Ghosh DK, Karmokar PK. Some series of efficiency balanced designs. Australian & New Zealand Journal of Statistics. 1988;**30**(1):47-51

[28] Ghosh DK, Divecha J. Construction of variance balanced designs. TEC.R. No.2, Dept. of Maths. and Stat., Saurashtra Uni. 1989:1-13

[29] Ghosh DK, Divecha J, Kageyama S. Equi-replicate variance balanced designs from group divisible designs. Journal of the Japan Statistical Society. 1991;**21**(1):205-209. DOI: 10.1111/j.1467-842X.1988.tb00610.x

[30] Ghosh DK, Joshi K. Note on construction of variance balanced designs through group divisible designs. Utilitas Mathematica. 1991;**39**:249-253

[31] Ghosh DK, Karmokar PK, Divecha J. On Comparison of ternary efficiency balanced and ternary variance balanced designs. Utilitas Mathematica. 1991;**40**:25-27

[32] Ghosh DK, Anita S, Kageyama S. Construction of variance-balanced and efficiency balanced ternary block designs. Journal of Japan Statistical Association Soc. 1994;**24**(2):201-208

[33] Ghosh DK, Joshi K. Construction of variance balanced designs through triangular PBIB designs. Calcutta Statistical Association Bulletin. 1995;**45**(1–2):111-118

[34] Ghosh DK, Sangeeta A. On variance balanced designs. Journal of Modern Applied Statistical Methods. 2017;**16**(2):124-137

[35] Agarwal GG, Kumar S. A note on construction of variance balanced designs. Journal of the Indian Society of Agricultural Statistics. 1985;**37**(2):181-183

[36] Agarwal GG, Kumar S. Construction of balanced ternary designs. Australian and New Zealand Journal of Statistics. 1986;**28**(2):251-255. DOI: 10.1111/j.1467-842X.1986.tb00605.x

[37] Ghosh DK, Joshi K, Kageyama S. Ternary variance balanced designs through BIB and GD designs. Journal of the Japan Statistical Society. 1993;**23**(1):75-81

[38] Ghosh DK. Robustness of Variance balanced designs against the loss of one block. Indian Journal of Statistics and Applications. 2012;**1**:17-30

[39] Ghosh DK, Sinojia CN, Ghosh S. On a class of variance balanced designs. International Journal of Statistics and Probability. 2018;**7**(3):112-120

[40] Hedayat A, Stufken JA. Relation between pair wise balanced and variance balanced block designs. Journal of the American Statistical Association. 1989;**84**(407):753-755. DOI: 10.1080/01621459.1989.10478830

[41] Jones RM. On a property of incomplete blocks. Journal of the Royal Statistical Society: Series B. 1959;**21**:172-179. Available from: http://www.jstor.org/stable/2983939

[42] Gupta SC, Jones B. Equi-replicate balanced block designs with unequal block sizes. Biometrika. 1983;**70**(2):443-440. DOI: 10.1093/biomet/70.2.433

#### **Chapter 9**

## Estimation of Means of Two Quantitative Sensitive Variables Using Randomized Response Technique

*Amod Kumar*

## **Abstract**

I propose an improved randomized response model for the simultaneous estimation of the population means of two quantitative sensitive variables using a blank card option, which makes use of one scrambled response and one fake response. The properties of the proposed estimator are analysed. To judge the performance of the proposed model, I consider a real data set and show that the proposed model is more efficient in terms of relative efficiency as well as privacy protection of respondents. Suitable recommendations are made to survey practitioners.

**Keywords:** randomized response technique, two quantitative sensitive variables, estimation of two means, blank card, privacy protection

#### **1. Introduction**

The reliability of data is compromised when information on sensitive topics involving embarrassing or illegal acts, such as students taking drugs, drunk driving, abortion, family income or tax evasion, is sought through direct questioning in a sample survey. Surveys of human populations have established that direct questions about sensitive characteristics often result in either refusal to respond or falsification of the answer. To overcome this difficulty and ensure the confidentiality of respondents, Warner [1] initiated a technique called the randomized response technique (RRT). For estimating π, the population proportion of respondents belonging to the sensitive group, a simple random sample of n respondents is selected with replacement from a population of size N. Each respondent selected in the sample is given a random device which presents two statements: "I belong to sensitive group A" and its complement, "I belong to group A<sup>c</sup>". The respondent answers one of the two statements depending on the outcome of the random device, which is unobservable to the interviewer. Greenberg et al. [2] improved the Warner [1] model with respect to efficiency and respondents' cooperation by suggesting the unrelated question randomized response model, in which the sensitive question is combined with an unrelated (non-sensitive) question.

Greenberg et al. [3] extended the Greenberg et al. [2] model to estimate the population mean of a quantitative sensitive variable, such as income or tax dodging. In their model, each respondent selected in the sample with replacement is given a random device which presents two outcomes Y and X with probabilities P and (1 − P) respectively, where Y is the true quantitative sensitive variable and X is a non-sensitive independent variable. Later, Eichhorn and Hayre [4] introduced a multiplicative randomized response model for estimating the population mean of a quantitative sensitive variable.

Under a simple random sampling with replacement (SRSWR) scheme, Perri [5] modified the Greenberg et al. [3] technique to obtain an estimator of the population mean μ<sub>Y</sub> by using a blank card option: if a blank card is selected, the respondent is requested to use the Greenberg et al. [3] model. In his model, the observed response θ<sub>P</sub> is given by:

$$\theta\_{\rm P} = \begin{cases} \mathbf{Y} & \text{with probability } \mathbf{P}\_1 \\ \mathbf{X} & \text{with probability } \mathbf{P}\_2 \\ \text{Blank card} & \text{with probability } \mathbf{P}\_3 \end{cases} \tag{1}$$

Perri [5] proposed an unbiased estimator of the population mean μ<sub>Y</sub>:

$$
\hat{\mu}\_{\rm P} = \frac{\overline{\theta}\_{\rm P} - \{\mathbf{P}\_2 + \mathbf{P}\_3(1 - \mathbf{P})\} \mu\_{\rm X}}{(\mathbf{P}\_1 + \mathbf{P}\_3 \mathbf{P})} \tag{2}
$$

with variance

$$\mathbf{V}(\hat{\mu}\_{\rm p}) = \frac{\sigma\_{\theta p}^{2}}{\mathbf{n}(\mathbf{P}\_{\rm 1} + \mathbf{P}\_{\rm 3}\mathbf{P})^{2}} \tag{3}$$

where θ̄<sub>P</sub> = (1/n) Σ<sub>i=1</sub><sup>n</sup> θ<sub>Pi</sub> and

$$\begin{split} \sigma\_{\theta p}^{2} &= (\mathbf{P}\_{1} + \mathbf{P}\_{3}\mathbf{P}) \left( \sigma\_{\rm Y}^{2} + \mu\_{\rm Y}^{2} \right) + \{\mathbf{P}\_{2} + \mathbf{P}\_{3}(1 - \mathbf{P})\} \left( \sigma\_{\rm X}^{2} + \mu\_{\rm X}^{2} \right) \\ &\quad - \left[ (\mathbf{P}\_{1} + \mathbf{P}\_{3}\mathbf{P})\mu\_{\rm Y} + \{\mathbf{P}\_{2} + \mathbf{P}\_{3}(1 - \mathbf{P})\} \mu\_{\rm X} \right]^{2} \end{split}$$
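To make Eqs. (2) and (3) concrete, the following Python sketch evaluates Perri's estimator and its variance for given design probabilities. It is an illustration only: the function and argument names are hypothetical, and the numerical values are made up for the check, not taken from any data set.

```python
from fractions import Fraction


def perri_estimate(theta_bar, p1, p2, p3, p, mu_x):
    """Estimator of mu_Y in Eq. (2), given the sample mean of the observed responses."""
    w = p1 + p3 * p            # overall probability of reporting Y
    q = p2 + p3 * (1 - p)      # overall probability of reporting X
    return (theta_bar - q * mu_x) / w


def perri_variance(n, p1, p2, p3, p, mu_y, var_y, mu_x, var_x):
    """Variance of the estimator, Eq. (3)."""
    w = p1 + p3 * p
    q = p2 + p3 * (1 - p)
    var_theta = (w * (var_y + mu_y ** 2) + q * (var_x + mu_x ** 2)
                 - (w * mu_y + q * mu_x) ** 2)
    return var_theta / (n * w ** 2)


# Illustrative values (assumptions for this sketch only)
p1, p2, p3, p = Fraction(1, 2), Fraction(3, 10), Fraction(1, 5), Fraction(7, 10)
mu_y, mu_x = 40, 10
expected_response = (p1 + p3 * p) * mu_y + (p2 + p3 * (1 - p)) * mu_x
print(perri_estimate(expected_response, p1, p2, p3, p, mu_x))   # 40
```

Feeding the expected response back into the estimator returns μ<sub>Y</sub>, which is the unbiasedness property stated above.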

Many different suggestions have been made for the use of these blank cards by various authors, including Bhargava and Singh [6], Singh et al. [7], Batool et al. [8], Singh [9] and Singh et al. [10, 11], among others. In addition, the theory of the randomized response technique for estimating the population parameters of sensitive characteristics was extended by Narjis and Shabbir [12, 13].

Recently, Ahmed et al. [14] introduced the idea of estimating the means of two quantitative sensitive variables simultaneously by using one scramble response and one fake response. Let Y<sub>1i</sub> and Y<sub>2i</sub> be the values of the two quantitative sensitive variables associated with the ith unit of a population of size N, with means (μ<sub>Y1</sub>, μ<sub>Y2</sub>) and variances (σ<sup>2</sup><sub>Y1</sub>, σ<sup>2</sup><sub>Y2</sub>) respectively. The parameters of interest, which are to be estimated, are (μ<sub>Y1</sub>, μ<sub>Y2</sub>). Each respondent selected in the sample with replacement is asked to produce two fake values of the scramble variables S<sub>1</sub> and S<sub>2</sub> from two known distributions. Let S<sub>1</sub> and S<sub>2</sub> be independent scramble variables with known means (θ<sub>1</sub>, θ<sub>2</sub>) and variances (γ<sub>20</sub>, γ<sub>02</sub>) respectively, which help to maintain the protection of respondents. Ahmed et al. [14] defined the scramble response as:

$$\mathbf{Z\_{1i}} = \mathbf{S\_{1}Y\_{1i}} + \mathbf{S\_{2}Y\_{2i}} \tag{4}$$

Each respondent selected in the sample is also requested to draw a card from a deck consisting of two types of cards, similar to the Warner [1] model but with different outcomes. Let P be the probability of cards bearing the statement "the selected respondent is to report the scramble response as S<sub>1</sub>", and (1 − P) the probability of cards bearing the statement "the selected respondent is to report the scramble response as S<sub>2</sub>". Thus, the second response from the ith respondent is given as:

$$\mathbf{Z\_i} = \begin{cases} \mathbf{S\_1} & \text{with probability P} \\ \mathbf{S\_2} & \text{with probability (1-P)} \end{cases} \tag{5}$$

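where P ≠ θ<sub>1</sub>γ<sub>02</sub>/(θ<sub>1</sub>γ<sub>02</sub> + θ<sub>2</sub>γ<sub>20</sub>), so that the common denominator of the estimators in Eqs. (6) and (7) does not vanish.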

Ahmed et al. [14] proposed unbiased estimators of the population means μ<sub>Y1</sub> and μ<sub>Y2</sub>, given respectively as:

$$\hat{\mu}\_{\rm{AY1}} = \frac{\left[\mathbf{P}\theta\_1\theta\_2 + (1 - \mathbf{P})\left(\gamma\_{02} + \theta\_2^2\right)\right]\overline{\mathbf{Z}}\_1 - \theta\_2\overline{\mathbf{Z}}\_2}{(1 - \mathbf{P})\theta\_1\gamma\_{02} - \mathbf{P}\theta\_2\gamma\_{20}}\tag{6}$$

and

$$\hat{\mu}\_{\rm{AY2}} = \frac{\theta\_{1}\overline{\mathbf{Z}}\_{2} - \left[\mathbf{P}\left(\gamma\_{20} + \theta\_{1}^{2}\right) + (1 - \mathbf{P})\theta\_{1}\theta\_{2}\right]\overline{\mathbf{Z}}\_{1}}{(1 - \mathbf{P})\theta\_{1}\gamma\_{02} - \mathbf{P}\theta\_{2}\gamma\_{20}}\tag{7}$$

where Z̄<sub>1</sub> = (1/n) Σ<sub>i=1</sub><sup>n</sup> Z<sub>1i</sub> and Z̄<sub>2</sub> = (1/n) Σ<sub>i=1</sub><sup>n</sup> Z<sub>2i</sub>, with Z<sub>2i</sub> = Z<sub>1i</sub>Z<sub>i</sub> the product of the two responses.
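As a numerical illustration of Eqs. (6) and (7), the sketch below computes both estimators from the two sample means. It is a hedged example, not code from Ahmed et al. [14]: the function name is hypothetical, and `z2_bar` is assumed to be the sample mean of the product of the scrambled response Z<sub>1i</sub> and the reported scramble value Z<sub>i</sub>, which is the interpretation under which the two estimators are unbiased.

```python
from fractions import Fraction


def ahmed_estimates(z1_bar, z2_bar, p, theta1, theta2, gamma20, gamma02):
    """Return (mu_Y1_hat, mu_Y2_hat) following Eqs. (6) and (7)."""
    a = p * (gamma20 + theta1 ** 2) + (1 - p) * theta1 * theta2   # weight of mu_Y1 in E(Z2)
    b = p * theta1 * theta2 + (1 - p) * (gamma02 + theta2 ** 2)   # weight of mu_Y2 in E(Z2)
    d = (1 - p) * theta1 * gamma02 - p * theta2 * gamma20         # common denominator
    if d == 0:
        raise ValueError("P makes the denominator zero; choose another P")
    mu1 = (b * z1_bar - theta2 * z2_bar) / d
    mu2 = (theta1 * z2_bar - a * z1_bar) / d
    return mu1, mu2


# Illustrative values (assumptions for this sketch only)
p = Fraction(3, 10)
theta1, theta2 = 2, 3
gamma20, gamma02 = Fraction(1, 2), Fraction(1, 4)
mu1_true, mu2_true = 50, 30

# Expected values of the two responses under the model
z1_exp = theta1 * mu1_true + theta2 * mu2_true
z2_exp = ((p * (gamma20 + theta1 ** 2) + (1 - p) * theta1 * theta2) * mu1_true
          + (p * theta1 * theta2 + (1 - p) * (gamma02 + theta2 ** 2)) * mu2_true)

estimates = ahmed_estimates(z1_exp, z2_exp, p, theta1, theta2, gamma20, gamma02)
print(estimates)
```

Plugging the expected responses into the estimators recovers the true means, which is a quick consistency check on the signs in Eqs. (6) and (7).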

Following the above work and motivated by Ahmed et al. [14], I adopt the Perri [5] method and propose a new improved randomized response model that introduces a blank card option for estimating the population means of two quantitative sensitive variables. For example, Y<sub>1</sub> may stand for the respondents' income and Y<sub>2</sub> for the respondents' expenditure, or Y<sub>1</sub> may denote imports in millions and Y<sub>2</sub> exports in millions. I demonstrate the improved performance of the proposed randomized response model over the Ahmed et al. [14] model, along with its privacy protection of respondents.

#### **2. Proposed model**

In the proposed model, I make the same assumptions as in the Ahmed et al. [14] procedure, with the modification that the second response, based on the Warner [1] method, is replaced with the Perri [5] blank card method. Proceeding along the lines of Ahmed et al. [14], the first observed response is given by:

$$\mathbf{Z\_{1i}} = \mathbf{S\_1Y\_{1i}} + \mathbf{S\_2Y\_{2i}} \tag{8}$$

Note that mixing the two quantitative sensitive variables with two scramble variables makes respondents more comfortable about providing information, because it becomes very hard for an interviewer to guess the true values of the two quantitative sensitive variables.

Here I differ from the existing randomized response models available in the literature in that the second response is replaced with the Perri [5] procedure, but with different outcomes. Each respondent selected in the sample is provided with a random device consisting of three types of cards: (i) green cards bearing the statement "report scramble variable S<sub>1</sub>", (ii) red cards bearing the statement "report scramble variable S<sub>2</sub>" and (iii) yellow cards bearing no statement (blank cards), with probabilities P<sub>1</sub>, P<sub>2</sub> and P<sub>3</sub> respectively such that Σ<sub>i=1</sub><sup>3</sup> P<sub>i</sub> = 1. Thus, the second response Z<sub>Ai</sub> in the proposed model from the ith respondent is given by:

$$\mathbf{Z\_{Ai}} = \begin{cases} \mathbf{S\_1} & \text{with probability } \mathbf{P\_1} \\ \mathbf{S\_2} & \text{with probability } \mathbf{P\_2} \\ \text{Blank } \text{card} & \text{with probability } \mathbf{P\_3} \end{cases} \tag{9}$$

If a blank card is selected, the respondents are requested to use Ahmed et al. [14] second response. Thus, the second response ZAi can be rewritten as:

$$\mathbf{Z}\_{\rm Ai} = \begin{cases} \mathbf{S}\_1 & \text{with probability } \mathbf{P}\_1 \\ \mathbf{S}\_2 & \text{with probability } \mathbf{P}\_2 \\ \begin{Bmatrix} \mathbf{S}\_1 & \text{with probability } \mathbf{P} \\ \mathbf{S}\_2 & \text{with probability } (1 \text{-P}) \end{Bmatrix} & \text{with probability } \mathbf{P}\_3 \end{cases} \tag{10}$$

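where P ≠ [(P<sub>2</sub> + P<sub>3</sub>)θ<sub>1</sub>γ<sub>02</sub> − P<sub>1</sub>θ<sub>2</sub>γ<sub>20</sub>] / [P<sub>3</sub>(θ<sub>1</sub>γ<sub>02</sub> + θ<sub>2</sub>γ<sub>20</sub>)], which keeps the determinant of the estimating equations in Eq. (16) non-zero.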

Taking expectation on both sides of (Eq. (8)), I have

$$\mathbf{E}(\mathbf{Z}\_{1i}) = \mathbf{E}(\mathbf{S}\_1 \mathbf{Y}\_{1i} + \mathbf{S}\_2 \mathbf{Y}\_{2i}) = \theta\_1 \mu\_{\mathbf{Y}1} + \theta\_2 \mu\_{\mathbf{Y}2} \tag{11}$$

With the help of Eqs. (8) and (10), I generate a new response Z′<sub>2i</sub> as:

$$\begin{aligned} \mathbf{Z}\_{2i}' &= \mathbf{Z}\_{1i} \mathbf{Z}\_{\rm Ai} \\ &= \begin{cases} \mathbf{S}\_{1}^{2} \mathbf{Y}\_{1i} + \mathbf{S}\_{1} \mathbf{S}\_{2} \mathbf{Y}\_{2i} & \text{with probability } \mathbf{P}\_{1} \\ \mathbf{S}\_{1} \mathbf{S}\_{2} \mathbf{Y}\_{1i} + \mathbf{S}\_{2}^{2} \mathbf{Y}\_{2i} & \text{with probability } \mathbf{P}\_{2} \\ \begin{Bmatrix} \mathbf{S}\_{1}^{2} \mathbf{Y}\_{1i} + \mathbf{S}\_{1} \mathbf{S}\_{2} \mathbf{Y}\_{2i} & \text{with probability } \mathbf{P} \\ \mathbf{S}\_{1} \mathbf{S}\_{2} \mathbf{Y}\_{1i} + \mathbf{S}\_{2}^{2} \mathbf{Y}\_{2i} & \text{with probability } (1 - \mathbf{P}) \end{Bmatrix} & \text{with probability } \mathbf{P}\_{3} \end{cases} \end{aligned} \tag{12}$$

Taking expectation on both sides of (Eq. (12)), I get

$$\begin{aligned} E\left(Z'\_{2i}\right) = {} & (P\_1 + P\_3 P)\left[\left(\gamma\_{20} + \theta\_1^2\right)\mu\_{Y1} + \theta\_1\theta\_2\mu\_{Y2}\right] \\ & + \{P\_2 + P\_3(1 - P)\}\left[\theta\_1\theta\_2\mu\_{Y1} + \left(\gamma\_{02} + \theta\_2^2\right)\mu\_{Y2}\right] \end{aligned} \tag{13}$$

Using the method of moments on Eqs. (11) and (13), I have:

$$
\theta\_1 \hat{\mu}\_{Y1} + \theta\_2 \hat{\mu}\_{Y2} = \frac{1}{n} \sum\_{i=1}^{n} Z\_{1i} \tag{14}
$$

and

$$\begin{aligned} & \left[(P\_1 + P\_3 P)\left(\gamma\_{20} + \theta\_1^2\right) + \{P\_2 + P\_3(1 - P)\}\theta\_1\theta\_2\right]\hat{\mu}\_{Y1} \\ & \quad + \left[(P\_1 + P\_3 P)\theta\_1\theta\_2 + \{P\_2 + P\_3(1 - P)\}\left(\gamma\_{02} + \theta\_2^2\right)\right]\hat{\mu}\_{Y2} = \frac{1}{n} \sum\_{i=1}^{n} Z'\_{2i} \end{aligned} \tag{15}$$

Eqs. (14) and (15) can be rewritten as:

$$
\begin{bmatrix}
\theta\_1 & \theta\_2 \\
(P\_1 + P\_3 P)\left(\gamma\_{20} + \theta\_1^2\right) + \{P\_2 + P\_3(1 - P)\}\theta\_1\theta\_2 & (P\_1 + P\_3 P)\theta\_1\theta\_2 + \{P\_2 + P\_3(1 - P)\}\left(\gamma\_{02} + \theta\_2^2\right)
\end{bmatrix}
\begin{bmatrix}
\hat{\mu}\_{Y1} \\
\hat{\mu}\_{Y2}
\end{bmatrix} = \begin{bmatrix}
\overline{Z}\_1 \\
\overline{Z}'\_2
\end{bmatrix} \tag{16}
$$

Applying Cramer's rule on Eq. (16), I obtain

$$\begin{aligned} \Delta &= \begin{vmatrix} \theta\_1 & \theta\_2 \\ (P\_1 + P\_3 P)\left(\gamma\_{20} + \theta\_1^2\right) + \{P\_2 + P\_3(1 - P)\}\theta\_1\theta\_2 & (P\_1 + P\_3 P)\theta\_1\theta\_2 + \{P\_2 + P\_3(1 - P)\}\left(\gamma\_{02} + \theta\_2^2\right) \end{vmatrix} \\ &= \theta\_1\left[(P\_1 + P\_3 P)\theta\_1\theta\_2 + \{P\_2 + P\_3(1 - P)\}\left(\gamma\_{02} + \theta\_2^2\right)\right] - \theta\_2\left[(P\_1 + P\_3 P)\left(\gamma\_{20} + \theta\_1^2\right) + \{P\_2 + P\_3(1 - P)\}\theta\_1\theta\_2\right] \\ &= \{P\_2 + P\_3(1 - P)\}\theta\_1\gamma\_{02} - (P\_1 + P\_3 P)\theta\_2\gamma\_{20} \end{aligned} \tag{17}$$

$$\begin{aligned} \Delta\_1 &= \begin{vmatrix} \overline{Z}\_1 & \theta\_2 \\ \overline{Z}'\_2 & (P\_1 + P\_3 P)\theta\_1\theta\_2 + \{P\_2 + P\_3(1 - P)\}\left(\gamma\_{02} + \theta\_2^2\right) \end{vmatrix} \\ &= \left[(P\_1 + P\_3 P)\theta\_1\theta\_2 + \{P\_2 + P\_3(1 - P)\}\left(\gamma\_{02} + \theta\_2^2\right)\right]\overline{Z}\_1 - \theta\_2\overline{Z}'\_2 \end{aligned} \tag{18}$$

and

$$\begin{aligned} \Delta\_2 &= \begin{vmatrix} \theta\_1 & \overline{Z}\_1 \\ (P\_1 + P\_3 P)\left(\gamma\_{20} + \theta\_1^2\right) + \{P\_2 + P\_3(1 - P)\}\theta\_1\theta\_2 & \overline{Z}'\_2 \end{vmatrix} \\ &= \theta\_1\overline{Z}'\_2 - \left[(P\_1 + P\_3 P)\left(\gamma\_{20} + \theta\_1^2\right) + \{P\_2 + P\_3(1 - P)\}\theta\_1\theta\_2\right]\overline{Z}\_1 \end{aligned} \tag{19}$$

Thus, the estimators of the population means $\mu\_{Y1}$ and $\mu\_{Y2}$ are respectively given by:

$$\hat{\mu}\_{Y1} = \frac{\Delta\_1}{\Delta} = \frac{\left[(P\_1 + P\_3 P)\theta\_1\theta\_2 + \{P\_2 + P\_3(1 - P)\}\left(\gamma\_{02} + \theta\_2^2\right)\right]\overline{Z}\_1 - \theta\_2\overline{Z}'\_2}{\{P\_2 + P\_3(1 - P)\}\theta\_1\gamma\_{02} - (P\_1 + P\_3 P)\theta\_2\gamma\_{20}} \tag{20}$$

and

$$\hat{\mu}\_{Y2} = \frac{\Delta\_2}{\Delta} = \frac{\theta\_1\overline{Z}'\_2 - \left[(P\_1 + P\_3 P)\left(\gamma\_{20} + \theta\_1^2\right) + \{P\_2 + P\_3(1 - P)\}\theta\_1\theta\_2\right]\overline{Z}\_1}{\{P\_2 + P\_3(1 - P)\}\theta\_1\gamma\_{02} - (P\_1 + P\_3 P)\theta\_2\gamma\_{20}} \tag{21}$$
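The two estimators in Eqs. (20) and (21) are simply the Cramer's-rule solution of the 2 × 2 system in Eq. (16). A minimal sketch follows, assuming the sample means of the two responses are already available; the function name `estimate_means` and its argument names are illustrative and not part of the original work.

```python
def estimate_means(z1_bar, z2_bar, theta1, theta2, g20, g02, p1, p2, p3, p):
    """Solve Eq. (16) for (mu_Y1_hat, mu_Y2_hat) by Cramer's rule.

    z1_bar, z2_bar : sample means of the responses Z1 and Z2'
    theta1, theta2 : means of the scrambling variables S1 and S2
    g20, g02       : variances of S1 and S2 (gamma_20 and gamma_02)
    """
    a = p1 + p3 * p          # weight of the branch that multiplies by S1
    b = p2 + p3 * (1 - p)    # weight of the branch that multiplies by S2

    # Second row of the coefficient matrix in Eq. (16)
    coef_mu1 = a * (g20 + theta1 ** 2) + b * theta1 * theta2
    coef_mu2 = a * theta1 * theta2 + b * (g02 + theta2 ** 2)

    delta = b * theta1 * g02 - a * theta2 * g20   # determinant, Eq. (17)
    if abs(delta) < 1e-12:
        raise ValueError("determinant is zero; choose a different P")

    mu1_hat = (coef_mu2 * z1_bar - theta2 * z2_bar) / delta   # Eq. (20)
    mu2_hat = (theta1 * z2_bar - coef_mu1 * z1_bar) / delta   # Eq. (21)
    return mu1_hat, mu2_hat
```

Feeding the function the exact expectations from Eqs. (11) and (13) returns the true means, which is a quick numerical check of the unbiasedness results stated next.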

I have the following theorems.

**Theorem 1:** $\hat{\mu}\_{Y1}$ is an unbiased estimator of the population mean $\mu\_{Y1}$.

$$E(\hat{\mu}\_{Y1}) = \mu\_{Y1} \tag{22}$$


**Proof:** Taking expectation on both sides of Eq. (20), I have

$$\begin{aligned} E(\hat{\mu}\_{Y1}) &= \frac{\left[(P\_1 + P\_3 P)\theta\_1\theta\_2 + \{P\_2 + P\_3(1 - P)\}\left(\gamma\_{02} + \theta\_2^2\right)\right]E\left(\overline{Z}\_1\right) - \theta\_2 E\left(\overline{Z}'\_2\right)}{\{P\_2 + P\_3(1 - P)\}\theta\_1\gamma\_{02} - (P\_1 + P\_3 P)\theta\_2\gamma\_{20}} \\ &= \frac{\left[(P\_1 + P\_3 P)\theta\_1\theta\_2 + \{P\_2 + P\_3(1 - P)\}\left(\gamma\_{02} + \theta\_2^2\right)\right]\frac{1}{n}\sum\_{i=1}^{n}E(Z\_{1i}) - \theta\_2\frac{1}{n}\sum\_{i=1}^{n}E\left(Z'\_{2i}\right)}{\{P\_2 + P\_3(1 - P)\}\theta\_1\gamma\_{02} - (P\_1 + P\_3 P)\theta\_2\gamma\_{20}} \\ &= \frac{\left[(P\_1 + P\_3 P)\theta\_1\theta\_2 + \{P\_2 + P\_3(1 - P)\}\left(\gamma\_{02} + \theta\_2^2\right)\right]\sum\_{i=1}^{n}\left(\theta\_1\mu\_{Y1} + \theta\_2\mu\_{Y2}\right) - \theta\_2\sum\_{i=1}^{n}\left[(P\_1 + P\_3 P)\left\{\left(\gamma\_{20} + \theta\_1^2\right)\mu\_{Y1} + \theta\_1\theta\_2\mu\_{Y2}\right\} + \{P\_2 + P\_3(1 - P)\}\left\{\theta\_1\theta\_2\mu\_{Y1} + \left(\gamma\_{02} + \theta\_2^2\right)\mu\_{Y2}\right\}\right]}{n\left[\{P\_2 + P\_3(1 - P)\}\theta\_1\gamma\_{02} - (P\_1 + P\_3 P)\theta\_2\gamma\_{20}\right]} \end{aligned}$$

After simplification, I get

$$E(\hat{\mu}\_{Y1}) = \frac{\left[\{P\_2 + P\_3(1 - P)\}\theta\_1\gamma\_{02} - (P\_1 + P\_3 P)\theta\_2\gamma\_{20}\right]\mu\_{Y1}}{\{P\_2 + P\_3(1 - P)\}\theta\_1\gamma\_{02} - (P\_1 + P\_3 P)\theta\_2\gamma\_{20}} = \mu\_{Y1},$$

which completes the proof.
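The simplification step can be made explicit. Purely as shorthand for this sketch (the symbols $A$ and $B$ are introduced here and are not part of the original notation), let $A$ and $B$ denote the first and second entries of the second row of the coefficient matrix in Eq. (16), so that $E(Z'\_{2i}) = A\mu\_{Y1} + B\mu\_{Y2}$ and $\Delta = \theta\_1 B - \theta\_2 A$. The numerator of $E(\hat{\mu}\_{Y1})$ then collapses as follows:

$$\begin{aligned} B\left(\theta\_1\mu\_{Y1} + \theta\_2\mu\_{Y2}\right) - \theta\_2\left(A\mu\_{Y1} + B\mu\_{Y2}\right) &= \left(\theta\_1 B - \theta\_2 A\right)\mu\_{Y1} + \left(\theta\_2 B - \theta\_2 B\right)\mu\_{Y2} \\ &= \Delta\,\mu\_{Y1} \end{aligned}$$

Dividing by $\Delta$ leaves $\mu\_{Y1}$. The same cancellation, with the roles of the two columns exchanged, gives $\theta\_1(A\mu\_{Y1} + B\mu\_{Y2}) - A(\theta\_1\mu\_{Y1} + \theta\_2\mu\_{Y2}) = \Delta\,\mu\_{Y2}$ for the second estimator.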

**Theorem 2:** $\hat{\mu}\_{Y2}$ is an unbiased estimator of the population mean $\mu\_{Y2}$.

$$E(\hat{\mu}\_{Y2}) = \mu\_{Y2} \tag{23}$$

**Proof:** Taking expectation on both sides of Eq. (21), I have

$$E(\hat{\mu}\_{Y2}) = \frac{\theta\_1 E\left(\overline{Z}'\_2\right) - \left[(P\_1 + P\_3 P)\left(\gamma\_{20} + \theta\_1^2\right) + \{P\_2 + P\_3(1 - P)\}\theta\_1\theta\_2\right]E\left(\overline{Z}\_1\right)}{\{P\_2 + P\_3(1 - P)\}\theta\_1\gamma\_{02} - (P\_1 + P\_3 P)\theta\_2\gamma\_{20}}$$

Following the same steps as in the proof of Theorem 1, I obtain

$$E(\hat{\mu}\_{Y2}) = \frac{\left[\{P\_2 + P\_3(1 - P)\}\theta\_1\gamma\_{02} - (P\_1 + P\_3 P)\theta\_2\gamma\_{20}\right]\mu\_{Y2}}{\{P\_2 + P\_3(1 - P)\}\theta\_1\gamma\_{02} - (P\_1 + P\_3 P)\theta\_2\gamma\_{20}} = \mu\_{Y2}.$$

Hence, the theorem is proved.

**Theorem 3:** The variance of the unbiased estimator $\hat{\mu}\_{Y1}$ is given by:

$$V(\hat{\mu}\_{Y1}) = \frac{\left[(P\_1 + P\_3 P)\theta\_1\theta\_2 + \{P\_2 + P\_3(1 - P)\}\left(\gamma\_{02} + \theta\_2^2\right)\right]^2\sigma^2\_{Z\_1} + \theta\_2^2\sigma^2\_{Z'\_2} - 2\theta\_2\left[(P\_1 + P\_3 P)\theta\_1\theta\_2 + \{P\_2 + P\_3(1 - P)\}\left(\gamma\_{02} + \theta\_2^2\right)\right]\sigma\_{Z\_1 Z'\_2}}{n\left[\{P\_2 + P\_3(1 - P)\}\theta\_1\gamma\_{02} - (P\_1 + P\_3 P)\theta\_2\gamma\_{20}\right]^2} \tag{24}$$

where

$$\sigma^2\_{Z\_1} = \gamma\_{20}\left(\sigma^2\_{Y1} + \mu^2\_{Y1}\right) + \gamma\_{02}\left(\sigma^2\_{Y2} + \mu^2\_{Y2}\right) + \theta\_1^2\sigma^2\_{Y1} + \theta\_2^2\sigma^2\_{Y2} + 2\theta\_1\theta\_2\sigma\_{Y1}\sigma\_{Y2},$$

$$\begin{aligned} \sigma^2\_{Z'\_2} = {} & \left(\sigma^2\_{Y1} + \mu^2\_{Y1}\right)\left[(P\_1 + P\_3 P)\left(\gamma\_{40} + 4\gamma\_{30}\theta\_1 + 6\gamma\_{20}\theta\_1^2 + \theta\_1^4\right) + \{P\_2 + P\_3(1 - P)\}\left(\gamma\_{20} + \theta\_1^2\right)\left(\gamma\_{02} + \theta\_2^2\right)\right] \\ & + \left(\sigma^2\_{Y2} + \mu^2\_{Y2}\right)\left[\{P\_2 + P\_3(1 - P)\}\left(\gamma\_{04} + 4\gamma\_{03}\theta\_2 + 6\gamma\_{02}\theta\_2^2 + \theta\_2^4\right) + (P\_1 + P\_3 P)\left(\gamma\_{20} + \theta\_1^2\right)\left(\gamma\_{02} + \theta\_2^2\right)\right] \\ & + 2\left(\sigma\_{Y1}\sigma\_{Y2} + \mu\_{Y1}\mu\_{Y2}\right)\left[(P\_1 + P\_3 P)\theta\_2\left(\gamma\_{30} + 3\gamma\_{20}\theta\_1 + \theta\_1^3\right) + \{P\_2 + P\_3(1 - P)\}\theta\_1\left(\gamma\_{03} + 3\gamma\_{02}\theta\_2 + \theta\_2^3\right)\right] \\ & - \left[\left\{(P\_1 + P\_3 P)\left(\gamma\_{20} + \theta\_1^2\right) + \{P\_2 + P\_3(1 - P)\}\theta\_1\theta\_2\right\}\mu\_{Y1} + \left\{(P\_1 + P\_3 P)\theta\_1\theta\_2 + \{P\_2 + P\_3(1 - P)\}\left(\gamma\_{02} + \theta\_2^2\right)\right\}\mu\_{Y2}\right]^2 \end{aligned}$$

and

$$\begin{aligned} \sigma\_{Z\_1 Z'\_2} = {} & \left(\sigma^2\_{Y1} + \mu^2\_{Y1}\right)\left[(P\_1 + P\_3 P)\left(\gamma\_{30} + 3\gamma\_{20}\theta\_1 + \theta\_1^3\right) + \{P\_2 + P\_3(1 - P)\}\theta\_2\left(\gamma\_{20} + \theta\_1^2\right)\right] \\ & + \left(\sigma^2\_{Y2} + \mu^2\_{Y2}\right)\left[(P\_1 + P\_3 P)\theta\_1\left(\gamma\_{02} + \theta\_2^2\right) + \{P\_2 + P\_3(1 - P)\}\left(\gamma\_{03} + 3\gamma\_{02}\theta\_2 + \theta\_2^3\right)\right] \\ & + 2\left(\sigma\_{Y1}\sigma\_{Y2} + \mu\_{Y1}\mu\_{Y2}\right)\left[(P\_1 + P\_3 P)\theta\_2\left(\gamma\_{20} + \theta\_1^2\right) + \{P\_2 + P\_3(1 - P)\}\theta\_1\left(\gamma\_{02} + \theta\_2^2\right)\right] \\ & - \left(\theta\_1\mu\_{Y1} + \theta\_2\mu\_{Y2}\right)\left[\left\{(P\_1 + P\_3 P)\left(\gamma\_{20} + \theta\_1^2\right) + \{P\_2 + P\_3(1 - P)\}\theta\_1\theta\_2\right\}\mu\_{Y1} + \left\{(P\_1 + P\_3 P)\theta\_1\theta\_2 + \{P\_2 + P\_3(1 - P)\}\left(\gamma\_{02} + \theta\_2^2\right)\right\}\mu\_{Y2}\right] \end{aligned}$$

The proof is given in the Appendix.

**Theorem 4:** The variance of the unbiased estimator $\hat{\mu}\_{Y2}$ is given by:

$$V(\hat{\mu}\_{Y2}) = \frac{\theta\_1^2\sigma^2\_{Z'\_2} + \left[(P\_1 + P\_3 P)\left(\gamma\_{20} + \theta\_1^2\right) + \{P\_2 + P\_3(1 - P)\}\theta\_1\theta\_2\right]^2\sigma^2\_{Z\_1} - 2\theta\_1\left[(P\_1 + P\_3 P)\left(\gamma\_{20} + \theta\_1^2\right) + \{P\_2 + P\_3(1 - P)\}\theta\_1\theta\_2\right]\sigma\_{Z\_1 Z'\_2}}{n\left[\{P\_2 + P\_3(1 - P)\}\theta\_1\gamma\_{02} - (P\_1 + P\_3 P)\theta\_2\gamma\_{20}\right]^2} \tag{25}$$

**Proof:** The proof is similar to that of Theorem 3.
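For numerical work, the variance expressions of Theorems 3 and 4 can be evaluated directly. The sketch below is one possible arrangement under stated assumptions: the function names are hypothetical, and the product `sd1 * sd2` is used for the cross term exactly as the chapter writes $\sigma\_{Y1}\sigma\_{Y2}$.

```python
def raw_moments(theta, g2, g3, g4):
    """E(S^2), E(S^3), E(S^4) from the mean and central moments of S."""
    return (g2 + theta ** 2,
            g3 + 3 * g2 * theta + theta ** 3,
            g4 + 4 * g3 * theta + 6 * g2 * theta ** 2 + theta ** 4)


def estimator_variances(n, mu1, mu2, sd1, sd2, theta1, theta2,
                        g20, g30, g40, g02, g03, g04, p1, p2, p3, p):
    """Evaluate Eqs. (24) and (25); returns a dict of the components."""
    a = p1 + p3 * p
    b = p2 + p3 * (1 - p)
    s1_2, s1_3, s1_4 = raw_moments(theta1, g20, g30, g40)
    s2_2, s2_3, s2_4 = raw_moments(theta2, g02, g03, g04)

    m11 = sd1 ** 2 + mu1 ** 2     # E(Y1^2)
    m22 = sd2 ** 2 + mu2 ** 2     # E(Y2^2)
    m12 = sd1 * sd2 + mu1 * mu2   # cross moment, as written in the chapter

    coef_mu1 = a * s1_2 + b * theta1 * theta2   # second row of Eq. (16)
    coef_mu2 = a * theta1 * theta2 + b * s2_2
    mean_z1 = theta1 * mu1 + theta2 * mu2       # Eq. (11)
    mean_z2 = coef_mu1 * mu1 + coef_mu2 * mu2   # Eq. (13)

    var_z1 = s1_2 * m11 + s2_2 * m22 + 2 * theta1 * theta2 * m12 - mean_z1 ** 2
    var_z2 = (m11 * (a * s1_4 + b * s1_2 * s2_2)
              + m22 * (b * s2_4 + a * s1_2 * s2_2)
              + 2 * m12 * (a * theta2 * s1_3 + b * theta1 * s2_3)
              - mean_z2 ** 2)
    cov = (m11 * (a * s1_3 + b * theta2 * s1_2)
           + m22 * (a * theta1 * s2_2 + b * s2_3)
           + 2 * m12 * (a * theta2 * s1_2 + b * theta1 * s2_2)
           - mean_z1 * mean_z2)

    delta = b * theta1 * g02 - a * theta2 * g20   # Eq. (17)
    denom = n * delta ** 2
    v1 = (coef_mu2 ** 2 * var_z1 + theta2 ** 2 * var_z2
          - 2 * theta2 * coef_mu2 * cov) / denom   # Eq. (24)
    v2 = (theta1 ** 2 * var_z2 + coef_mu1 ** 2 * var_z1
          - 2 * theta1 * coef_mu1 * cov) / denom   # Eq. (25)
    return {"var_z1": var_z1, "var_z2": var_z2, "cov": cov, "v1": v1, "v2": v2}
```

The central moments passed in should be mutually consistent (for example, $\gamma\_{40} \geq \gamma\_{20}^2$ must hold for any real distribution); otherwise the formulas can return values that are not valid variances.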

In the next section, I discuss a privacy protection measure to compare the respondent's privacy protection and efficiency for the considered model.

#### **3. Privacy protection measure**

A number of measures have been introduced in the literature to assess the performance of competing strategies, taking into account both efficiency and respondent privacy protection. For a discussion of privacy protection measures for randomized response surveys of stigmatizing characters, see Lanke [15], Leyseiffer and Warner [16] and Bhargava and Singh [17], among others. These measures of privacy protection are based on qualitative characters.

When dealing with a quantitative sensitive variable, the respondent's privacy is preserved by asking interviewees to algebraically scramble the true response by means of a coding mechanism. Privacy protection measures for quantitative sensitive variables have been investigated by Diana and Perri [18] and Zhimin et al. [19]; these are based on the squared correlation coefficient $\rho^2\_{y\theta} \in [0, 1]$. Later, Diana and Perri [20] introduced a new measure of respondents' privacy protection that uses an auxiliary variable. These measures are normalized, with zero (one) denoting maximum (minimum) privacy protection. Recently, Singh et al. [11] considered the case in which no auxiliary variable is available and studied the normalized measure of respondent privacy. This normalized measure allows researchers to attain a trade-off between efficiency and privacy. It is worth remarking that if one procedure is more efficient than another, it will tend to be less protective. The studies cited above therefore settle on a measure of respondent's privacy protection that reflects a trade-off between these two aspects:

$$
\tau = 1 - \rho\_{y\theta}^2 \tag{26}
$$

Values of τ closer to 1 indicate more privacy protection, so greater cooperation may be expected when using randomized response models, while values of τ closer to zero indicate that privacy protection is violated. Now, I use this normalized measure to compare the trade-off between efficiency and privacy protection.
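In a simulation study, where the true sensitive values are known to the analyst, the measure in Eq. (26) can be estimated from paired data. The following is a small sketch under that assumption; the function name `privacy_measure` is hypothetical, and in a real survey the true values would of course not be observed.

```python
def privacy_measure(true_values, reported_values):
    """Estimate tau = 1 - rho^2 from paired true and scrambled values."""
    n = len(true_values)
    if n != len(reported_values) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_y = sum(true_values) / n
    mean_z = sum(reported_values) / n
    syy = sum((y - mean_y) ** 2 for y in true_values)
    szz = sum((z - mean_z) ** 2 for z in reported_values)
    syz = sum((y - mean_y) * (z - mean_z)
              for y, z in zip(true_values, reported_values))
    if syy == 0 or szz == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return 1.0 - syz ** 2 / (syy * szz)
```

A reported value that is a linear function of the true value gives τ = 0 (no protection), whereas a reported value uncorrelated with the true value gives τ = 1.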

In the proposed model, there are two quantitative sensitive variables $Y\_{1i}$ and $Y\_{2i}$ associated with the second observed response $Z'\_{2i}$. Following Section 2, I compute the squared correlation coefficients between the second observed response $Z'\_{2i}$ and the quantitative sensitive variables $Y\_{1i}$ and $Y\_{2i}$, which are respectively given by:

$$\rho^2\_{Y\_{1i} Z'\_{2i}} = \frac{\left[(P\_1 + P\_3 P)\left\{\left(\gamma\_{20} + \theta\_1^2\right)\sigma^2\_{Y1} + \theta\_1\theta\_2\sigma\_{Y1}\sigma\_{Y2}\right\} + \{P\_2 + P\_3(1 - P)\}\left\{\theta\_1\theta\_2\sigma^2\_{Y1} + \left(\gamma\_{02} + \theta\_2^2\right)\sigma\_{Y1}\sigma\_{Y2}\right\}\right]^2}{\sigma^2\_{Y1}\sigma^2\_{Z'\_2}} \tag{27}$$

and

$$\rho^2\_{Y\_{2i} Z'\_{2i}} = \frac{\left[(P\_1 + P\_3 P)\left\{\left(\gamma\_{20} + \theta\_1^2\right)\sigma\_{Y1}\sigma\_{Y2} + \theta\_1\theta\_2\sigma^2\_{Y2}\right\} + \{P\_2 + P\_3(1 - P)\}\left\{\theta\_1\theta\_2\sigma\_{Y1}\sigma\_{Y2} + \left(\gamma\_{02} + \theta\_2^2\right)\sigma^2\_{Y2}\right\}\right]^2}{\sigma^2\_{Y2}\sigma^2\_{Z'\_2}} \tag{28}$$

where $\sigma^2\_{Z'\_2}$ is given in Theorem 3.

Now, I define the measure of respondent's privacy protection associated with the proposed second response $Z'\_{2i}$ as:

$$\tau\_{PJ} = 1 - \rho^2\_{Y\_{Ji} Z'\_{2i}}, \quad J = 1, 2 \tag{29}$$

I also define the square of correlation coefficients for the Ahmed et al. [14] response model. In the case of Ahmed et al. [14] model, there are also two quantitative sensitive variables associated with the second observed response. Thus, the square of correlation coefficients between the second observed response and quantitative sensitive variables Y1i and Y2i are respectively given by:

$$
\rho^2\_{Y\_{1i} Z\_{2i}} = \frac{\left[P\left\{\left(\gamma\_{20} + \theta\_1^2\right)\sigma^2\_{Y1} + \theta\_1\theta\_2\sigma\_{Y1}\sigma\_{Y2}\right\} + (1 - P)\left\{\theta\_1\theta\_2\sigma^2\_{Y1} + \left(\gamma\_{02} + \theta\_2^2\right)\sigma\_{Y1}\sigma\_{Y2}\right\}\right]^2}{\sigma^2\_{Y1}\sigma^2\_{Z\_2}} \tag{30}
$$

$$
\rho^2\_{Y\_{2i} Z\_{2i}} = \frac{\left[P\left\{\left(\gamma\_{20} + \theta\_1^2\right)\sigma\_{Y1}\sigma\_{Y2} + \theta\_1\theta\_2\sigma^2\_{Y2}\right\} + (1 - P)\left\{\theta\_1\theta\_2\sigma\_{Y1}\sigma\_{Y2} + \left(\gamma\_{02} + \theta\_2^2\right)\sigma^2\_{Y2}\right\}\right]^2}{\sigma^2\_{Y2}\sigma^2\_{Z\_2}} \tag{31}
$$

where

$$\begin{aligned} \sigma^2\_{Z\_2} = {} & \left(\sigma^2\_{Y1} + \mu^2\_{Y1}\right)\left[P\left(\gamma\_{40} + 4\gamma\_{30}\theta\_1 + 6\gamma\_{20}\theta\_1^2 + \theta\_1^4\right) + (1 - P)\left(\gamma\_{20} + \theta\_1^2\right)\left(\gamma\_{02} + \theta\_2^2\right)\right] \\ & + \left(\sigma^2\_{Y2} + \mu^2\_{Y2}\right)\left[(1 - P)\left(\gamma\_{04} + 4\gamma\_{03}\theta\_2 + 6\gamma\_{02}\theta\_2^2 + \theta\_2^4\right) + P\left(\gamma\_{20} + \theta\_1^2\right)\left(\gamma\_{02} + \theta\_2^2\right)\right] \\ & + 2\left(\sigma\_{Y1}\sigma\_{Y2} + \mu\_{Y1}\mu\_{Y2}\right)\left[P\theta\_2\left(\gamma\_{30} + 3\gamma\_{20}\theta\_1 + \theta\_1^3\right) + (1 - P)\theta\_1\left(\gamma\_{03} + 3\gamma\_{02}\theta\_2 + \theta\_2^3\right)\right] \\ & - \left[\left\{P\left(\gamma\_{20} + \theta\_1^2\right) + (1 - P)\theta\_1\theta\_2\right\}\mu\_{Y1} + \left\{P\theta\_1\theta\_2 + (1 - P)\left(\gamma\_{02} + \theta\_2^2\right)\right\}\mu\_{Y2}\right]^2 \end{aligned}$$

I also define the measure of respondent's privacy protection for the Ahmed et al. [14] model as:

$$\tau\_{AJ} = 1 - \rho^2\_{Y\_{Ji} Z\_{2i}}, \quad J = 1, 2 \tag{32}$$
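Both pairs of privacy measures share one algebraic form: the proposed model uses the weights $P\_1 + P\_3 P$ and $P\_2 + P\_3(1 - P)$, while the Ahmed et al. [14] expressions use $P$ and $1 - P$ in the same positions. A sketch that exploits this is given below; the function names are hypothetical and the code follows the chapter in using the product `sd1 * sd2` for the cross term.

```python
def proposed_weights(p1, p2, p3, p):
    """Branch weights of the blank-card model."""
    return p1 + p3 * p, p2 + p3 * (1 - p)


def ahmed_weights(p):
    """Branch weights of the Ahmed et al. [14] second response."""
    return p, 1 - p


def privacy_measures(a, b, mu1, mu2, sd1, sd2, theta1, theta2,
                     g20, g30, g40, g02, g03, g04):
    """Return (tau_1, tau_2) = 1 - squared correlations for weights (a, b)."""
    s1_2 = g20 + theta1 ** 2
    s1_3 = g30 + 3 * g20 * theta1 + theta1 ** 3
    s1_4 = g40 + 4 * g30 * theta1 + 6 * g20 * theta1 ** 2 + theta1 ** 4
    s2_2 = g02 + theta2 ** 2
    s2_3 = g03 + 3 * g02 * theta2 + theta2 ** 3
    s2_4 = g04 + 4 * g03 * theta2 + 6 * g02 * theta2 ** 2 + theta2 ** 4

    m11 = sd1 ** 2 + mu1 ** 2
    m22 = sd2 ** 2 + mu2 ** 2
    m12 = sd1 * sd2 + mu1 * mu2   # cross moment, as written in the chapter
    mean_z2 = ((a * s1_2 + b * theta1 * theta2) * mu1
               + (a * theta1 * theta2 + b * s2_2) * mu2)
    # Variance of the second observed response
    var_z2 = (m11 * (a * s1_4 + b * s1_2 * s2_2)
              + m22 * (b * s2_4 + a * s1_2 * s2_2)
              + 2 * m12 * (a * theta2 * s1_3 + b * theta1 * s2_3)
              - mean_z2 ** 2)

    # Numerators of the squared correlations, before squaring
    cov1 = (a * (s1_2 * sd1 ** 2 + theta1 * theta2 * sd1 * sd2)
            + b * (theta1 * theta2 * sd1 ** 2 + s2_2 * sd1 * sd2))
    cov2 = (a * (s1_2 * sd1 * sd2 + theta1 * theta2 * sd2 ** 2)
            + b * (theta1 * theta2 * sd1 * sd2 + s2_2 * sd2 ** 2))
    tau1 = 1 - cov1 ** 2 / (sd1 ** 2 * var_z2)
    tau2 = 1 - cov2 ** 2 / (sd2 ** 2 * var_z2)
    return tau1, tau2
```

When $P\_3 = 1$ every respondent draws a blank card, the two sets of weights coincide, and the proposed measures reduce to those of the Ahmed et al. [14] model.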

In the next section, I investigate the performance of the proposed model with respect to Ahmed et al. [14] model in terms of relative efficiency and privacy protection under different parametric situations.

#### **4. Efficiency vs privacy protection**

The relative efficiencies (RE) of the proposed estimators $\hat{\mu}\_{Y1}$ and $\hat{\mu}\_{Y2}$ with respect to the Ahmed et al. [14] estimators $\hat{\mu}\_{AY1}$ and $\hat{\mu}\_{AY2}$ are respectively given by:

$$\text{RE}\_{\text{J}}(\hat{\mu}\_{\text{Y}\text{J}}, \ \hat{\mu}\_{\text{AY}\text{J}}) = \frac{\text{V}(\hat{\mu}\_{\text{AY}\text{J}})}{\text{V}(\hat{\mu}\_{\text{Y}\text{J}})}, \text{J} = \mathbf{1}, \ \mathbf{2} \tag{33}$$

To explore a possible trade-off between relative efficiency and respondents' privacy protection, I choose parametric values for which the relative efficiencies are high and greater privacy protection can be expected. I take $P = 0.6$; $\mu\_{Y1}$ from 25 to 45 in steps of 5; $\mu\_{Y2}$ from 35 to 55 in steps of 5; five values each of $\theta\_1$ and $\theta\_2$, from 2 to 10 in increments of 2 and from 4 to 16 in increments of 3, respectively; and $\sigma\_{Y1} = 7$, $\sigma\_{Y2} = 5$, $\gamma\_{20} = 2$, $\gamma\_{02} = 9$, $\gamma\_{30} = 1.5$, $\gamma\_{03} = 1.2$, $\gamma\_{40} = 3.2$ and $\gamma\_{04} = 3.5$. I have also chosen different values of the probabilities $P\_i$ $(i = 1, 2, 3)$; the results are presented in **Tables 1** and **2**.
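The parameter ranges just described, together with the ratio in Eq. (33), can be written down compactly. This is only a sketch: the names below are illustrative, and because the text does not state whether the tables pair these values or cross them, the code builds the individual value lists and leaves their combination to the reader.

```python
def relative_efficiency(var_ahmed, var_proposed):
    """Eq. (33): values above 1 favour the proposed estimator."""
    if var_proposed <= 0:
        raise ValueError("the variance of the proposed estimator must be positive")
    return var_ahmed / var_proposed


# Value ranges quoted in the text
MU_Y1 = list(range(25, 46, 5))     # 25, 30, ..., 45
MU_Y2 = list(range(35, 56, 5))     # 35, 40, ..., 55
THETA_1 = list(range(2, 11, 2))    # 2, 4, ..., 10
THETA_2 = list(range(4, 17, 3))    # 4, 7, ..., 16

# Fixed settings quoted in the text
FIXED = {"P": 0.6, "sd_Y1": 7, "sd_Y2": 5,
         "g20": 2, "g02": 9, "g30": 1.5, "g03": 1.2, "g40": 3.2, "g04": 3.5}
```

Each list contains five values, in line with the "five values" of $\theta\_1$ and $\theta\_2$ mentioned above.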

**Tables 1** and **2** show how the proposed model performs in terms of efficiency and privacy protection. For the situations under investigation, the proposed model based on the blank card method is more efficient than the Ahmed et al. [14] model. The main findings are described in the following points.



#### **Table 1.**

*Relative efficiency of the proposed estimator $\hat{\mu}\_{Y1}$ with respect to the Ahmed et al. [14] estimator $\hat{\mu}\_{AY1}$, and the privacy protection measures $\tau\_{P1}$ and $\tau\_{A1}$.*

attains when $P\_i = 0.05$ $(i = 1, 2)$ and $P\_3 = 0.90$ with corresponding values of $\theta\_1 = 10$ and $\theta\_2 = 16$.

$$V(\hat{\mu}\_{AYJ}) > V(\hat{\mu}\_{YJ}), \quad J = 1, 2$$

and

$$
\tau\_{AJ} > \tau\_{PJ}, \quad J = 1, 2
$$

Hence, I conclude that a small difference in efficiency may procure a substantial improvement in respondents' privacy protection. Thus, the comparisons underline the good performance of the proposed model in terms of both efficiency and respondent's privacy protection.

11. Therefore, the proposed randomized response model under the blank card method may be declared to be best for estimating the means of two quantitative sensitive variables and thus may be recommended to survey practitioners whenever they deal with extremely sensitive characteristics.

To judge the performance of the proposed model, I consider the real data set CO124, comprising N = 124 units, from Sarndal et al. [21]. A random sample of n = 30 units is drawn from the CO124 population. Let Y1, Y2 and X be the import, export and military expenditure in the state of U.S. during the years 1983, 1983 and 1981, respectively. The parametric ranges of the quantitative sensitive variables Y1 and Y2 and the non-sensitive variable X have been found using the t-test and chi-square test, and are given as $\mu\_{Y1} \in (331.60, 567.66)$, $\mu\_{Y2} \in (242.56, 440.30)$, $\mu\_{X} \in (43, 171.16)$, $\sigma\_{Y1} \in (256.03, 432.17)$, $\sigma\_{Y2} \in (214.47, 362.02)$ and $\sigma\_{X} \in (138.90, 234.46)$.


*Estimation of Means of Two Quantitative Sensitive Variables Using Randomized Response… DOI: http://dx.doi.org/10.5772/intechopen.101269*

#### **Table 2.**

*Relative efficiency of the proposed estimator $\hat{\mu}\_{Y2}$ with respect to the Ahmed et al. [14] estimator $\hat{\mu}\_{AY2}$, and the privacy protection measures $\tau\_{P2}$ and $\tau\_{A2}$.*

The relative efficiencies have been computed for these parameter combinations and are presented in **Tables 3**–**7**.

The behaviour of the estimators in **Tables 3**–**7** indicates that the proposed estimators perform better than the Perri [5] and Ahmed et al. [14] estimators in terms of efficiency.

The rest of the results can be read from the given tables.



#### **Table 3.**

*Relative efficiency of the proposed estimator $\hat{\mu}\_{Y1}$ with respect to the Perri [5] estimator $\hat{\mu}\_{P}$ when $\theta\_1 = 4$ and $\theta\_2 = 1$.*



#### **Table 4.**

*Relative efficiency of the proposed estimator $\hat{\mu}\_{Y2}$ with respect to the Perri [5] estimator $\hat{\mu}\_{P}$ when $\theta\_1 = 4$ and $\theta\_2 = 1$.*



#### **Table 5.**

*Relative efficiency of the proposed estimator $\hat{\mu}\_{Y1}$ with respect to the Ahmed et al. [14] estimator $\hat{\mu}\_{AY1}$ when $\theta\_1 = 4$ and $\theta\_2 = 1$.*



#### **Table 6.**

*Relative efficiency of the proposed estimator $\hat{\mu}\_{Y2}$ with respect to the Ahmed et al. [14] estimator $\hat{\mu}\_{AY2}$ when $\theta\_1 = 4$ and $\theta\_2 = 1$.*







#### **Table 7.**

*Relative efficiency of the proposed estimators $\hat{\mu}\_{Y1}$ and $\hat{\mu}\_{Y2}$ with respect to the Ahmed et al. [14] estimators $\hat{\mu}\_{AY1}$ and $\hat{\mu}\_{AY2}$, respectively, when $\theta\_1 = 4$, $\theta\_2 = 1$ and $P = 0.9$.*

#### **5. Conclusions**

The main objective of this paper is to estimate the population means of two quantitative sensitive variables. The proposed model is found to perform better in terms of both relative efficiency and respondent's privacy protection. These results therefore suggest that the proposed technique is favourable for obtaining truthful responses from respondents.

#### **Appendix**

**Proof:** Given that $E(S\_1) = \theta\_1$ and $E(S\_2) = \theta\_2$. Following Singh [9], I define

$$\gamma\_{rs} = E\left[(S\_1 - \theta\_1)^r (S\_2 - \theta\_2)^s\right] \tag{34}$$

Then, due to the independence of the scrambling variables, I have

$$E\left(S\_1^2\right) = \gamma\_{20} + \theta\_1^2 \tag{35}$$

$$E\left(S\_1^3\right) = \gamma\_{30} + 3\gamma\_{20}\theta\_1 + \theta\_1^3 \tag{36}$$

$$E\left(S\_1^4\right) = \gamma\_{40} + 4\gamma\_{30}\theta\_1 + 6\gamma\_{20}\theta\_1^2 + \theta\_1^4 \tag{37}$$

$$E\left(S\_2^2\right) = \gamma\_{02} + \theta\_2^2 \tag{38}$$

$$E\left(S\_2^3\right) = \gamma\_{03} + 3\gamma\_{02}\theta\_2 + \theta\_2^3 \tag{39}$$

$$E\left(S\_2^4\right) = \gamma\_{04} + 4\gamma\_{03}\theta\_2 + 6\gamma\_{02}\theta\_2^2 + \theta\_2^4 \tag{40}$$

$$E(S\_1 S\_2) = \theta\_1\theta\_2 \tag{41}$$

$$E\left(S\_1^2 S\_2^2\right) = \left(\gamma\_{20} + \theta\_1^2\right)\left(\gamma\_{02} + \theta\_2^2\right) \tag{42}$$

$$E\left(S\_1^3 S\_2\right) = \left(\gamma\_{30} + 3\gamma\_{20}\theta\_1 + \theta\_1^3\right)\theta\_2 \tag{43}$$

and

$$E\left(S\_1 S\_2^3\right) = \theta\_1\left(\gamma\_{03} + 3\gamma\_{02}\theta\_2 + \theta\_2^3\right) \tag{44}$$
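The moment identities above can be checked by exact enumeration for any pair of independent discrete scrambling variables. The sketch below does this for user-supplied distributions; the helper names are hypothetical, and the two-point distributions used in the check are chosen only for illustration.

```python
from itertools import product


def expect(dist, func):
    """Expectation of func(x) for a discrete distribution [(value, prob), ...]."""
    return sum(prob * func(value) for value, prob in dist)


def raw_from_central(theta, g):
    """Right-hand sides of the single-variable identities, keyed by power."""
    return {2: g[2] + theta ** 2,
            3: g[3] + 3 * g[2] * theta + theta ** 3,
            4: g[4] + 4 * g[3] * theta + 6 * g[2] * theta ** 2 + theta ** 4}


def moment_identities(dist1, dist2):
    """Return {label: (enumerated value, closed form)} for Eqs. (35)-(44)."""
    theta1 = expect(dist1, lambda s: s)
    theta2 = expect(dist2, lambda s: s)
    g1 = {r: expect(dist1, lambda s, r=r: (s - theta1) ** r) for r in (2, 3, 4)}
    g2 = {r: expect(dist2, lambda s, r=r: (s - theta2) ** r) for r in (2, 3, 4)}
    raw1, raw2 = raw_from_central(theta1, g1), raw_from_central(theta2, g2)
    # Joint distribution under independence of S1 and S2
    joint = [((s1, s2), q1 * q2)
             for (s1, q1), (s2, q2) in product(dist1, dist2)]
    return {
        "E(S1^2)": (expect(dist1, lambda s: s ** 2), raw1[2]),
        "E(S1^3)": (expect(dist1, lambda s: s ** 3), raw1[3]),
        "E(S1^4)": (expect(dist1, lambda s: s ** 4), raw1[4]),
        "E(S2^2)": (expect(dist2, lambda s: s ** 2), raw2[2]),
        "E(S2^3)": (expect(dist2, lambda s: s ** 3), raw2[3]),
        "E(S2^4)": (expect(dist2, lambda s: s ** 4), raw2[4]),
        "E(S1 S2)": (expect(joint, lambda s: s[0] * s[1]), theta1 * theta2),
        "E(S1^2 S2^2)": (expect(joint, lambda s: s[0] ** 2 * s[1] ** 2),
                         raw1[2] * raw2[2]),
        "E(S1^3 S2)": (expect(joint, lambda s: s[0] ** 3 * s[1]),
                       raw1[3] * theta2),
        "E(S1 S2^3)": (expect(joint, lambda s: s[0] * s[1] ** 3),
                       theta1 * raw2[3]),
    }
```

Each pair in the returned dictionary should agree up to floating-point rounding; a mismatch would signal either dependence between the scrambling variables or an error in the supplied probabilities.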

$$\begin{aligned} V(\hat{\mu}\_{Y1}) &= \frac{\left[(P\_1 + P\_3 P)\theta\_1\theta\_2 + \{P\_2 + P\_3(1 - P)\}\left(\gamma\_{02} + \theta\_2^2\right)\right]^2\sum\_{i=1}^{n}V(Z\_{1i}) + \theta\_2^2\sum\_{i=1}^{n}V\left(Z'\_{2i}\right) - 2\theta\_2\left[(P\_1 + P\_3 P)\theta\_1\theta\_2 + \{P\_2 + P\_3(1 - P)\}\left(\gamma\_{02} + \theta\_2^2\right)\right]\sum\_{i=1}^{n}\mathrm{cov}\left(Z\_{1i}, Z'\_{2i}\right)}{n^2\left[\{P\_2 + P\_3(1 - P)\}\theta\_1\gamma\_{02} - (P\_1 + P\_3 P)\theta\_2\gamma\_{20}\right]^2} \\ &= \frac{\left[(P\_1 + P\_3 P)\theta\_1\theta\_2 + \{P\_2 + P\_3(1 - P)\}\left(\gamma\_{02} + \theta\_2^2\right)\right]^2\sigma^2\_{Z\_1} + \theta\_2^2\sigma^2\_{Z'\_2} - 2\theta\_2\left[(P\_1 + P\_3 P)\theta\_1\theta\_2 + \{P\_2 + P\_3(1 - P)\}\left(\gamma\_{02} + \theta\_2^2\right)\right]\sigma\_{Z\_1 Z'\_2}}{n\left[\{P\_2 + P\_3(1 - P)\}\theta\_1\gamma\_{02} - (P\_1 + P\_3 P)\theta\_2\gamma\_{20}\right]^2} \end{aligned} \tag{45}$$

where the variance $\sigma^2\_{Z\_1}$ is given by:

$$\begin{aligned} \sigma^2\_{Z\_1} &= E\left(Z\_{1i}^2\right) - \left[E(Z\_{1i})\right]^2 \\ &= E\left(S\_1 Y\_{1i} + S\_2 Y\_{2i}\right)^2 - \left[E\left(S\_1 Y\_{1i} + S\_2 Y\_{2i}\right)\right]^2 \\ &= E\left(S\_1^2 Y\_{1i}^2 + S\_2^2 Y\_{2i}^2 + 2 S\_1 S\_2 Y\_{1i} Y\_{2i}\right) - \left[\theta\_1\mu\_{Y1} + \theta\_2\mu\_{Y2}\right]^2 \\ &= \left(\gamma\_{20} + \theta\_1^2\right)\left(\sigma^2\_{Y1} + \mu^2\_{Y1}\right) + \left(\gamma\_{02} + \theta\_2^2\right)\left(\sigma^2\_{Y2} + \mu^2\_{Y2}\right) + 2\theta\_1\theta\_2\left(\sigma\_{Y1}\sigma\_{Y2} + \mu\_{Y1}\mu\_{Y2}\right) - \left[\theta\_1\mu\_{Y1} + \theta\_2\mu\_{Y2}\right]^2 \\ &= \gamma\_{20}\left(\sigma^2\_{Y1} + \mu^2\_{Y1}\right) + \gamma\_{02}\left(\sigma^2\_{Y2} + \mu^2\_{Y2}\right) + \theta\_1^2\sigma^2\_{Y1} + \theta\_2^2\sigma^2\_{Y2} + 2\theta\_1\theta\_2\sigma\_{Y1}\sigma\_{Y2} \end{aligned} \tag{46}$$

The variance σ<sup>2</sup><sub>Z′2</sub> is given by:

$$
\begin{aligned}
\sigma^2_{Z'_2} &= E\left(Z'^{2}_{2i}\right) - \left[E\left(Z'_{2i}\right)\right]^2 \\
&= (P_1 + P_3 P)\,E\left(S_1^2 Y_{1i} + S_1 S_2 Y_{2i}\right)^2 + \{P_2 + P_3(1-P)\}\,E\left(S_1 S_2 Y_{1i} + S_2^2 Y_{2i}\right)^2 \\
&\quad - \left[(P_1 + P_3 P)\left\{\left(\gamma_{20} + \theta_1^2\right)\mu_{Y_1} + \theta_1\theta_2\mu_{Y_2}\right\} + \{P_2 + P_3(1-P)\}\left\{\theta_1\theta_2\mu_{Y_1} + \left(\gamma_{02} + \theta_2^2\right)\mu_{Y_2}\right\}\right]^2 \\
&= (P_1 + P_3 P)\,E\left(S_1^4 Y_{1i}^2 + S_1^2 S_2^2 Y_{2i}^2 + 2 S_1^3 S_2 Y_{1i} Y_{2i}\right) + \{P_2 + P_3(1-P)\}\,E\left(S_1^2 S_2^2 Y_{1i}^2 + S_2^4 Y_{2i}^2 + 2 S_1 S_2^3 Y_{1i} Y_{2i}\right) \\
&\quad - \left[(P_1 + P_3 P)\left\{\left(\gamma_{20} + \theta_1^2\right)\mu_{Y_1} + \theta_1\theta_2\mu_{Y_2}\right\} + \{P_2 + P_3(1-P)\}\left\{\theta_1\theta_2\mu_{Y_1} + \left(\gamma_{02} + \theta_2^2\right)\mu_{Y_2}\right\}\right]^2 \\
&= (P_1 + P_3 P)\left[\left(\gamma_{40} + 4\gamma_{30}\theta_1 + 6\gamma_{20}\theta_1^2 + \theta_1^4\right)\left(\sigma^2_{Y_1} + \mu^2_{Y_1}\right) + \left(\gamma_{20} + \theta_1^2\right)\left(\gamma_{02} + \theta_2^2\right)\left(\sigma^2_{Y_2} + \mu^2_{Y_2}\right) + 2\left(\gamma_{30} + 3\gamma_{20}\theta_1 + \theta_1^3\right)\theta_2\left(\sigma_{Y_1}\sigma_{Y_2} + \mu_{Y_1}\mu_{Y_2}\right)\right] \\
&\quad + \{P_2 + P_3(1-P)\}\left[\left(\gamma_{20} + \theta_1^2\right)\left(\gamma_{02} + \theta_2^2\right)\left(\sigma^2_{Y_1} + \mu^2_{Y_1}\right) + \left(\gamma_{04} + 4\gamma_{03}\theta_2 + 6\gamma_{02}\theta_2^2 + \theta_2^4\right)\left(\sigma^2_{Y_2} + \mu^2_{Y_2}\right) + 2\theta_1\left(\gamma_{03} + 3\gamma_{02}\theta_2 + \theta_2^3\right)\left(\sigma_{Y_1}\sigma_{Y_2} + \mu_{Y_1}\mu_{Y_2}\right)\right] \\
&\quad - \left[(P_1 + P_3 P)\left\{\left(\gamma_{20} + \theta_1^2\right)\mu_{Y_1} + \theta_1\theta_2\mu_{Y_2}\right\} + \{P_2 + P_3(1-P)\}\left\{\theta_1\theta_2\mu_{Y_1} + \left(\gamma_{02} + \theta_2^2\right)\mu_{Y_2}\right\}\right]^2
\end{aligned}
$$

After simplification, this gives

$$
\begin{aligned}
\sigma^2_{Z'_2} &= \left(\sigma^2_{Y_1} + \mu^2_{Y_1}\right)\left[(P_1 + P_3 P)\left(\gamma_{40} + 4\gamma_{30}\theta_1 + 6\gamma_{20}\theta_1^2 + \theta_1^4\right) + \{P_2 + P_3(1-P)\}\left(\gamma_{20} + \theta_1^2\right)\left(\gamma_{02} + \theta_2^2\right)\right] \\
&\quad + \left(\sigma^2_{Y_2} + \mu^2_{Y_2}\right)\left[\{P_2 + P_3(1-P)\}\left(\gamma_{04} + 4\gamma_{03}\theta_2 + 6\gamma_{02}\theta_2^2 + \theta_2^4\right) + (P_1 + P_3 P)\left(\gamma_{20} + \theta_1^2\right)\left(\gamma_{02} + \theta_2^2\right)\right] \\
&\quad + 2\left(\sigma_{Y_1}\sigma_{Y_2} + \mu_{Y_1}\mu_{Y_2}\right)\left[(P_1 + P_3 P)\theta_2\left(\gamma_{30} + 3\gamma_{20}\theta_1 + \theta_1^3\right) + \{P_2 + P_3(1-P)\}\theta_1\left(\gamma_{03} + 3\gamma_{02}\theta_2 + \theta_2^3\right)\right] \\
&\quad - \left[\left\{(P_1 + P_3 P)\left(\gamma_{20} + \theta_1^2\right) + \{P_2 + P_3(1-P)\}\theta_1\theta_2\right\}\mu_{Y_1} + \left\{(P_1 + P_3 P)\theta_1\theta_2 + \{P_2 + P_3(1-P)\}\left(\gamma_{02} + \theta_2^2\right)\right\}\mu_{Y_2}\right]^2
\end{aligned}
$$

#### **References**

[1] Warner SL. Randomized Response: A survey technique for eliminating evasive answer bias. Journal of the American Statistical Association. 1965;**60**(309):63-69

[2] Greenberg BG, Abul-Ela A, Simmons WR, Horvitz DG. The unrelated question randomized response model: Theoretical Framework. Journal of the American Statistical Association. 1969;**64**:520-539

[3] Greenberg BG, Kuebler RR Jr, Abernathy JR, Horvitz DG. Application of the randomized response technique in obtaining quantitative data. Journal of the American Statistical Association. 1971;**66**(334):243-250

[4] Eichhorn BH, Hayre LS. Scrambled randomized response methods for obtaining sensitive quantitative data. Journal of Statistical Planning and Inference. 1983;**7**(4):307-316

[5] Perri PF. Modified randomized devices for Simmons' model. Model Assisted Statistics and Applications. 2008;**3**(3):233-239

[6] Bhargava M, Singh R. A note on a modified randomization device using unrelated question. Metron-International Journal of Statistics. 1999;**57**(3-4):141-145

[7] Singh S, Horn S, Singh R, Mangat NS. On the use of modified randomization device for estimating the prevalence of a sensitive attribute. Statistics in Transition. 2003;**6**(4):515-522

[8] Batool F, Shabbir J, Hussain H. On the estimation of a sensitive quantitative mean using blank cards. Communications in Statistics-Theory and Methods. 2017;**46**(6):3070-3079

[9] Singh S. On the estimation of correlation coefficient using scrambled responses. In: Chaudhuri A, Christofides TC, Rao CR, editors. Data Gathering, Analysis and Protection of Privacy Through Randomized Response Techniques: Qualitative and Quantitative Human Traits, Handbook of Statistics-34. Elsevier; 2016

[10] Singh GN, Kumar A, Vishwakarma GK. Estimation of population mean of sensitive quantitative character using blank cards in randomized device. Communications in Statistics-Simulation and Computation. 2018. DOI: 10.1080/03610918.2018.1502779

[11] Singh GN, Kumar A, Vishwakarma GK. Some alternative additive randomized response models for estimation of population mean of quantitative sensitive variable in the presence of scramble variable. Communications in Statistics-Simulation and Computation. 2018. DOI: 10.1080/03610918.2018.1520879

[12] Narjis G, Shabbir J. An efficient partial randomized response model for estimating a rare sensitive attribute using Poisson distribution. Communications in Statistics-Theory and Methods. 2019. DOI: 10.1080/03610926.2019.1628992

[13] Narjis G, Shabbir J. Bayesian analysis of optional unrelated question randomized response models. Communications in Statistics-Theory and Methods. 2020. DOI: 10.1080/03610926.2020.1713367

[14] Ahmed S, Sedory SA, Singh S. Simultaneous estimation of means of two sensitive variables. Communications in Statistics-Theory and Methods. 2018;**47**(2):324-343

[15] Lanke J. On the degree of protection in randomized interviews. International Statistical Review. 1976;**44**:197-203

[16] Leysieffer RW, Warner SL. Respondent jeopardy and optimal designs in randomized response models. Journal of the American Statistical Association. 1976;**71**:649-656

[17] Bhargava M, Singh R. On the efficiency comparison of certain randomized response strategies. Metrika. 2002;**55**:191-197

[18] Diana G, Perri PF. Efficiency vs privacy protection in SRR methods. In: Proceedings of 44th Scientific Meeting of the Italian Statistical Society. 2008

[19] Zhimin H, Zaizai Y, Lidong W. Combination of the additive and multiplicative models at the estimation stage. In: 2010 International Conference on Computer and Communication Technologies in Agriculture Engineering. 2010. pp. 172-174

[20] Diana G, Perri PF. A class of estimators for quantitative sensitive data. Statistical Papers. 2011;**52**(3):633-650

[21] Sarndal CE, Swensson B, Wretman J. Model Assisted Survey Sampling, Springer Series in Statistics. Springer-Verlag Publishing; 1992

#### **Chapter 10**

## Causality Relationship between Import, Export and Exim Bank Loans: Turkish Economy

*Yüksel Akay Ünvan and Ulviyya Nahmatli*

#### **Abstract**

Export promotion tools aim to increase exports and support the entrepreneur in reaching new foreign markets. The positive impact of incentives, especially financial ones, on exports both before and after shipment is undeniable. Founded in 1987, Turkish Exim bank is Turkey's official export credit institution. By observing macro-economic balances, Exim bank ensures that exporters, export-oriented manufacturers and entrepreneurs operating abroad are supported by credit, guarantee and insurance programs to increase their competitiveness. The study aims to examine the causal relationship between imports, exports and Exim bank loans in the Turkish economy. Stationarity was investigated with the augmented Dickey-Fuller (ADF) unit root test, the long-term relationship with the Johansen cointegration test, and causality with the Granger test. The causal relationship was analyzed using import, export and Eximbank loans data for the period 2003–2020.

**Keywords:** exports, exim bank loans, ADF test, causality test

#### **1. Introduction**

For developing countries to reach the level of developed countries and compete with them, several conditions must be met. The most important of these is the industrialization strategy that developing countries implement. With the decisions of January 24, 1980, a turning point in the redesign of the Turkish economy, an export-based industrialization strategy targeting export-led growth replaced the import substitution strategy implemented since the 1960s, and several institutions were created to address the problems expected during the implementation of these decisions ([1], p. 22).

To increase the competitiveness of exporters in foreign markets, Turkish Exim bank provides export financing in Turkey with credit, guarantee and insurance programs under international rules and principles ([2], p. 180).

In developing countries, Exim bank loans are provided by non-profit organizations that support exports and are backed by the central bank, in Turkey the Central Bank of the Republic of Turkey (CBRT). Commercial banks, private export credit insurance companies and factoring companies support only financing, since their main purpose is profit.

In developed countries, the financing needed for exports is usually provided by the commercial banking system. Export financing organizations, on the other hand, support the export sector and banks with insurance and guarantee programs and serve only to provide a risk-free environment.

#### **1.1 Import**

Imports are the value of foreign goods and services bought by a country's households, firms, government agencies, and other organizations in a given period.

#### **1.2 Exports**

Exports are goods and services that are produced in one country and sold to buyers in another. Exports, along with imports, make up international trade.

#### **1.3 Eximbank loan**

Eximbank loans are lines of credit made available by Export Credit Bank of Turkey (Exim bank) to enhance exports. This credit is made available during the pre-export stage against a written pledge by the exporter to export Turkish-origin goods and services as stipulated by Exim bank. It provides a price advantage over other export loans offered by banks.

#### **2. Literature review**

This literature review summarizes studies that examine, in the context of causality, the relationship between exports, financial development and economic growth in Turkey.

Dodaro [3] examined the relationship between economic growth and exports with the Granger causality test, using data for the period 1967–1986. The study found a unidirectional causal relationship from economic growth to exports.

Bahmani and Domac [4] examined the relationship between economic growth and exports with the cointegration test, using data for the period 1923–1990. The study found a definite causal relationship between economic growth and exports.

Tuncer [5] examined the causal relationship between exports, imports, investments and gross domestic product (GDP) with the Toda and Yamamoto method, using data for the period 1980Q1–2000Q3. The study found a unidirectional causal relationship from economic growth to exports.

Şimşek [6] tested the export-based growth hypothesis with an error correction model, the cointegration test and causality tests, using data for the period 1960–2002. The study found a unidirectional causal relationship from economic growth to exports.

Erdoğan [7] examined the relationship between economic growth and exports with cointegration and causality tests, using data for the period 1923–2004. The study found a long-term bidirectional causal relationship between economic growth and exports at the 10% significance level.

Taştan [8] examined the interaction and causal relationships between export, industrial production and import variables with cointegration and causality tests, using data for the period 1985Q1–2009Q3. The study found a unidirectional causal relationship from economic growth to exports.

Tıraşoğlu [9] examined whether the export-based growth hypothesis is valid in Turkey with cointegration and causality tests, using data for the period 1998Q1–2011Q3. The study found a long-term unidirectional causal relationship between exports and economic growth.

#### *Causality Relationship between Import, Export and Exim Bank Loans: Turkish Economy DOI: http://dx.doi.org/10.5772/intechopen.101733*

Korkmaz [10] examined the relationship between economic growth and exports with cointegration and causality tests, using data for the period 1998Q1–2013Q3. The study found a unidirectional causal relationship from exports to economic growth.

Pentecost and Kar [11] examined the relationship between economic growth and financial development with cointegration and causality tests, using data for the period 1963–1995. The study found a unidirectional causal relationship from economic growth to financial development.

Al-Yousif [12] studied the causal relationship between financial development and economic growth for 30 developing countries with both time series and panel data analyses, using data for the period 1970–1999. The study found a bidirectional relationship between economic growth and financial development.

Ceylan and Durkaya [13] examined the causal relationship between domestic credit volume and economic growth, using gross domestic product (GDP) and the total domestic loans of private banks for the period 1998–2008. The study found a unidirectional causal relationship from economic growth to loans.

#### **3. Econometric analysis**

#### **3.1 Data set**

The data set used in this study covers the period 2003–2019. The data were obtained from the Central Bank of the Republic of Turkey (CBRT) and the official website of Turkish Exim bank. The data set consists of three variables, which are listed in **Table 1**. All analyses and tests on these variables were performed with EViews 11.

#### **3.2 Augmented Dickey-Fuller (ADF) unit root test**

To obtain econometrically meaningful relationships between series in time series analysis, the analyzed series must be stationary. Unit root tests are usually used to test whether a series is stationary. The most commonly used of these is the unit root test of Dickey and Fuller [14], which assumes that the error term is independent and identically distributed. If a time series is stationary, its mean, variance and covariance (at various lags) are the same no matter when they are measured ([15], p. 757).


**Table 1.** *Data set.*

Let *Y<sub>t</sub>* be any time series. The series is stationary if its mean and variance are constant over time and its autocovariance depends only on the lag *k*:

$$E(Y_t) = \mu \tag{1}$$

$$\operatorname{Var}(Y_t) = E(Y_t - \mu)^2 = \sigma^2 \tag{2}$$

$$\gamma_k = E[(Y_t - \mu)(Y_{t-k} - \mu)] \tag{3}$$
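The three conditions can be checked informally by comparing sample moments over different windows of a series. The following is a minimal sketch, assuming a hypothetical trending series invented for illustration (it is not the chapter's data); the helper names are also illustrative.

```python
from statistics import fmean


def autocovariance(series, k):
    """Sample counterpart of Eq. (3): average of (Y_t - mean)(Y_{t-k} - mean)."""
    n = len(series)
    mu = fmean(series)
    return sum((series[t] - mu) * (series[t - k] - mu) for t in range(k, n)) / n


def window_moments(series, k=1):
    """Return (mean, variance, lag-k autocovariance) for one window of data."""
    return fmean(series), autocovariance(series, 0), autocovariance(series, k)


# Hypothetical series with a linear trend plus an alternating component.
trending = [0.5 * t + (1 if t % 2 else -1) for t in range(40)]
first_half, second_half = trending[:20], trending[20:]

# A stationary series would give similar moments in both windows;
# here the mean shifts, so Eq. (1) is violated.
print(window_moments(first_half))
print(window_moments(second_half))
```

Comparing windows is only a descriptive check; the formal decision in this chapter is made with the unit root test described next.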

The relationship between the current value of the series *Y<sub>t</sub>* and its value in the previous period is given in Eq. (4). Subtracting *Y<sub>t−1</sub>* from both sides and writing γ = ρ − 1 leads to Eqs. (5)–(7):

$$Y_t = \rho Y_{t-1} + e_t \tag{4}$$

$$Y_t - Y_{t-1} = \rho Y_{t-1} - Y_{t-1} + e_t \tag{5}$$

$$\Delta Y_t = (\rho - 1) Y_{t-1} + e_t \tag{6}$$

$$\Delta Y_t = \gamma Y_{t-1} + e_t \tag{7}$$

If ρ = 1 (equivalently, γ = 0) in this equation, the series has a unit root. If ρ = 1, the relationship becomes Eq. (8):

$$Y_t = Y_{t-1} + e_t \tag{8}$$

This means that a shock to the series in a previous period remains in the system undiminished. If ρ < 1 (more precisely, |ρ| < 1), the effect of past shocks persists for some time but disappears gradually.
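The difference between the two cases can be seen by tracing a single unit shock through Eq. (4) with all later shocks set to zero. This is a small illustrative sketch; the function name and the chosen values of ρ are assumptions made for the example.

```python
def shock_path(rho, periods, shock=1.0):
    """Effect of a one-off shock on Y_0, Y_1, ... under Y_t = rho * Y_{t-1} + e_t."""
    path = [shock]
    for _ in range(periods):
        path.append(rho * path[-1])
    return path


print(shock_path(1.0, 5))  # unit root: the shock stays in the system unchanged
print(shock_path(0.5, 5))  # |rho| < 1: the shock halves every period
```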

The main regression models used in the Dickey-Fuller test are:

$$\Delta Y_t = \gamma Y_{t-1} + e_t \tag{9}$$

$$\Delta Y_t = \beta_0 + \gamma Y_{t-1} + e_t \tag{10}$$

$$\Delta Y_t = \beta_0 + \beta_1 t + \gamma Y_{t-1} + e_t \tag{11}$$

Eq. (9) describes a model with neither a constant (fixed) term nor a trend. Eq. (10) includes a constant term but no trend, and Eq. (11) includes both a constant term and a trend.
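To make the test statistic concrete, the sketch below estimates γ in Eq. (9) by ordinary least squares and forms the τ statistic as the ratio of the estimate to its standard error. It is an illustration under stated assumptions: the data are simulated (not the chapter's series), the function names are hypothetical, and the study itself obtained its statistics from EViews.

```python
import math
import random


def simulate_ar1(rho, n, seed):
    """Simulate Y_t = rho * Y_{t-1} + e_t with standard normal shocks and Y_0 = 0."""
    rng = random.Random(seed)
    series = [0.0]
    for _ in range(n):
        series.append(rho * series[-1] + rng.gauss(0.0, 1.0))
    return series


def dickey_fuller_tau(series):
    """OLS estimate of gamma in Eq. (9) and its tau (t-ratio) statistic."""
    lagged = series[:-1]                                   # Y_{t-1}
    delta = [b - a for a, b in zip(series, series[1:])]    # Delta Y_t
    sxx = sum(x * x for x in lagged)
    gamma = sum(x * d for x, d in zip(lagged, delta)) / sxx
    residuals = [d - gamma * x for x, d in zip(lagged, delta)]
    s2 = sum(r * r for r in residuals) / (len(delta) - 1)  # residual variance
    return gamma, gamma / math.sqrt(s2 / sxx)


gamma_rw, tau_rw = dickey_fuller_tau(simulate_ar1(1.0, 500, seed=1))
gamma_ar, tau_ar = dickey_fuller_tau(simulate_ar1(0.5, 500, seed=1))
print(f"random walk: gamma = {gamma_rw:.3f}, tau = {tau_rw:.2f}")
print(f"rho = 0.5:   gamma = {gamma_ar:.3f}, tau = {tau_ar:.2f}")
```

For the simulated random walk the estimate of γ should be close to zero, while for ρ = 0.5 it should be close to −0.5 with a strongly negative τ. The τ statistic does not follow the usual t distribution under the null, which is why special critical values are needed.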

When the error terms are correlated, the augmented Dickey-Fuller (ADF) unit root test, developed by Dickey and Fuller [16], adds lagged values of the dependent variable to the model. The proposed models for this test are shown in the following equations, where the upper limit *p* of the summation determines how many lagged differences are included:

$$\Delta Y_t = \gamma Y_{t-1} + \sum_{i=2}^{p} \beta_i \Delta Y_{t-i+1} + e_t \tag{12}$$

$$\Delta Y_t = \beta_0 + \gamma Y_{t-1} + \sum_{i=2}^{p} \beta_i \Delta Y_{t-i+1} + e_t \tag{13}$$

$$\Delta Y_t = \beta_0 + \beta_1 t + \gamma Y_{t-1} + \sum_{i=2}^{p} \beta_i \Delta Y_{t-i+1} + e_t \tag{14}$$

Eq. (12) describes the model with neither a constant term nor a trend. Eq. (13) includes only a constant term, and Eq. (14) includes both a constant term and a trend.

The stationarity test is first performed on the level of the series. If the level is not stationary, the first difference of the *Y<sub>t</sub>* series is taken. If the series Δ*Y<sub>t</sub>* = *Y<sub>t</sub>* − *Y<sub>t−1</sub>* is stationary, the original series is said to be integrated of order one, denoted I(1). If stationarity cannot be achieved in the first difference, the second difference is taken. Differencing continues until the series becomes stationary.
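Repeated differencing is a simple operation, sketched below on a hypothetical quadratic series chosen so that the result can be followed by hand. The helper name is an assumption made for illustration.

```python
def difference(series, d=1):
    """Apply the first-difference operator d times: Y_t - Y_{t-1}."""
    for _ in range(d):
        series = [b - a for a, b in zip(series, series[1:])]
    return series


quadratic = [t * t for t in range(6)]   # 0, 1, 4, 9, 16, 25
print(difference(quadratic, 1))         # still trending
print(difference(quadratic, 2))         # constant after the second difference
```

Each round of differencing shortens the series by one observation, which matters for short annual or quarterly samples.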

In Eqs. (4) and (7), the null hypothesis *H*<sub>0</sub>: γ = 0 (the series is not stationary) of the unit root test was proposed by Dickey and Fuller [14] and is tested with the τ (tau) statistic. If the error term of the *Y<sub>t</sub>* series is correlated, the augmented Dickey-Fuller (ADF) test is preferred. *H*<sub>0</sub> is rejected if the absolute value of the τ statistic exceeds the absolute value of the MacKinnon [17] critical value ([15], p. 757).

If the ADF test statistic is less negative than the MacKinnon [17] critical values at the chosen significance levels, it is concluded that the series has a unit root; in other words, the series is not stationary. In this study, the stationarity of the series was analyzed using the augmented Dickey-Fuller (ADF) unit root test.
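The decision rule amounts to comparing the τ statistic with a critical value: the unit root null is rejected only when τ lies below (is more negative than) the critical value. In the sketch below, the critical values are the asymptotic Dickey-Fuller values as commonly tabulated and are an assumption of this example; the study itself relies on the MacKinnon [17] critical values reported by EViews, which depend on the sample size.

```python
# Asymptotic Dickey-Fuller critical values as commonly tabulated (assumption;
# finite-sample MacKinnon values differ slightly).
CRITICAL_VALUES = {
    "none": {0.01: -2.58, 0.05: -1.95, 0.10: -1.62},
    "constant": {0.01: -3.43, 0.05: -2.86, 0.10: -2.57},
    "constant_trend": {0.01: -3.96, 0.05: -3.41, 0.10: -3.12},
}


def rejects_unit_root(tau, model="constant", alpha=0.05):
    """True when H0 (unit root) is rejected, i.e. the series is judged stationary."""
    return tau < CRITICAL_VALUES[model][alpha]


print(rejects_unit_root(-4.20))  # more negative than -2.86: stationary
print(rejects_unit_root(-1.30))  # less negative than -2.86: unit root not rejected
```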

As **Table 2** shows, the Import variable was found to be stationary in the first difference, I(1), in the model with an intercept; the Export variable was stationary in the first difference, I(1), in the model without intercept or trend; and the Eximbank loans variable was stationary in the second difference, I(2), in the model with an intercept.

#### **3.3 Johansen cointegration test**

The cointegration test examines whether non-stationary series converge to an equilibrium in the long run, that is, whether there is a long-term relationship between the series. Because the test gives no information about the direction of the relationship, causality tests are used to determine that direction. Johansen's cointegration analysis comprises two tests, the trace test and the maximum eigenvalue (max) test, where *r* is the number of cointegrating vectors.

Trace test: H0: r ≤ r0, H1: r ≥ r0 + 1.

Max test: H0: r = r0, H1: r = r0 + 1.

If r = 0, there is no cointegrating vector.
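Both statistics are computed from the estimated eigenvalues of the Johansen procedure: the trace statistic sums −T ln(1 − λ) over the eigenvalues beyond the first r0, while the max statistic uses only the (r0 + 1)-th largest eigenvalue. The sketch below applies these standard formulas to hypothetical eigenvalues and a hypothetical sample size T; none of the numbers come from the chapter's Table 3.

```python
import math


def trace_statistic(eigenvalues, T, r0):
    """-T * sum of ln(1 - lambda_i) over the eigenvalues ranked below the r0 largest."""
    ordered = sorted(eigenvalues, reverse=True)
    return -T * sum(math.log(1.0 - lam) for lam in ordered[r0:])


def max_eigen_statistic(eigenvalues, T, r0):
    """-T * ln(1 - lambda_{r0+1}), using the (r0 + 1)-th largest eigenvalue."""
    ordered = sorted(eigenvalues, reverse=True)
    return -T * math.log(1.0 - ordered[r0])


eigenvalues = [0.50, 0.20, 0.05]   # hypothetical, for a three-variable system
T = 100                            # hypothetical number of observations
for r0 in range(len(eigenvalues)):
    print(r0, round(trace_statistic(eigenvalues, T, r0), 2),
          round(max_eigen_statistic(eigenvalues, T, r0), 2))
```

Each statistic is then compared with its critical value; testing proceeds from r0 = 0 upwards until the null hypothesis can no longer be rejected.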

#### **Table 2.**

*ADF unit root test.*

The series were analyzed using the Johansen cointegration test, and the results are shown in **Table 3**. In **Table 3**, the hypothesis r = 0 states that there is no cointegration relationship between the variables; r ≥ 1 is the alternative hypothesis that there is at least one cointegration relationship; and r ≥ 2 is the alternative hypothesis that there are at least two cointegration relationships.

#### **Table 3.**

*Johansen cointegration test results.*

According to the Johansen test output, both the trace statistic and the maximum eigenvalue statistic exceed their critical values at the 5% level. Therefore, the null hypothesis r = 0 can be rejected under both tests. In other words, the Export, Gross domestic product (GDP), and Loan variables are cointegrated.

#### **3.4 Granger causality test**

The Granger causality test examines the relationship between series by asking whether past values of one series help to predict the current value of another. According to Granger, if past information about *X<sub>t</sub>* helps to predict *Y<sub>t</sub>*, the *X<sub>t</sub>* series is the Granger cause of *Y<sub>t</sub>*. Conversely, if the past values of *Y<sub>t</sub>* allow *X<sub>t</sub>* to be predicted, the *Y<sub>t</sub>* series is the Granger cause of *X<sub>t</sub>*. If *X<sub>t</sub>* causes *Y<sub>t</sub>* and *Y<sub>t</sub>* causes *X<sub>t</sub>*, there is a bilateral causality relationship. If the series are cointegrated, an error correction model is used to determine the direction of the causality relationship. If they are not cointegrated, standard Granger or Sims tests are used ([18], pp. 213–228).
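The idea behind the standard test can be sketched as an F test that compares a regression of the effect on its own lag (restricted model) with a regression that also includes the lagged candidate cause (unrestricted model). The example below uses one lag, simulated data in which X drives Y by construction, and a small hand-written least squares routine; all names and numbers are assumptions for illustration, and this is not the EViews procedure used in the study.

```python
import random


def ols_ssr(rows, y):
    """Sum of squared residuals from OLS of y on the regressors in rows."""
    k = len(rows[0])
    # Normal equations (X'X) b = X'y, stored as an augmented matrix.
    aug = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
           + [sum(r[i] * v for r, v in zip(rows, y))] for i in range(k)]
    # Gaussian elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda i: abs(aug[i][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, k):
            factor = aug[row][col] / aug[col][col]
            for j in range(col, k + 1):
                aug[row][j] -= factor * aug[col][j]
    # Back substitution for the coefficient vector.
    beta = [0.0] * k
    for i in reversed(range(k)):
        tail = sum(aug[i][j] * beta[j] for j in range(i + 1, k))
        beta[i] = (aug[i][k] - tail) / aug[i][i]
    return sum((v - sum(b * x for b, x in zip(beta, r))) ** 2
               for r, v in zip(rows, y))


def granger_f(cause, effect):
    """F statistic for H0: lagged `cause` does not help predict `effect` (one lag)."""
    y = effect[1:]
    restricted = [[1.0, effect[t - 1]] for t in range(1, len(effect))]
    unrestricted = [[1.0, effect[t - 1], cause[t - 1]] for t in range(1, len(effect))]
    ssr_r, ssr_u = ols_ssr(restricted, y), ols_ssr(unrestricted, y)
    dof = len(y) - 3   # observations minus parameters in the unrestricted model
    return (ssr_r - ssr_u) / (ssr_u / dof)   # one restriction is tested


rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(200)]
y = [0.0] + [0.8 * x[t - 1] + rng.gauss(0.0, 0.1) for t in range(1, 200)]
print("X -> Y:", granger_f(x, y))   # large: lagged X predicts Y
print("Y -> X:", granger_f(y, x))   # small: lagged Y does not predict X
```

A large F statistic (small p-value) leads to rejecting the null of no Granger causality in that direction; running the test both ways distinguishes one-way from two-way causality.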

#### *3.4.1 Determination of appropriate lag length*

Determining the lag length accurately is very important for the Granger causality test to give reliable results, because the test is sensitive to the lag length. To find the appropriate lag length, a generic vector autoregression (VAR) model is estimated first. The lag length is then determined using the Akaike information criterion and the LM test.

For the VAR model, the appropriate lag length was assessed with the LogL (log-likelihood), LR (sequential modified LR test statistic), FPE (final prediction error), AIC (Akaike information criterion), SC (Schwarz information criterion) and HQ (Hannan-Quinn information criterion) criteria. The lag length with the largest LogL and LR values and the smallest FPE, AIC, SC and HQ values was selected.

As **Table 4** shows, the sequential modified LR test statistic (LR), final prediction error (FPE), Akaike information criterion (AIC), Schwarz information criterion (SC) and Hannan-Quinn information criterion (HQ) all indicate an appropriate lag length of 1. The lag length was therefore set to 1.
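The selection logic can be sketched as follows: each criterion penalizes the log-likelihood by the number of estimated parameters, and the lag with the smallest value is chosen. The per-observation formulas below are one common convention and are assumed here rather than taken from the chapter; the log-likelihoods, parameter counts and sample size are hypothetical and do not reproduce Table 4.

```python
import math


def information_criteria(loglik, n_params, T):
    """AIC, SC and HQ in per-observation form for log-likelihood `loglik`."""
    aic = (-2.0 * loglik + 2.0 * n_params) / T
    sc = (-2.0 * loglik + n_params * math.log(T)) / T
    hq = (-2.0 * loglik + 2.0 * n_params * math.log(math.log(T))) / T
    return {"AIC": aic, "SC": sc, "HQ": hq}


def best_lag(candidates, T):
    """candidates maps lag -> (log-likelihood, number of estimated parameters)."""
    scores = {lag: information_criteria(ll, k, T) for lag, (ll, k) in candidates.items()}
    return {name: min(scores, key=lambda lag: scores[lag][name])
            for name in ("AIC", "SC", "HQ")}


# Hypothetical three-variable VAR: 3 * (1 + 3 * lag) coefficients per lag order.
candidates = {0: (-500.0, 3), 1: (-400.0, 12), 2: (-395.0, 21)}
print(best_lag(candidates, T=60))
```

In this made-up example the gain in log-likelihood from the second lag is too small to offset the extra parameters, so all three criteria pick lag 1.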

**Figure 1** shows that the VAR(1) model satisfies the stationarity condition.

#### **Table 4.**

**Figure 1.** *Stationarity analysis.*

#### **Table 5.**

*Granger causality test results.*

Since the autoregressive characteristic roots all lie inside the unit circle, the VAR(1) model used in the study satisfies the stationarity condition. Subsequently, the lag length chosen for the Granger causality test was checked with autocorrelation LM tests, which showed that there was no autocorrelation and that the series was stationary.
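For a VAR(1) model, the roots plotted in such a stationarity figure coincide with the eigenvalues of the coefficient matrix, and the model is stable when every eigenvalue has modulus below one. The sketch below checks this for a two-variable case, where the eigenvalues follow from the quadratic formula; the coefficient matrices are hypothetical and smaller than the three-variable model of the study.

```python
import cmath


def eigenvalues_2x2(a):
    """Eigenvalues of a 2x2 matrix from its trace and determinant."""
    (p, q), (r, s) = a
    trace, det = p + s, p * s - q * r
    disc = cmath.sqrt(trace * trace - 4.0 * det)
    return (trace + disc) / 2.0, (trace - disc) / 2.0


def is_stable(a):
    """True when all eigenvalues lie strictly inside the unit circle."""
    return all(abs(lam) < 1.0 for lam in eigenvalues_2x2(a))


stable_var = [[0.5, 0.1], [0.2, 0.3]]      # hypothetical coefficients
unit_root_var = [[1.0, 0.0], [0.0, 0.5]]   # one eigenvalue equal to 1

print(eigenvalues_2x2(stable_var), is_stable(stable_var))
print(eigenvalues_2x2(unit_root_var), is_stable(unit_root_var))
```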

The series were analyzed using the Granger causality test. As **Table 5** shows, there is no causal relationship from Eximbank to Export (p = 0.2485 > 0.05), from Import to Export (p = 0.1140 > 0.05), from Export to Eximbank (p = 0.3826 > 0.05), from Import to Eximbank (p = 0.0839 > 0.05), from Eximbank to Import (p = 0.98035 > 0.05), or from Export to Import (p = 0.8944 > 0.05).

According to the results in **Table 5**, there is no causal relationship between the Eximbank loans, Import and Export variables at the 1% and 5% significance levels.
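The decision at each significance level is a direct comparison of the reported p-values with the chosen threshold. The snippet below simply restates the p-values quoted in the text as (cause, effect) pairs; the reading of each pair's direction follows the wording above and is an assumption where the original phrasing is ambiguous.

```python
# p-values quoted in the text, keyed by (cause, effect).
P_VALUES = {
    ("Eximbank", "Export"): 0.2485,
    ("Import", "Export"): 0.1140,
    ("Export", "Eximbank"): 0.3826,
    ("Import", "Eximbank"): 0.0839,
    ("Eximbank", "Import"): 0.98035,
    ("Export", "Import"): 0.8944,
}


def significant_pairs(p_values, alpha):
    """Pairs for which the null of no Granger causality is rejected at level alpha."""
    return [pair for pair, p in p_values.items() if p < alpha]


for alpha in (0.01, 0.05):
    print(alpha, significant_pairs(P_VALUES, alpha))
```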

#### **4. Conclusion**

To examine the causal relationship between the import, export and Eximbank loan variables in the Turkish economy, three variables were used in the study. Because all the variables are time series, their stationarity was tested with the ADF test. Stationarity was achieved by taking first differences of the import and export variables and second differences of the Eximbank loans variable. To test whether the non-stationary series converge to equilibrium in the long run, the series were analyzed with the Johansen cointegration test, and the results revealed that the Export, GDP, and Loan variables were cointegrated. The series were then analyzed using the Granger causality test, and no causal relationship was found between the Eximbank loans, Import and Export variables at the 1% and 5% significance levels.

The literature review summarized research that examines the relationship between exports, financial development and economic growth in Turkey in the context of causality. Ceylan and Durkaya [13] found a unidirectional causal relationship from economic growth to loans. Dodaro [3], Bahmani and Domac [4], Tuncer [5], Şimşek [6] and Taştan [8] found a causal relationship from economic growth to exports. Erdoğan [7] found a causal relationship between economic growth and exports at the 10% significance level. Tıraşoğlu [9] and Korkmaz [10] found a causal relationship between exports and economic growth. Pentecost and Kar [11] and Al-Yousif [12] found causal relationships from economic growth to financial development. In this study, however, no causal relationship was found between the Eximbank loans, Import and Export variables at the 1% and 5% significance levels.

Turkey's export target for 2023 is set at 500 billion USD. Based on the export figures at the end of 2015, Turkey must increase exports by an average of 16.5% each year to reach the 2023 target. To achieve this increase, it is necessary to ensure high economic growth, accelerate R&D investments, diversify exports, reach new markets, and provide the regulations and facilities that exporting companies need to compete with exporters in other countries.

Eximbank loans provide a price advantage over other export loans offered by banks. Exim bank has a strong financial structure, which allows it to support exports at a high rate. To realize the country's export potential in international markets as well, it should implement new and effective credit and insurance programs within the limits of international treaties and of the institutions to which it is affiliated.

#### **Author details**

Yüksel Akay Ünvan\* and Ulviyya Nahmatli

Ankara Yıldırım Beyazıt University, Ankara, Turkey

\*Address all correspondence to: aunvan@ybu.edu.tr

© 2021 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


#### **References**

[1] Bülbül S, Demiral A. Türkiye Ekonomisinde Ekonomik Büyüme, İhracat ve Eximbank Kredileri Arasındaki Nedensellik İlişkisi. Marmara Üniversitesi Öneri Dergisi. 2016;**46**(12):22-23

[2] Öztürk S, Sözdemir A, Koçbulut Ö. Türk Eximbank Programlarının Türkiye İhracatına Etkileri ve AB/DTÖ'ye Uygunluğu. Suleyman Demirel Üniversitesi İ.İ.B.F. 2007;**12**(2):180

[3] Dodaro S. Exports and growth: A reconsideration of causality. Journal of Developing Areas. 1993;**27**(2):227-244

[4] Bahmani M, Domac I. Export and economic growth in Turkey: Evidence from cointegration analysis. Middle East Technical University Studies in Development. 1995;**22**(1):67-77

[5] Tuncer İ. Türkiye'de İhracat İthalat ve Büyüme: Toda Yamamoto Yöntemiyle Granger Nedensellik Analizleri (1980–2000). Çukurova Üniversitesi Sosyal Bilimler Enstitüsü Dergisi. 2002;**9**(9):89-104

[6] Şimşek M. İhracata Dayalı Büyüme Hipotezinin Türkiye Ekonomisi Verileri ile Analizi 1960–2002. Dokuz Eylül Üniversitesi İ.İ.B.F Dergisi. 2003;**18**(2):43-63

[7] Erdoğan S. Türkiye'nin İhracat Yapısındaki Değişme ve Büyüme İlişkisi: Koentegrasyon ve Nedensellik Testi Uygulaması. Selçuk Üniversitesi Karaman İ.İ.B.F. Dergisi. 2006;**10**(9):30-38

[8] Taştan H. Türkiye'de İhracat, İthalat ve Ekonomik Büyüme Arasındaki Nedensellik İlişkilerinin Spektral Analizi. Ekonomi Bilimleri Dergisi. 2010;**2**(1):87-96

[9] Tıraşoğlu M. Türkiye Ekonomisinde İhracata Dayalı Büyüme Hipotezinin Yapısal Kırılmalı Birim Kök ve Eş bütünleşme Testleri ile İncelenmesi. İstanbul Üniversitesi İktisat Fakültesi Mecmuası. 2012;**62**(2):373-392

[10] Korkmaz S. Türkiye Ekonomisinde İhracat ve Ekonomik Büyüme Arasındaki Nedensellik İlişkisi. Business and Economics Research Journal. 2014; **5**(4):119-128

[11] Pentecost EJ, Kar M. Financial Development and Economic Growth in Turkey: Further Evidence on the Causality Issue. Leicestershire, UK: Loughborough University Department of Economics; 2000. pp. 3-13

[12] Al-Yousif YK. Financial development and economic growth: Another look at the evidence from developing countries. Review of Financial Economics. 2002; **11**(2):131-150

[13] Ceylan S, Durkaya M. Türkiye'de Kredi Kullanımı Ekonomik Büyüme İlişkisi. Atatürk Üniversitesi İ.İ.B.F. Dergisi. 2010;**24**(2):21-33

[14] Dickey DA and Fuller WA. Distribution of the estimators for autoregressive time series with a unit root. Journal of the American Statistical Association. 1979;**74**:427-431

[15] Gujarati DM. Basic Econometrics. Tata McGraw-Hill Education. 2012: 755-757

[16] Dickey DA, Fuller WA. Distribution of the estimators for autoregressive time series with a unit root. Econometrica. 1981;**49**:1057-1072

[17] Mackinnon JG. Numerical Distribution Functions for Unit Root and Cointegration Tests. Journal of Applied Econometrics. 1996;**11**:601-618

[18] Granger CWJ, Escribano A. Limitation on the Long-Run. Relationship Between Prices from an Efficient Market, UCSD Discussion Paper; 1986

## *Edited by Ricardo López-Ruiz*

Nature evolves mainly in a statistical way. Different strategies, formulas, and conformations continuously confront one another in natural processes. Some are selected, and evolution then continues with a new round of confrontation for the next generation of phenomena and living beings. Failures are corrected without any prior program or design. The new options generated by different statistical and random scenarios lead to solutions for surviving present conditions. This is the general picture at every level at which life cycles are examined. Over three sections, this book examines statistical questions and techniques in the context of machine learning and clustering methods, the frailty models used in survival analysis, and other applications of statistics to diverse problems.

Published in London, UK © 2022 IntechOpen © Kittiphat Abhiratvorakul / iStock

Computational Statistics and Applications
