### **2. Methods and results**

#### **2.1 Using data to derive the p-values and lower confidence interval bounds**

We used the pooled 2010–2011 and 2014–2015 TUS-CPS data for adult daily smokers (n = 34,728) who reported the price of the last self-purchased pack or carton of cigarettes. The reported prices were used to compute the (average) PPP. The overall cohort was representative of about 23,370,261 adult daily smokers, where 12% were 18–24 years old, 38% were 25–44 years old, and 50% were 45+ years old, and 54% were men and 47% were women. The racial/ethnic representation was as follows: 76% were W, 11% were BAA, 8% were H, 2% were MULT, 2% were ASIAN, 1% were AIAN, and less than 1% were HPI. All racial/ethnic groups were well represented in the sample: the smallest number of respondents (96) corresponded to HPI daily smokers. Additional sample characteristics have been described in a prior study of purchasing cigarettes on Indian reservations [33].

We fixed the overall error rate at α = 5% and fitted a design-based multiple linear regression (R² ≈ 30%, F(25, 160) ≈ 257, p < 0.0001) to model the mean PPP as a function of daily smokers' characteristics, location of the purchase (on/off Indian reservation), survey mode (phone, in-person), and survey period (2010–2011, 2014–2015). The daily smokers' characteristics included race/ethnicity, age, sex, marital status, education, employment status, region of residency (West, South, Midwest, and Northeast), metropolitan area of residency (metro, nonmetro), and heavy smoking indicator. The analysis incorporated statistical methods recommended in the methodological guidelines for analysis of the CPS and CPS supplements [34, 35]. Specifically, because the CPS incorporates complex sampling, we estimated variance using balanced repeated replications [36]. The main and 160 replicate weights for this approach have been made available for public use by the U.S. Census Bureau [34, 35]. The analysis was performed using SAS®9.4 software [37]; the SAS®9.4 Survey Package procedures suitable for analysis of TUS-CPS have been discussed elsewhere [38]. **Table 1** reports the estimated model coefficients and their standard errors for all covariates. As shown in **Table 1**, smokers' sex and survey mode (phone, in-person) were not statistically significant.
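The replicate-weight idea behind the variance estimation can be illustrated with a minimal sketch. It assumes the generic form in which the estimate is recomputed under each set of replicate weights and the squared deviations from the full-sample estimate are summed and scaled. The function names, the toy prices, the four toy replicate weights, and the scaling constant 1/R (the standard balanced-repeated-replication constant) are illustrative assumptions, not the TUS-CPS data or the constant prescribed by the CPS guidelines [34, 35].

```python
import math


def weighted_mean(values, weights):
    """Weighted mean of `values` using survey weights `weights`."""
    total_weight = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total_weight


def replicate_variance(values, main_weights, replicate_weights, scale):
    """Replicate-weight variance of a weighted mean.

    The estimate is recomputed under every replicate weight vector; the
    variance is `scale` times the sum of squared deviations from the
    full-sample estimate. `scale` depends on the replication method.
    """
    theta = weighted_mean(values, main_weights)
    replicates = [weighted_mean(values, w) for w in replicate_weights]
    variance = scale * sum((t - theta) ** 2 for t in replicates)
    return theta, variance


# Toy data (hypothetical): four reported prices and four half-sample replicates.
prices = [5.0, 6.0, 7.0, 8.0]
main_w = [1.0, 1.0, 1.0, 1.0]
rep_w = [
    [2.0, 0.0, 2.0, 0.0],
    [0.0, 2.0, 0.0, 2.0],
    [2.0, 0.0, 0.0, 2.0],
    [0.0, 2.0, 2.0, 0.0],
]

estimate, variance = replicate_variance(prices, main_w, rep_w, scale=1.0 / len(rep_w))
std_error = math.sqrt(variance)
print(f"estimate = {estimate:.2f}, SE = {std_error:.4f}")
```

In an actual analysis the 160 replicate weights supplied by the U.S. Census Bureau would take the place of the toy replicates, and the scaling constant should follow the cited guidelines.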

**Table 1** presents the individual p-values for comparisons of racial/ethnic populations of daily smokers versus W daily smokers (based on the model): $p_{\text{AIAN}} < 0.0001$, $p_{\text{ASIAN}} < 0.0001$, $p_{\text{BAA}} < 0.0001$, $p_{\text{H}} < 0.0001$, $p_{\text{HPI}} = 0.0002$, and $p_{\text{MULT}} = 0.0087$. The individual lower 95% confidence interval bounds for the mean PPP difference for each racial/ethnic population of daily smokers relative to W daily smokers were computed using the formula:

$$L_i = \hat{d}_i - t_{0.05,\, df=160}\, \text{SE}(\hat{d}_i), \quad i = \text{AIAN}, \text{ASIAN}, \text{BAA}, \text{H}, \text{HPI}, \text{MULT}, \tag{5}$$

where $\hat{d}_i$ denotes the estimated mean PPP difference relative to W daily smokers (the point estimate for the $i$th mean difference), $\text{SE}(\hat{d}_i)$ is the standard error of the estimate (computed using the balanced repeated replications), and $t_{0.05,\, df=160} = 1.6544$ is the 95th percentile of the central *t*-distribution with 160 degrees of freedom (the number of degrees of freedom matches the number of the replicate weights) [34–36]. We note that there are alternative methods to construct the lower bounds, for example, using the standard normal distribution instead of the central *t*-distribution [34, 36].
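As a worked illustration of formula (5), the following minimal sketch evaluates the lower bound for each group from the coefficients and standard errors reported in Table 1 and the critical value 1.6544 quoted above. The names `ESTIMATES`, `T_CRIT`, and `lower_bound` are introduced here for illustration only. Because Table 1 reports values rounded to two decimals, the results may differ in the second decimal from bounds computed on unrounded estimates.

```python
# Point estimates and standard errors copied from Table 1 (rounded to 2 decimals).
ESTIMATES = {
    "AIAN": (0.61, 0.13),
    "ASIAN": (0.62, 0.09),
    "BAA": (0.51, 0.04),
    "H": (0.61, 0.06),
    "HPI": (0.83, 0.22),
    "MULT": (0.22, 0.08),
}

# 95th percentile of the central t-distribution with 160 df, as quoted in the text.
T_CRIT = 1.6544


def lower_bound(d_hat, se, t_crit=T_CRIT):
    """Lower confidence bound L_i = d_hat - t * SE(d_hat), as in formula (5)."""
    return d_hat - t_crit * se


lower_bounds = {group: lower_bound(d, se) for group, (d, se) in ESTIMATES.items()}

for group, bound in lower_bounds.items():
    print(f"{group:5s} lower 95% bound: {bound:.3f}")
```

Every bound computed this way is positive, which is the pattern the individual comparisons in the text rely on.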

#### **Table 1.**

*Design-based multiple linear regression for the mean cigarette price per pack.*

| Factor | Estimated coefficient | Standard error | p-Value |
|---|---|---|---|
| **Intercept** | 3.64 | 0.12 | \* |
| **Race/ethnicity** (reference group is W) | | | |
| AIAN versus W | 0.61 | 0.13 | \* |
| ASIAN versus W | 0.62 | 0.09 | \* |
| BAA versus W | 0.51 | 0.04 | \* |
| H versus W | 0.61 | 0.06 | \* |
| HPI versus W | 0.83 | 0.22 | 0.0002 |
| MULT versus W | 0.22 | 0.08 | 0.0087 |
| **Age** (reference group is 45+ years old) | | | |
| 18–24 years old versus 45+ years old | 0.19 | 0.04 | \* |
| 25–44 years old versus 45+ years old | 0.20 | 0.02 | \* |
| **Sex** | | | |
| Female versus male | 0.00 | 0.02 | 0.9052 |
| **Marital status** (reference group is widowed/divorced/separated) | | | |
| Married (living with a spouse) | 0.02 | 0.02 | 0.4878 |
| Never married | 0.22 | 0.03 | \* |
| **Highest level of education** (reference group is some college/Bachelor's degree) | | | |
| Graduate degree | 0.11 | 0.08 | 0.1632 |
| High school/equivalent | −0.14 | 0.02 | \* |
| Less than high school | −0.15 | 0.03 | \* |
| **Employment status** (reference group is unemployed) | | | |
| Employed (at work or absent) versus unemployed | 0.16 | 0.03 | \* |
| Not in labor force versus unemployed | −0.15 | 0.04 | \* |
| **Place where cigarettes were purchased** (on Indian reservation; reference group is "yes") | | | |
| No versus yes | 1.57 | 0.12 | \* |
| **U.S. region of residency** | | | |
| Midwest versus West | 0.09 | 0.03 | 0.0091 |
| Northeast versus West | 1.75 | 0.05 | \* |
| South versus West | −0.62 | 0.03 | \* |
| **Metropolitan area of residency** | | | |
| Metropolitan area versus nonmetropolitan area | 0.32 | 0.03 | \* |
| **Heavy smoking indicator** | | | |
| Heavy (20+ cigarettes per day) versus non-heavy smoker | −0.20 | 0.02 | \* |
| **Survey mode** | | | |
| Personal interview versus phone interview | −0.01 | 0.02 | 0.5205 |
| **Survey period** | | | |
| 2010–2011 versus 2014–2015 | −0.38 | 0.02 | \* |

\* *p-value < 0.0001.*

#### **Figure 1.**

*Individual lower 95% confidence intervals for the mean price per pack differences relative to non-Hispanic (NH) White daily smokers; the lower number corresponds to the lower bound and the upper number corresponds to the point estimate for the mean difference. For example, AIAN daily smokers, on average, pay at least \$0.39 more per pack of cigarettes than do NH White daily smokers, and the point estimate for the difference is \$0.61.*

**Figure 1** depicts the lower bounds $L_i$ and the estimated mean differences $\hat{d}_i$ for all racial/ethnic populations (relative to the W population). These bounds were computed using the *proc surveyreg* procedure with *lsmestimate* statements (with the "cl," "e," "upper," and "alpha = 0.05" options) when fitting the model in SAS software. Alternatively, we could use the *lsmeans* statement (with the "adj = bon," "cl," and "alpha = 0.1" options), select the comparisons of interest from all 21 pairwise comparisons reported, and note the lower bound of the two-sided 90% confidence interval in the output.

#### **2.2 Demonstrating the study goal via the Min test and SBH confidence interval**

The p-value for the Min test is $p = 0.0087$ (the largest of the six individual p-values), indicating that at the 5% significance level we reject the null hypothesis in favor of the alternative. The corresponding SBH lower 95% confidence interval bound for the mean PPP difference is \$0.08 (see **Figure 1**). Therefore, all six racial/ethnic groups of daily smokers paid, on average, a higher PPP than W daily smokers in the United States in the 2010–2011 and 2014–2015 periods.
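The two quantities in this paragraph can be reproduced with a minimal sketch, assuming that the Min test p-value equals the largest individual p-value and that the SBH lower bound equals the smallest individual lower bound; both assumptions agree with the values reported above. The names `P_VALUES`, `min_test`, and `sbh_lower_bound` are hypothetical, the "< 0.0001" entries of Table 1 are replaced by the upper limit 0.0001 for illustration, and the rounded Table 1 inputs mean the bound may differ slightly from the published \$0.08.

```python
ALPHA = 0.05
T_CRIT = 1.6544  # 95th percentile of t with 160 df, as quoted in the text

# Individual p-values from Table 1; "< 0.0001" is replaced by 0.0001 (assumption).
P_VALUES = {
    "AIAN": 0.0001, "ASIAN": 0.0001, "BAA": 0.0001,
    "H": 0.0001, "HPI": 0.0002, "MULT": 0.0087,
}

# (estimate, standard error) pairs from Table 1, rounded to 2 decimals.
ESTIMATES = {
    "AIAN": (0.61, 0.13), "ASIAN": (0.62, 0.09), "BAA": (0.51, 0.04),
    "H": (0.61, 0.06), "HPI": (0.83, 0.22), "MULT": (0.22, 0.08),
}


def min_test(p_values, alpha=ALPHA):
    """Return the Min test p-value (largest individual p-value) and the decision."""
    p_overall = max(p_values.values())
    return p_overall, p_overall < alpha


def sbh_lower_bound(estimates, t_crit=T_CRIT):
    """Smallest individual lower bound, taken as the bound for the smallest difference."""
    return min(d - t_crit * se for d, se in estimates.values())


p_min, reject = min_test(P_VALUES)
sbh = sbh_lower_bound(ESTIMATES)
print(f"Min test p-value = {p_min}, reject H0: {reject}")
print(f"SBH lower bound (from rounded inputs) = {sbh:.3f}")
```

A single overall decision comes out of this procedure: either all six differences are shown to be positive or the goal is not demonstrated.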

If we used the Bonferroni approach instead of the Min test, the adjusted p-values would be less than 0.0006 for four comparisons (AIAN versus W, ASIAN versus W, BAA versus W, and H versus W), 0.0012 for one comparison (HPI versus W), and 0.0522 for one comparison (MULT versus W). We would therefore conclude only that AIAN, ASIAN, BAA, H, and HPI daily smokers pay a higher PPP, on average, than W daily smokers, and we would fail to demonstrate that all six racial/ethnic groups considered pay a higher PPP, on average, than W daily smokers.
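The Bonferroni arithmetic above can be checked with a short sketch: each individual p-value is multiplied by the number of comparisons (six) and capped at 1. The function name `bonferroni_adjust` is hypothetical, and the "< 0.0001" entries of Table 1 are again replaced by the upper limit 0.0001, so the corresponding adjusted values are upper limits as well.

```python
ALPHA = 0.05

# Individual p-values from Table 1; "< 0.0001" is replaced by 0.0001 (assumption).
P_VALUES = {
    "AIAN": 0.0001, "ASIAN": 0.0001, "BAA": 0.0001,
    "H": 0.0001, "HPI": 0.0002, "MULT": 0.0087,
}


def bonferroni_adjust(p_values):
    """Multiply each p-value by the number of comparisons, capping at 1."""
    m = len(p_values)
    return {group: min(1.0, m * p) for group, p in p_values.items()}


adjusted = bonferroni_adjust(P_VALUES)
significant = [group for group, p in adjusted.items() if p < ALPHA]

for group, p in adjusted.items():
    print(f"{group:5s} adjusted p = {p:.4f}")
print("Significant after Bonferroni:", significant)
```

The MULT comparison is the only one whose adjusted p-value exceeds 0.05, which is why the Bonferroni approach does not support the claim for all six groups.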

### **3. Discussion**

The choice of the reference group as "W daily smokers" was based on the study goal and prior studies of cigarette purchasing behaviors of smokers [1, 33]. The choice of the reference group, as well as the statistical methods, should always align with the study goal and should be made prior to the data analysis. Specifically, when examining racial/ethnic disparities, using "W" as the reference group may be logical in some studies but not in others. For example, if the study goal is to show that purchasing cigarettes on Indian reservations is most prevalent among AIAN smokers, then "AIAN smokers" should be chosen as the reference group. In addition, while both the Bonferroni method and the Min test are simple to use in practice, only the Bonferroni method yields individual conclusions regarding each comparison. However, the Bonferroni method is less powerful than the Min test when applied to an intersection-union problem (to assess Goal 2) [6, 12].

The study indicated that W daily smokers paid significantly less for cigarettes, on average, than the other six racial/ethnic groups of daily smokers in the United States in the 2010–2011 and 2014–2015 periods. The earlier reported finding (see model 6 in [1]) was that non-Hispanic White smokers, on average, paid significantly less for cigarettes than did BAA, AIAN, ASIAN/HPI (combined), and H smokers, and paid prices similar to those paid by "other non-Hispanic" smokers [1]. While the results might seem to disagree, direct comparisons between these two findings are problematic because the studies concerned different populations of smokers (daily smokers in our study, and daily and occasional smokers in the prior study) and different time periods (2010–2011 and 2014–2015 combined in our study, and 2010–2011 in the prior study). Moreover, although the authors did not mention the method they used to adjust for multiple comparisons (if any), they considered the union-intersection problem, which is conceptually different from the intersection-union problem addressed in our study [1].

Our study has several potential limitations. First, we considered the population of daily smokers, and thus the results should not be generalized to other populations of smokers such as occasional smokers. Indeed, daily and occasional smokers have very different cigarette purchasing behaviors; for example, daily smokers are more likely to purchase cigarettes in cartons rather than packs and to travel to another state or to Indian reservations to purchase cigarettes at lower prices [1, 39, 40]. Second, the analysis was based on a particular regression model in which the mean PPP was modeled as a function of smokers' characteristics, location of the purchase, survey mode, and survey period. Another model could potentially lead to a different conclusion; for example, only two out of six models indicated a significantly higher mean PPP for AIAN smokers relative to W smokers [1]. Another potential limitation is the lack of a theoretical proof that the SBH interval for the smallest mean PPP difference indeed has a confidence level of $100(1 - \alpha)\%$. The probability coverage of the SBH confidence interval depends on the probability coverage of the individual confidence intervals for the mean differences [23]. Because we used the statistical methods outlined in the CPS methodological guidelines for constructing the individual intervals, we believe that the resulting SBH interval has probability coverage close to the $100(1 - \alpha)\%$ level.

Future research may target the development and implementation of procedures for the Min test and SBH interval. Specifically, the software packages developed for analysis of complex survey data currently offer just a few multiple comparison methods. For example, the SAS Survey Package offers a built-in procedure for Bonferroni adjustments but lacks procedures for multiple testing (and interval estimation) such as the Min test (and SBH interval). Availability of "Min test" and "SBH interval" procedures would enable researchers to incorporate these methods directly in their analyses of complex survey data.

### **4. Conclusion**

In our study, the results of the Min test (and SBH interval) were different from the results of the Bonferroni method. Specifically, using the Min test (and SBH
