52 Quality Control in Laboratory

The number of replication studies to perform can be calculated based on the number of acceptable failures. The sample size calculation is based on set levels of confidence and reliability. Confidence (accuracy) is the difference between 1 and the type I error rate. Reliability is the degree of precision. For a failure rate of 0 (i.e. we are not allowing any incorrect results), the calculation of the sample size is based on the following equation:

$$n = \frac{\ln\left(1 - \text{confidence}\right)}{\ln\left(\text{reliability}\right)} \tag{1}$$

The confidence level is often set at 0.95 and reliability at 0.90 or 0.80. If we allow failure events, the equation can be stated as:

$$1 - \text{Confidence} = \sum\_{i=1}^{f} \binom{n}{i} \left(1 - \text{Reliability}\right)^i \text{Reliability}^{n-i} \tag{2}$$

where f is the failure rate and n is the sample size.

In a Levey-Jennings plot the X-axis represents time and the Y-axis represents the measured value. Reference lines are drawn parallel to the X-axis corresponding to the mean, mean ±1 standard deviation, mean ±2 standard deviations, and mean ±3 standard deviations. The next step is to plot measured values of the reference material for each run on the plot (Figure 2).

Figure 2. An example of a Levey-Jennings plot. The X-axis plots the time of measurement (e.g. day) and the Y-axis plots the measurement value for that unit of time. The lines denoting the mean value and 1, 2 and 3 standard deviations from the mean are explained in the figure.

#### 2.2. Westgard rules

Westgard rules are a set of guidelines set by Dr. James Westgard for the identification of random and systematic error in laboratory quality control experiments. They are based on repeated measurements of at least two reference samples with each analytical run. Some of the Westgard rules are illustrated in Figure 3.

#### 2.3. Method comparison

Method comparison is used for initial assay validation as well as for studying the accuracy of a test. The aim of method comparison is to establish whether the assay measures what it is supposed to measure and how accurately it measures it. The findings of method comparison also allow for correction of the results if a bias is found (i.e. calibration). The principle of method comparison is that a gold standard or a standard reference material exists wherein the amount of analyte in the sample is exactly known (or known with a high degree of accuracy). We can use this reference standard as a comparator against the performance of our assay and determine the degree of bias that exists in our measurements. This essentially means that we are measuring the relative performance of our assay against the reference standard.

Ideally, identification of a bias should lead to a search for the source of the bias and systematic error, and attempts should be made to rectify the cause of the observed bias. However, there are instances in which no fault or solvable problem is identified; in these instances, if the assay has enough precision and stability as well as clinical merit then we can use the findings of method comparison to adjust for the observed bias.

Figure 3. Examples of systematic error in a Levey-Jennings plot: A. An example of the 2-2S rule, B. An example of the 4-1S rule, C. An example of the 10x rule.

Bias can take two general forms: constant bias and proportional bias. Constant bias is a difference between the observed measurement and the expected measurement that is constant throughout the range of the observations. Constant bias (β0) is represented in regression statistics as the intercept. Proportional bias (β1), on the other hand, is proportional to the observed value of the measurement and varies across the range of measurements. Proportional bias is represented in regression statistics as the slope of the regression line. If the expected value of measurement is Yi for each sample i, and the observed value of measurement for sample i is Xi, then we can form a linear regression between the expected values and observed values:

$$Y\_i = \beta \mathbf{0} + \beta \mathbf{1} \ X\_i + \varepsilon\_i \tag{3}$$


Systematic Error Detection in Laboratory Medicine http://dx.doi.org/10.5772/intechopen.72311 55


where εi is the random error of the expected observations under the Youden assumption, which states that the random error of observed values is smaller than the random error for expected values.

The regression formula is the representation of the best regression line that shows the relationship of the observed value to the expected value. Figure 4 shows the regression lines for different constant and proportional bias levels.

If no bias exists then Yi = Xi.

The simple linear regression formula allows us to calculate the constant and proportional bias using a simple unweighted ordinary least squares estimator. In ordinary least squares (OLS) models, different candidate values for the parameter vector β1 are tested to create regression lines. Then for each i-th observation the residual for that observation is calculated by measuring the vertical distance between the data point (Yi, Xi) and the regression line formed using the candidate value. The sum of squared residuals (SSR) is determined as a measure of the overall model fit. The candidate value that minimizes the sum of squared residuals is considered as the OLS estimator for the slope. For simple method comparison studies where only two comparators are present the model can be simplified as:

$$\beta 1 = \frac{\sum X\_i Y\_i - \frac{1}{n} \sum X\_i \sum Y\_i}{\sum X\_i^2 - \frac{1}{n} \left(\sum X\_i\right)^2} = \frac{\text{Covariance}(X, Y)}{\text{Variance}(X)} \tag{4}$$

The constant bias can be calculated by subtracting the mean observed value, weighted by the proportional bias, from the mean expected value:

$$
\beta \mathbf{0} = \overline{Y} - \beta \mathbf{1} \,\overline{X} \tag{5}
$$
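To make Eqs. (4) and (5) concrete, the slope and intercept can be computed directly from paired results. A minimal Python sketch; the data values below are hypothetical, invented purely for illustration:

```python
# Hypothetical method-comparison data: Y = expected (reference) values,
# X = observed (assay) values, as defined for Eq. (3).
Y = [1.0, 2.0, 3.0, 4.0, 5.0]
X = [0.9, 2.1, 2.9, 4.2, 5.1]

n = len(X)

# Eq. (4): proportional bias = Covariance(X, Y) / Variance(X)
cov_xy = sum(x * y for x, y in zip(X, Y)) - sum(X) * sum(Y) / n
var_x = sum(x * x for x in X) - sum(X) ** 2 / n
beta1 = cov_xy / var_x

# Eq. (5): constant bias = mean(Y) minus the slope-weighted mean(X)
beta0 = sum(Y) / n - beta1 * sum(X) / n
```

A slope near 1 together with an intercept near 0 indicates little proportional or constant bias.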

Figure 4. A. When no systematic error exists. B. Shows constant bias. C. Shows a proportional bias.

Constant and proportional bias usually have different root causes. Constant bias often stems from insufficient blank sample correction and is fairly easy to address and rectify. Proportional problems can sometimes be caused by a difference in the composition of the calibrator samples and the standard samples or biologic test matrices. The matrix of the reference standard is usually near the actual matrix of the patient samples and thus may contain confounders which may adversely affect the measurement. Yet calibrators often do not have a biologic matrix. If the source of the proportional bias is due to calibration problems, then a recalibration can rectify the problem.

The problem with the Youden assumption is that it considers our observations to have no random disruptions, an assumption which is false as we know every measurement is associated with a degree of uncertainty and imprecision. Alternatively, we can use Deming's regression where the random error for both expected and observed values is factored into the calculation of the proportional and constant bias. In Deming's regression a ratio of the variances of the random error of observed and expected values is calculated:

$$
\delta = \frac{\sigma\_\epsilon^2}{\sigma\_\eta^2} \tag{6}
$$

where σε² is the variance of the random error of the expected values and ση² is the variance of the random error of the observed values. Using this ratio, the Deming estimator for the proportional bias can be given by:

$$\beta 1 = \frac{\left(Var(Y) - \delta Var(X)\right) + \sqrt{\left(Var(Y) - \delta Var(X)\right)^2 + 4\delta\, Covar(X, Y)^2}}{2\, Covar(X, Y)} \tag{7}$$

This regression formula is also known as the maximum likelihood estimator [9].
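The Deming slope in Eq. (7) differs from the OLS slope only through δ; a minimal sketch under the common assumption δ = 1 (equal imprecision for both methods), again with invented illustrative data:

```python
import statistics as st

# Hypothetical paired measurements: Y = expected, X = observed
Y = [1.0, 2.0, 3.0, 4.0, 5.0]
X = [0.9, 2.1, 2.9, 4.2, 5.1]

delta = 1.0  # Eq. (6): ratio of the two error variances; 1.0 assumes they are equal

n = len(X)
mean_x, mean_y = st.mean(X), st.mean(Y)
var_x, var_y = st.variance(X), st.variance(Y)
cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(X, Y)) / (n - 1)

# Eq. (7): Deming estimate of the proportional bias
d = var_y - delta * var_x
beta1 = (d + (d ** 2 + 4 * delta * cov_xy ** 2) ** 0.5) / (2 * cov_xy)
beta0 = mean_y - beta1 * mean_x
```

In practice δ would be estimated from replicate measurements of each method rather than assumed.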

If a linear relation between errors and measurements exists (or is assumed) then an alternative method for error detection is to create Bland-Altman plots. In these plots, the average of the paired expected and observed values is plotted on the x-axis and the difference of each pair is plotted on the y-axis. In this method the average difference of the values is called bias, and the standard deviation of the differences is also calculated to determine the limits of agreement, which constitute the mean difference ±1.96 SD.

The Bland-Altman approach allows for a visual inspection of the proportional bias. However, by dividing the limits of agreement by the mean value of the expected values we can obtain a metric called percentage error. The acceptable percentage error levels for different analytes have been determined and are standardized. In cases where the percentage error exceeds the acceptable levels, corrective action is needed for the detected bias [10].
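The Bland-Altman quantities can be computed in a few lines. A sketch with hypothetical paired results; note that the percentage error is read here as the half-width of the limits of agreement relative to the mean expected value, which is one plausible reading of the wording above:

```python
import statistics as st

# Hypothetical paired expected/observed results
expected = [10.2, 11.8, 9.9, 10.5, 12.1, 11.0]
observed = [10.6, 12.1, 10.3, 10.4, 12.6, 11.5]

diffs = [o - e for o, e in zip(observed, expected)]
means = [(o + e) / 2 for o, e in zip(observed, expected)]  # x-axis of the plot

bias = st.mean(diffs)                         # average difference
sd = st.stdev(diffs)                          # SD of the differences
loa = (bias - 1.96 * sd, bias + 1.96 * sd)    # limits of agreement

# Percentage error (assumed definition: LoA half-width over mean expected value)
pct_error = 1.96 * sd / st.mean(expected) * 100
```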

#### 2.4. R statistics


One of the important statistics for simple linear regression is the calculation of Pearson's r coefficient. This coefficient shows how well the compared results change together and can take values between −1 and 1. It can be calculated by dividing the covariance of the two variables by the product of their standard deviations:

$$r = \frac{\text{Covar}(X, Y)}{\sigma\_X \sigma\_Y} \tag{8}$$


The closer the r coefficient gets to 1, the greater the linear relationship is between the two variables. Some interpret the r coefficient as a measure of correlation with r coefficients more than 0.8 showing correlation. However, in laboratory medicine a correlation of 0.8 actually signifies a great degree of bias. In fact, laboratories should aim for a perfect degree of linearity (r > 0.99) to ensure that systematic error is minimized. Attaining a Pearson's r coefficient of <0.975 signals the presence of systematic error and should prompt the lab to conduct further investigation (using t-test and f-test) to determine the source of this error.

The degree of agreement is given by the coefficient of determination (R²). This coefficient is calculated from the ratio of explained variance to the total variance of Y:

$$R^2 = \frac{\sum\left(\widehat{Y}\_i - \overline{Y}\right)^2}{\sum\left(Y\_i - \overline{Y}\right)^2} \tag{9}$$

where Ŷi is the calculated value of Y based on the regression for the i-th observation and Yi is the actual value of Y for the i-th observation.

Alternatively, the coefficient of determination can be simply calculated by squaring the Pearson's r coefficient. While the Pearson's r coefficient shows the presence of linearity, the coefficient of determination helps us to determine how well the regression line fits the actual data points. In assessment of a method comparison, evaluating this coefficient is necessary as it shows the fit of the model: the closer the coefficient gets to 1, the better the regression line fits the actual data points. However, it must be noted that even at values very close to 1 significant bias may exist. For example, a 5% bias will only result in an R squared score of 0.99 and a 10% bias will result in an R squared score of 0.96. For laboratory medicine purposes we should aim for an R squared score of more than 0.99.
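Both statistics follow directly from Eq. (8); a minimal sketch on hypothetical method-comparison pairs:

```python
import statistics as st

# Hypothetical method-comparison pairs
X = [0.9, 2.1, 2.9, 4.2, 5.1]   # observed
Y = [1.0, 2.0, 3.0, 4.0, 5.0]   # expected

n = len(X)
mx, my = st.mean(X), st.mean(Y)
cov_xy = sum((x - mx) * (y - my) for x, y in zip(X, Y)) / (n - 1)

r = cov_xy / (st.stdev(X) * st.stdev(Y))   # Eq. (8): Pearson's r
r_squared = r ** 2                          # coefficient of determination

# Targets discussed above: aim for r > 0.99; r < 0.975 should trigger
# further investigation with the t-test and f-test
needs_investigation = r < 0.975
```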

#### 2.5. T-test and F-test

In cases where there is a suspicion of significant bias (as determined by the Pearson's r or R squared statistics), we should determine whether the bias stems from a difference in the mean assay concentration or in the variance of the assay. To check the mean we run a paired t-test, and to check the variance we run an f-test.

The paired t-test is performed by comparing the means of the observed and expected values; more specifically, the mean difference of the values (μD) is used for the comparison. The t-statistic can be calculated by:

$$t = \frac{\mu\_D}{\sigma\_D / \sqrt{n}} \tag{10}$$

where n is the number of data points and σD is the standard deviation of the differences. To determine the significance of the result (the p-value), the t-statistic should be looked up in a t table corresponding to the degrees of freedom; the degrees of freedom in a paired t-test equal n−1.
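Eq. (10) can be sketched as follows; the paired values are hypothetical, and the resulting t would be compared against a t table at n−1 degrees of freedom:

```python
import statistics as st

# Hypothetical paired observed/expected results
observed = [10.6, 12.1, 10.3, 10.4, 12.6, 11.5]
expected = [10.2, 11.8, 9.9, 10.5, 12.1, 11.0]

diffs = [o - e for o, e in zip(observed, expected)]
n = len(diffs)

mu_d = st.mean(diffs)    # mean difference
sd_d = st.stdev(diffs)   # standard deviation of the differences

t = mu_d / (sd_d / n ** 0.5)   # Eq. (10); look up at n - 1 degrees of freedom
```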

A t-test with a significant p-value signifies the presence of a significant bias in the mean of the methods. The next step then would be to determine whether the systematic error represents a constant bias or a proportional bias. This can be done by examining the regression curve or equation. The presence of an intercept signifies a constant bias while presence of a slope other than 1 signifies proportional error. The correction for a constant bias is simple and would require adding the constant to the measurement results. Correction of the proportional bias, however, requires a recovery experiment as described in Section 3.8 below.

The f-test compares the expected variance of the values to the observed variance; while the t-test compares the centroid of the data points (the mean), the f-test deals with the distribution and variance of the data points (the variance). The t-test is more sensitive to differences in the values in the middle of the data range while the f-test is more sensitive to differences in the extremes of the data range. A significant f-test would signify random error in the measurement, or in other words imprecision. To calculate the f-test the following equation is used (the larger of the two variances is always the numerator and the smaller one the denominator in this fraction):

$$f = \frac{Var\_1}{Var\_2} \tag{11}$$

The degrees of freedom of the f-test are (n−1, n−1) and the significance threshold can be looked up in an f-table corresponding to the degrees of freedom.

It is important to perform the f-test prior to the t-test; one of the basic assumptions of the t-test is that the standard deviations of the data points are similar between the two groups, i.e. no significant imprecision should exist for t-test results to be valid. In the presence of significant imprecision, the determination of the presence of a significant bias should be done using the Cochran variant of the t-test.

In the Cochran variant of the t-test, the standard deviation cannot be pooled between the two groups:

$$t = \frac{\mu\_D}{\sqrt{\frac{Var\_1 + Var\_2}{n}}} \tag{12}$$

The critical value for the t-statistics should also be calculated:

$$\text{Critical t} = \frac{\frac{t}{n}(Var\_1 + Var\_2)}{\frac{Var\_1 + Var\_2}{n}} \tag{13}$$

where t is the t-score corresponding to n-1 degrees of freedom [11].
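Eqs. (11) and (12) can be sketched together; the paired results are hypothetical, and the f and t values would be compared against the tabulated critical values described above:

```python
import statistics as st

# Hypothetical paired results (same layout as for the paired t-test)
observed = [10.6, 12.1, 10.3, 10.4, 12.6, 11.5]
expected = [10.2, 11.8, 9.9, 10.5, 12.1, 11.0]
n = len(observed)

var1, var2 = st.variance(observed), st.variance(expected)
if var2 > var1:
    var1, var2 = var2, var1   # the larger variance is always the numerator

f = var1 / var2               # Eq. (11); look up at (n - 1, n - 1) dof

# Eq. (12): Cochran variant of the t-test (no pooled standard deviation)
mu_d = st.mean(o - e for o, e in zip(observed, expected))
t_cochran = mu_d / ((var1 + var2) / n) ** 0.5
```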

#### 2.6. Accuracy profile


Accuracy profiling has moved away from treating bias and imprecision as separate entities. In fact, most guidelines (whether based on the total error principles or measurement uncertainty principles) combine bias and imprecision for acceptability criteria. To calculate bias and imprecision, we need to run a reproducibility study. Reproducibility of quantitative studies is obtained by repeated measurements of a sample in a series and then conducting multiple series of reproducibility studies.

The overall measurement of bias will be the difference between the mean value of the analyte obtained from the repeated measurement and the reference value:

$$\text{Bias} = \text{Overall mean} - \text{Reference value} \tag{14}$$


Bias and imprecision are used to form the tolerance interval; it is the interval within which, with a determined degree of confidence, a specified proportion of results for a sample will fall. The tolerance interval can be expressed as:

$$\text{Tolerance Interval} = \text{reference value} + \text{bias} \pm \text{intermediate precision} \tag{15}$$

For laboratory medicine, the tolerance interval of analytes needs to be smaller than the acceptability limits. In the United States, the acceptability limits are set and governed by the Clinical Laboratory Improvement Amendments of 1988 (CLIA88). These acceptability limits are provided under the following heading: 42 CFR Part 493, Subpart I - Proficiency Testing Programs for Nonwaived Testing (https://www.gpo.gov/fdsys/pkg/CFR-2011-title42-vol5/pdf/CFR-2011-title42-vol5-part493.pdf).

The important factor from intermediate precision that is needed in calculation of tolerance interval is the standard deviation of reproducibility (SR). The standard deviation of reproducibility can be calculated by the following equation:

$$S\_R^2 = \frac{1}{n} \left(\frac{Var\_{between\ series}}{p-1} + (n-1)\frac{Var\_{within\ series}}{np-p}\right) \tag{16}$$

where n is the number of within-series measurement repeats and p is the number of series of reproducibility measurements.

An advantage of calculating the intermediate precision is that we can use it in combination with within-series repeatability to determine the uncertainty of bias:

$$\text{Uncertainty of Bias} = 1.96 \left[ \frac{n \left( \text{S}\_{\text{R}}^{2} - \text{S}\_{r}^{2} \right) + \text{S}\_{r}^{2}}{np} \right]^{1/2} \tag{17}$$

Sr² is the within-series repeatability and can be calculated using the following equation:

$$S\_r^2 = \frac{Var\_{within\ series}}{p(n-1)} \tag{18}$$

Uncertainty of bias is essentially 1.96 times the standard deviation of bias which corresponds to a 95% confidence interval for bias determination.

The between-series reproducibility is calculated using the following equation:


$$S\_L^2 = \frac{1}{n} \left(\frac{Var\_{between\ series}}{p-1} - S\_r^2\right) \tag{19}$$

The between-series reproducibility is used in the calculation of the Mee factor (Ks). The Mee factor is the other component of intermediate precision. Since its calculation is complicated, we have broken it down into a series of equations. The first step is to calculate the H ratio:

$$H = \frac{S\_L^2}{S\_r^2} \tag{20}$$

The next step is to calculate G²:

$$G^2 = \frac{H+1}{nH+1} \tag{21}$$

Which in turn is used to calculate C:

$$\mathcal{C} = \left(1 + \frac{1}{npG^2}\right)^{1/2} \tag{22}$$

The final step is to multiply C by the t-score associated with the degree of freedom (dof):

$$Degree\ of\ Freedom = \frac{\left(H+1\right)^2}{\frac{\left(H+\frac{1}{n}\right)^2}{p-1} + \frac{1-\frac{1}{n}}{np}} \tag{23}$$

And:


$$K\_s = C \times t\_{dof} \tag{24}$$

By calculating the Mee factor and the standard deviation of reproducibility we can now obtain the intermediate precision:

$$\text{Intermediate precision} = \text{K}\_s \times \text{S}\_R \tag{25}$$

Thus, we can rewrite the tolerance interval as [12]:

$$\text{Tolerance Interval} = \text{reference value} + \text{bias} \pm (\text{K}\_s \times \text{S}\_R) \tag{26}$$
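As a worked illustration, the chain from the replicate table to the tolerance interval (Eqs. 16-25) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the study is laid out as p series of n replicates, the data and function name are hypothetical, and the t-score for the computed degrees of freedom must be looked up externally (e.g. from a t-table), since the Python standard library has no t-quantile function.

```python
def reproducibility_stats(runs):
    """One-way ANOVA estimates behind Eqs. (16)-(23): repeatability S_r^2,
    between-series S_L^2, reproducibility S_R^2, and the Mee-factor terms.
    `runs` is a list of p series, each a list of n replicate measurements."""
    p, n = len(runs), len(runs[0])
    grand_mean = sum(x for run in runs for x in run) / (n * p)
    series_means = [sum(run) / n for run in runs]
    # sums of squared deviations within and between series
    var_within = sum((x - m) ** 2 for run, m in zip(runs, series_means) for x in run)
    var_between = n * sum((m - grand_mean) ** 2 for m in series_means)
    s_r2 = var_within / (p * (n - 1))                    # Eq. (18)
    s_L2 = max((var_between / (p - 1) - s_r2) / n, 0.0)  # Eq. (19), floored at 0
    s_R2 = s_r2 + s_L2                                   # reproducibility, Eq. (16)
    H = s_L2 / s_r2                                      # Eq. (20)
    G2 = (H + 1) / (n * H + 1)                           # Eq. (21)
    C = (1 + 1 / (n * p * G2)) ** 0.5                    # Eq. (22)
    dof = (H + 1) ** 2 / ((H + 1 / n) ** 2 / (p - 1) + (1 - 1 / n) / (n * p))  # Eq. (23)
    return {"S_r2": s_r2, "S_L2": s_L2, "S_R2": s_R2, "C": C, "dof": dof}

# Hypothetical study: p = 3 series of n = 4 replicates of one control sample.
runs = [[5.1, 5.0, 5.2, 5.1],
        [5.3, 5.2, 5.4, 5.3],
        [5.0, 5.1, 5.0, 5.2]]
stats = reproducibility_stats(runs)

# From here: K_s = C * t_dof (Eq. 24) with t_dof read from a t-table for
# stats["dof"], intermediate precision = K_s * S_R (Eq. 25), and the
# tolerance interval = reference value + bias +/- K_s * S_R (Eq. 26).
```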

#### 2.7. Weighting procedures

The problem with simple linear regression is that it is based on a set of assumptions; one problematic assumption is that the standard deviation of the random error is constant throughout the range of measurement. This assumption is often violated, as the standard error of measurement is often much larger near the extremes of the measurement range (near the limit of detection and the highest range of linearity). In laboratory medicine, one solution is to run linearity experiments and limit the measurement range based on the linearity results. Despite this, the effect of random variation on the regression line remains. To rectify this, a solution is to employ a weighting procedure.

The simplest weighting procedure is to use the standard deviation of the repeated measurements for each data point of the method comparison study. This requires that the method comparison study be repeated multiple times (20-30 times), which allows us to calculate the standard deviation of measurement for each point (Si). The weighting coefficient is then the inverse of this standard deviation:

$$w\_i = \frac{1}{S\_i} \tag{27}$$
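A minimal sketch of Eq. (27): estimate each point's standard deviation across the repeated comparison runs and take its inverse as the weight. The layout of `replicate_runs`, the data, and the function name are illustrative assumptions, not from the chapter.

```python
import statistics

def inverse_sd_weights(replicate_runs):
    """Eq. (27): w_i = 1 / S_i, where S_i is the standard deviation of
    point i across repeated method-comparison runs.
    replicate_runs[k][i] is the value of sample i in run k."""
    n_points = len(replicate_runs[0])
    return [1.0 / statistics.stdev([run[i] for run in replicate_runs])
            for i in range(n_points)]

# Hypothetical: 3 repeated runs over 3 samples spanning the measurement range.
runs = [[1.0, 10.0, 100.0],
        [1.1, 10.2, 102.0],
        [0.9,  9.8,  98.0]]
weights = inverse_sd_weights(runs)
# Points with larger scatter (here, higher concentrations) receive smaller weights.
```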

This weight can then be incorporated into the equations of the method comparison. For example, the r coefficient can be recalculated as:

$$r = \frac{\sum w\_i \left(X\_i - \overline{X}\right) \left(Y\_i - \overline{Y}\right)}{\left(\sum w\_i \left(X\_i - \overline{X}\right)^2 \sum w\_i \left(Y\_i - \overline{Y}\right)^2\right)^{\frac{1}{2}}} \tag{28}$$

Weighting can often considerably decrease the bias percentage compared to non-weighted regression, especially at the extremes of measurement. Weighting by the inverse of the standard deviation tends to normalize the relative bias at the extremes of measurement, while weighting by the inverse of the variance tends to favor bias correction at the lower end of measurement (less bias at lower concentrations). The decision for weighting and/or the choice of weighting procedure should be based on the assay characteristics and performance requirements [13].

#### 2.8. Recovery percentage

To estimate the proportional bias, a recovery experiment is needed. Recovery experiments are performed by calculating the amount of recovery when a known amount of the analyte is added to the sample: the measurement sample is divided into two equal aliquots. To one of the aliquots, a known amount of target analyte is added (aliquot 1); to the other aliquot (aliquot 2), an equal amount of diluent is added, and the measurement is performed on both aliquots. The recovery percentage can then be calculated:

$$\text{Recovery} \%= \frac{(\text{Analyte amount in aliquot 1}) - (\text{Analyte amount in aliquot 2})}{\text{Amount of analyte added to aliquot 1}} \times 100\tag{29}$$

The recovery or bias percentage is often used in laboratory medicine to state the proportional bias. Most regulatory agencies have set critical values for the recovery percentage for different analytes. The advantage of using the recovery percentage is that it normalizes to 100, allowing for an easier understanding of the scale of bias present [2].

### 3. Bias detection without comparators

Up to this point we have discussed bias detection methods that use a reference material or comparator to assess the presence of bias. While this has been the accepted standard for many laboratory regulatory agencies, there are arguments against this approach to bias detection. First, method comparison studies assume that the reference material (control sample) values are true and do not suffer from imprecision; the measurement uncertainty is considered to be minimal in these samples. Yet, unless these samples vary considerably from the biologic sample matrix, a degree of measurement uncertainty will exist in them, which leads to inaccurate estimates of the bias and imprecision of laboratory instruments and techniques. On the other hand, running repeated control samples with each run, and the need for revalidation of the instrument and techniques after each change in the parameters, requires a considerable investment in terms of time, labor and cost.

Alternatively, the systematic error can be determined by using patient samples. This can be done either by tracking the results of known normal patients (i.e. those expected to have a result within the reference range based on their clinical and physiologic state) or by following the trend of all the results of an analyte over time. Using patient samples has the advantage of including the inherent biologic uncertainty in the calculation of bias.

#### 3.1. Average of normal (AON)

In this approach the comparator for quality control would be the average value of the analyte in normal individuals. This requires us to know the population average and standard deviation for that analyte. If we measure the analyte in a normal individual, we would expect the result to approximate the population average. Deviations of the normal results from the expected reference normal can signal the presence of a systematic error.

In AON, the mean value of normal samples is compared to a mean reference value. The mean reference value should be established by the laboratory based on the population it serves; this is best done as part of the initial validation of an assay, when a large sample of normal individuals is tested to establish the reference ranges. This experiment allows us to calculate the population mean, standard deviation and standard error (SD/√N). We expect the Average of Normals from our analytical run to fall within the 95% confidence interval of the population mean:

$$95\% \text{CI} = \text{Population Mean} \pm 1.96 \times \text{Standard Error} \tag{30}$$

With each analytical run, a sample of normal results should be used to calculate the Average of Normals for that analytical run. If the calculated average is beyond the 95% CI of the population mean, then we have detected a systematic error in the analytical run.

In the AON method, as the size of the normal sample increases, the probability of detecting bias also increases. The sample size calculations for the AON method are determined by the ratio of the biological variance of the target analyte (CVb) to the variance of the method (CVa) (CVb/CVa)
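The AON decision rule of Eq. (30) can be sketched as follows. This is a hedged sketch: the function name, the analyte values, and the reference-sample size are hypothetical, and the rule is applied exactly as stated above (a 95% CI built from the population mean and standard error).

```python
import math

def aon_out_of_control(run_normals, pop_mean, pop_sd, n_reference):
    """Flag a run when its Average of Normals falls outside the 95% CI
    of the population mean (Eq. 30); the standard error is SD / sqrt(N)
    from the reference study of n_reference normal individuals."""
    se = pop_sd / math.sqrt(n_reference)
    lower, upper = pop_mean - 1.96 * se, pop_mean + 1.96 * se
    aon = sum(run_normals) / len(run_normals)
    return not (lower <= aon <= upper)  # True -> systematic error suspected

# Hypothetical analyte: population mean 100, SD 10, reference sample N = 400,
# so SE = 0.5 and the 95% CI is roughly [99.02, 100.98].
flagged_ok = aon_out_of_control([100.1, 99.8, 100.3], 100.0, 10.0, 400)    # False
flagged_shift = aon_out_of_control([102.0, 103.1, 101.5], 100.0, 10.0, 400)  # True
```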
