**6. Overview of clinical validation**

#### **6.1. Main objectives**

The aim of clinical validation is to determine whether a new monitor measures cardiac output reliably. This is done by comparing its performance with that of an accepted clinical standard, such as single bolus thermodilution cardiac output. If the new monitor performs as well as or better than the reference method, it can be accepted into clinical practice.

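However, there are two important aspects to reliable cardiac output measurement:

(a) the accuracy of individual readings in relation to the true value, and

(b) the ability to detect changes, or trends, in cardiac output between serial readings.

The types of clinical data and statistical analysis needed to evaluate these two aspects are different.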

If one's objective is to diagnose a low or high cardiac output, then the accuracy of individual readings in relation to the true value is of greatest importance. However, if one's objective is to follow the change in haemodynamic response to a therapeutic intervention, then serial cardiac output readings are needed and their absolute accuracy becomes less important, provided that the readings reliably show the changes. This division into two roles may at first seem a little pedantic, but a monitor that does not measure cardiac output accurately may still be useful clinically if it detects trends reliably. As most bedside cardiac output monitors in use today are able to measure cardiac output continuously, although many are not particularly accurate, the issue of being a reliable trend monitor becomes very relevant. Unfortunately, the majority of published validation studies have only addressed accuracy [37].

Minimally Invasive Cardiac Output Monitoring in the Year 2012
http://dx.doi.org/10.5772/54413

… output performed using a PAC has been used. The average of three thermodilution readings is used, and aberrant readings that differ by more than 10% are rejected, in order to improve the precision. However, thermodilution is not a gold standard method and significant measurement errors, both random and systematic, arise when it is used. It is generally accepted that thermodilution has a precision error of ±20%. True gold standard methods such as aortic flow probes have precision errors of less than ±5%. Thus, thermodilution is an imprecise reference method and its use greatly influences the statistical analysis. Most of the benchmarks against which the outcomes of validation studies are judged are based on this precision of ±20%.

Other more precise, gold standard reference methods could be used, such as the Fick method or a flow probe surgically placed on the aorta. However, their use is inappropriate in the clinical setting, and thus the current clinical standard for cardiac output measurement, thermodilution via a PAC, is used. The current decline in the clinical use of PACs has left a void. Thus, some recently published validation studies have used transpulmonary thermodilution using the PiCCO system or oesophageal Doppler monitoring using the CardioQ as alternative reference methods.

#### **7.2. The precision error of thermodilution**

Recently, the precision of ±20% for thermodilution has come under scrutiny. The reason that thermodilution is said to have a precision error of ±20% can be attributed to our 1999 publication on bias and precision statistics, which first proposed percentage error [39]. In the 1990s the consensus of opinion was that for a monitor to be accepted into clinical use it should be able to detect a change in cardiac output of at least 1 L/min when the mean cardiac output was 5 L/min, which was a 20% change [40,41]. Furthermore, the meta-analysis by Stetz and colleagues of studies from the 1970s validating the thermodilution method suggested that it had a precision of 13-22% [42]. The 30% benchmark percentage error that everyone quotes today was based on a precision error of ±20% for thermodilution. However, it now seems that the precision of thermodilution can be very variable and depends on the type of patient and the measurement system used [43]. Recently, Peyton and Chong have suggested that the precision of thermodilution may be as large as ±30% [44].

#### **7.3. Study design**

Study design becomes significant when the ability to detect trends, in addition to accuracy, is investigated. To determine accuracy one needs only a single pair of cardiac output readings, test and reference, from each patient. Test refers to the new method being validated and reference to the clinical standard, thermodilution, though ideally a gold standard method should be used. The test and reference readings should ideally be performed simultaneously, because cardiac output is not a static parameter and fluctuates between cardiac cycles. The size of the study usually includes twenty or more pairs of readings.

Study design becomes more complicated if the ability to detect trends is being investigated. A series of paired readings from the same patient are now needed that show changes in cardiac

#### **6.2. Understanding errors**

The error that arises when measuring cardiac output has two basic components:

(a) random error, and

(b) systematic error.

If I use a measuring tape to measure the heights of patients attending a clinic, my readings may vary by a few millimeters from the true height of each patient. This is random error. But if the measuring tape is stretched by 2 to 3 centimeters, then every reading I take will consistently under-read the height of each patient by a few centimeters. This is systematic error. The division of measurement error into random and systematic components plays an important role in the choice of statistical techniques used for validation.
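The measuring-tape analogy can be made concrete with a small numerical sketch. The heights and readings below are invented for illustration, and the helper name `split_error` is hypothetical. It treats the mean of the errors as the systematic component and their standard deviation as the random component, which is one simple way of separating the two.

```python
from statistics import mean, stdev

def split_error(readings, true_values):
    """Return (systematic, random) error components.

    systematic: mean of (reading - true value)
    random:     sample standard deviation of those differences
    """
    errors = [r - t for r, t in zip(readings, true_values)]
    return mean(errors), stdev(errors)

# Illustrative heights in cm (made-up data, not from any study).
true_heights = [150.0, 160.0, 170.0, 180.0, 190.0]

# Good tape: small scatter around the true height.
good_tape = [150.2, 159.8, 170.1, 179.9, 190.0]

# Stretched tape: the same scatter, but every reading is 2.5 cm too low.
stretched_tape = [r - 2.5 for r in good_tape]

for name, readings in [("good", good_tape), ("stretched", stretched_tape)]:
    systematic, random_sd = split_error(readings, true_heights)
    print(f"{name}: systematic = {systematic:.2f} cm, random SD = {random_sd:.2f} cm")
```

In this sketch both tapes show the same random scatter, and only the stretched tape shows a systematic offset.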

One of the main sources of systematic error is imprecise calibration. Calibration is performed by (a) measuring cardiac output using a second method such as thermodilution, or (b) using population data to derive cardiac output from the patient's demographics (i.e. age, height and weight). Unfortunately, cardiac output and related parameters vary between individuals. In the Nidorf nomogram, used to predict aortic valve size when measuring cardiac output by suprasternal Doppler, the range of possible values about the mean for valve size at each height is ±16% [23]. This gives rise to a significant systematic error between patients, and this error impacts upon accuracy when Bland-Altman comparisons are made against a reference method [38]. However, reliability during trending may still be preserved, because trending involves a series of readings from a single patient. Provided that the systematic error remains constant, and the random measurement errors between the series of readings are acceptably low, the monitor can still detect changes in cardiac output reliably.
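A minimal sketch of this point, using invented numbers: a hypothetical monitor over-reads one patient by a constant calibration error, here assumed to be 15%. Random error is deliberately left out so that the effect of the systematic component can be seen in isolation.

```python
# Serial reference cardiac output readings for one patient, in L/min (made-up).
reference = [4.0, 4.8, 5.6, 5.0, 4.2]

# Assumed constant calibration (systematic) error for this patient: +15%.
CALIBRATION_FACTOR = 1.15
test = [CALIBRATION_FACTOR * co for co in reference]

def percent_changes(series):
    """Percentage change between consecutive readings."""
    return [100.0 * (b - a) / a for a, b in zip(series, series[1:])]

# Accuracy: every test reading is offset from the reference.
offsets = [t - r for t, r in zip(test, reference)]

# Trending: relative changes are unaffected by a constant scaling error.
ref_changes = percent_changes(reference)
test_changes = percent_changes(test)

for r, t in zip(ref_changes, test_changes):
    print(f"reference change {r:+.1f}%  test change {t:+.1f}%")
```

Every test reading differs from the reference by 0.6 L/min or more, yet the percentage changes between serial readings are identical for the two methods. With real data, random error would blur this agreement, which is why it must also be acceptably low.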

The accepted method of presenting errors in validation statistics is to use (a) percentages of mean cardiac output and (b) 95% confidence intervals, which approximate to two standard deviations. The term precision error is used for this quantity, and it should not be confused with the percentage error, which is one of the outcomes of Bland-Altman analysis.
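The distinction between the two terms can be sketched in a few lines of Python. The formulas are assumptions in the sense that the text does not spell them out: precision error is taken as two standard deviations of repeated readings over their mean, percentage error as 1.96 standard deviations of the paired test-minus-reference differences over the mean cardiac output, and the errors of two independent methods are assumed to combine as the root of the sum of squares. All function names and data are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def precision_error(repeated_readings):
    """Precision of one method: 2 SD of repeated readings, as % of their mean."""
    return 100.0 * 2.0 * stdev(repeated_readings) / mean(repeated_readings)

def percentage_error(test, reference):
    """Bland-Altman percentage error: 1.96 SD of the paired differences,
    as % of the mean cardiac output of all readings."""
    diffs = [t - r for t, r in zip(test, reference)]
    mean_co = mean(test + reference)
    return 100.0 * 1.96 * stdev(diffs) / mean_co

def combined_error(precision_a, precision_b):
    """Expected percentage error when two independent methods are compared."""
    return sqrt(precision_a ** 2 + precision_b ** 2)

# Made-up data in L/min, for illustration only.
repeats = [5.0, 5.4, 4.6, 5.2, 4.8]        # one method, stable patient
reference = [4.0, 5.0, 6.0, 5.5, 4.5]      # paired reference readings
test = [4.4, 4.7, 6.5, 5.2, 4.9]           # paired test readings

print(f"precision error:  {precision_error(repeats):.1f}%")
print(f"percentage error: {percentage_error(test, reference):.1f}%")
print(f"two methods at 20% each: {combined_error(20.0, 20.0):.1f}%")
```

Under these assumptions, two methods that each have a precision error of ±20% give a combined error of about 28%, which is in line with the 30% benchmark being derived from a ±20% precision for thermodilution.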
