**3. Use-case: dashboard for hospital-acquired infection rates at Massachusetts hospitals**

In the US, the Commonwealth of Massachusetts (MA) Department of Public Health (DPH) posts annual healthcare-associated infection (HAI) reports about MA hospitals on its web page [21]. The purpose of the reports is public data transparency, as required by law [21]. Briefly, HAIs, such as catheter-associated urinary tract infections (CAUTI), central line-associated bloodstream infections (CLABSI), and surgical site infections (SSI), are serious issues because they are attributable to the hospital's own care and can lead to sepsis, a systemic infection that can end in death [22, 23]. Because catheterization typically happens in the intensive care unit (ICU) setting, hospitals usually track CAUTI and CLABSI as part of healthcare quality activities centered on ICUs [14]. By contrast, SSIs are tracked in association with specific operative procedures (e.g., colon surgery) [24].

Compared to other countries, rates of HAI in the US are relatively high, likely because hospitals are not required to track them at the federal level [14, 25]. Hospitals can opt into a voluntary federal tracking system called the National Healthcare Safety Network (NHSN), but the NHSN does not have a public-facing dashboard, and its data are inaccurate, especially in undercounting severe events [14, 24, 26]. As HAI is a serious public health issue, there have been calls for greater data transparency, and the reports posted on the MA DPH web site represent MA's attempt to comply with state-level mandates for OGD.

Although summary reports are available for download on the MA DPH web site, hospital-level reports cannot be accessed directly from the web site. To download hospital-level reports, the user must open a dashboard linked from the web page (**Figure 2**).

As shown in **Figure 2**, once inside the dashboard, individual PDF-style reports can be found by navigating to the hospital in question, and the reports appear to present formatted output from a database. One way to navigate to a hospital record is to locate it on the map ("C" in **Figure 2**) and click on its icon, causing panel "B" to display hospital-level metrics and a link to the hospital's PDF report.

Each PDF report has a header displaying attribute data about the hospital (e.g., number of beds), followed by a series of ICU-, procedure-, and infection-level output. This mirrors the structure seen in the dashboard tabs and reports

*Framework to Evaluate Level of Good Faith in Implementations of Public Dashboards DOI: http://dx.doi.org/10.5772/intechopen.101957*

#### **Figure 2.**

*MA HAI public dashboard landing page. Note: "A" labels a menu of tabs that can be used for navigation to view metrics on the various hospital-acquired infections (HAIs). In panel labeled "B", tabs can be used to toggle between viewing state-level metrics and hospital-level metrics. Hospitals can be selected for display using a map labeled "C".*

(**Figure 2**, "A" and "B"). For the ICUs at each hospital, the report displays a set of tables summarizing CAUTI and CLABSI rates, followed by time-series graphs. For a set of high-risk surgical procedures, SSI rates and graphs for the hospital are displayed. Methicillin-resistant *Staphylococcus aureus* (MRSA) and *C. difficile* infections are serious HAIs that can be acquired in any part of the hospital and are diagnosed using laboratory tests [27]. Rates and graphs of MRSA and *C. difficile* infections are also displayed on the report.

The underlying data come from the NHSN, although this is not stated on the dashboard. Instead, a summary report and presentation are posted alongside the dashboard on the web site, and the analyses in these files are based on NHSN data [21]. It appears that the DPH is using these NHSN data as the back-end to the dashboard, and that the dashboard is an attempt to comply with OGD laws.

Because the authors are aware of the high rates of HAI in the US, and because we both live in MA and are women cognizant that sexism in US healthcare adds additional layers of risk to women [28], we identified that we were in a state of information asymmetry. Specifically, we had the *information need* to compare MA hospitals and choose the least risky or least "lethal" one for elective surgery or childbearing (planned procedures), but we felt this need was not met by this OGD implementation.

In this section, we start by evaluating the existing MA DPH HAI dashboard against our good vs. bad faith framework. Next, we propose an alternative dashboard solution that improves the good vs. bad faith features of the implementation.

#### **3.1 Considering existing dashboard: design process and requirements**

**Figure 3** provides a logical entity-relationship diagram (ERD) for the data behind the dashboard.

As described earlier, the landing page (**Figure 2**) provides a map by which users can select a hospital, causing the metrics for the hospital to appear in a panel. The user chooses which measurement to view (e.g., CAUTI) through navigation using the tabs. This suggests the dashboard is aimed at individuals with a working knowledge of MA geography who intend to compare and select the hospitals

#### **Figure 3.**

*Logical entity-relationship diagram for data behind dashboard. Note: The schema presented assumes four entities: The hospital entity (primary key [PK]: HospRowID), each intensive care unit (ICU) attached to a hospital which contains the frequency of infection and catheter days attributes to allow rate calculation (PK: ICURowID), each procedure type attached to a hospital (to support the analysis of surgical site infection [SSI], with PK: ProcRowID), and each other infection type at the hospital not tracked with ICUs (PK: LabID).*

least likely to cause HAI for an elective procedure (e.g., childbearing), or to establish a top choice of local hospital should they ever need to be admitted. This interface makes it difficult to compare HAI at different hospitals, because metrics from more than one hospital cannot be viewed at the same time. Further, metrics about different HAIs at the same hospital are on different panels, so within-hospital comparisons are not supported. There appears to be no overall metric by which to compare hospitals in terms of their HAI rates.

**Figure 4** shows an example of the metrics reported by each hospital on the dashboard reporting panel ("B" in **Figure 2**). The figure also shows one of the two tables and one of the two figures displayed on the CAUTI tab for the selected hospital. In all, two tables and two figures are displayed in portrait style in panel "B" (**Figure 2**), and **Figure 4** shows the top table and figure displayed. In the table displayed (labeled "1" in **Figure 4**), the metrics presented are the number of infections, predicted

#### **Figure 4.**

*Dashboard metric display for each hospital. Note: To view hospital-acquired infection (HAI) rates at hospitals, a hospital is selected (Figure 2, panel "C"), then the user selects the tab for the HAI of interest. In Figure 4, a hospital has been identified, and a tab for catheter-associated urinary tract infection (CAUTI) has been selected (see circle). Two tables and two figures are presented in portrait format on the reporting panel for each hospital (Figure 2, panel "B"). Figure 4 shows the first table and figure presented ("1" and "2"); the table reflects stratified metrics for CAUTI at each ICU at the hospital, and the graph reflects a time series of these metrics stratified by hospital vs. state levels, and intensive care unit (ICU) vs. ward ("ward" is not defined in the dashboard). The metrics provided in "1" are the number of infections, predicted infections, standard infection ratios (SIRs), a confidence interval for the SIR, and an interpretation of the level. In "2", the SIR is graphed.*


infections, standard infection ratios (SIRs), a confidence interval for the SIR, and an interpretation of the level. The figure (labeled "2" in **Figure 4**) displays a time-series graph of SIRs for the past five years. The other table on the panel (not shown in **Figure 4**) provides ICU-level metrics on catheter-days, predicted catheter-days, standard utilization ratios (SURs) with their confidence intervals, and an interpretation; an analogous time-series graph of five years of SURs is also presented (not shown in **Figure 4**).

SIRs and SURs are not metrics typically used by the public to understand rates of HAI in hospitals. Risk communication about rates to the public is typically done in the format of *n per 10,000* or *n per 100,000*, depending upon the magnitude of the rate [29]. Further, stratifying rates by ICU is confusing, as prospective patients may not know which ICU they will be placed in. Because the hospital environment confers the strongest risk factors for HAI (e.g., worker burnout), HAI rates will be intra-correlated within each hospital [30], which makes ICU-level stratification all the more confusing. **Figure 4** displays only 50% of the information available about CAUTI at one hospital. With each tab displaying similar metrics about SSI and other infections, the experimental unit used is so small that it obfuscates any summary statistics or comparisons. Also, it is unclear how the "predicted" metrics presented were calculated.
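The *n per 10,000* or *n per 100,000* convention can be expressed as a small helper that picks the denominator by the magnitude of the rate; `format_rate` is a hypothetical function name chosen for illustration, and the cutover threshold is our assumption.

```r
# Express a raw proportion as "n per 10,000" or "n per 100,000",
# choosing the scale by the magnitude of the rate (assumed cutoff:
# use per-10,000 when the rate is at least 1 per 10,000).
format_rate <- function(events, population) {
  p <- events / population
  per <- if (p >= 1 / 10000) 10000 else 100000
  sprintf("%.1f per %s", p * per,
          format(per, big.mark = ",", scientific = FALSE))
}
```

For example, `format_rate(5, 10000)` yields `"5.0 per 10,000"`, a format lay readers can compare across hospitals far more easily than an SIR.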

Ultimately, the design process and requirements behind this dashboard are not known. There is no documentation as to how the dashboard was designed or what it is supposed to do. It appears to be an alpha prototype that was launched without a stated *a priori* design process, and without any user testing or formal evaluation.

#### **3.2 Alternative design: design process and requirements**

We chose to redesign the dashboard into a new alpha prototype that met requirements that we, as members of the public, delineated. Consistent with the good faith principles proposed, our requirements included the following: 1) the dataset we use should be easily downloadable by anyone using the dashboard, 2) the documentation of how the dashboard was developed should be easy to access, 3) hospitals should present summary metrics rather than stratified ones, 4) different HAI metrics for the same hospital should be presented together, and 5) there needs to be a way to easily compare hospitals and choose the least risky hospital. To do this, we first obtained the data underlying the original dashboard. Next, we analyzed it to determine better metrics to present. We also selected open-source software to use to redeploy an alpha prototype of a new dashboard. Finally, we conducted informal user testing on this alpha prototype.

### *3.2.1 Obtaining the data from the original dashboard*

Scraping was done in open-source *RStudio*, predominantly using the packages *pdftools* and *pdftables* [31, 32]. All the PDF reports from each hospital were downloaded and placed in one directory. As a first step, a loop using the *pdftools* package crawled through each report, extracting the data into memory. This was done in conjunction with the *pdftables* package, which interfaces with an online application that applies structure to the unstructured tabular data placed in memory by *pdftools*. To use this online application, an application programming interface (API) key is issued from the *PDF Tables* web site and used in the *RStudio* program to pass the data to the online application [33]. The code processed the data into a series of loosely structured *.xlsx files that were downloaded locally. Then, in a final data-cleaning step, the data were transformed into tables in the format specified in **Figure 3**.
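The crawling step can be sketched as follows. This is a minimal illustration, assuming the reports sit in a local `reports/` directory; it uses only `pdftools::pdf_text` (the actual pipeline also passed the extracted text to the *PDF Tables* API, which requires a key), and `parse_header_number` is a hypothetical helper for pulling an attribute out of a report header.

```r
# Crawl a directory of hospital PDF reports, extracting each report's
# raw text into memory keyed by filename (requires the pdftools package).
scrape_reports <- function(dir = "reports") {
  files <- list.files(dir, pattern = "\\.pdf$", full.names = TRUE)
  sapply(files, function(f) paste(pdftools::pdf_text(f), collapse = "\n"))
}

# Hypothetical helper: pull a numeric attribute out of a header line
# such as "Number of beds: 245"; returns NA if the label is absent.
parse_header_number <- function(txt, label) {
  m <- regmatches(txt, regexpr(paste0(label, ":\\s*\\d+"), txt))
  if (length(m) == 0) return(NA_integer_)
  as.integer(sub(".*:\\s*", "", m))
}
```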

#### *3.2.2 Determining metrics to present*

We chose to focus our inquiry on the data from the hospital and ICU tables, as CAUTI and CLABSI are by far the most prevalent and deadly HAIs [23]. Therefore, we scoped our alpha prototype to only display data from the ICU and the hospital tables (although we make all the data we scraped available in the downloadable dataset). This limited us to basing the dashboard on hospital- and ICU-level metrics only.

Next, we intended to present CAUTI and CLABSI frequencies and rates, whereby the numerator would be the number of infections and the denominator would be the "number of patients catheterized". We felt that the dashboard's use of catheter-days as the rate denominator was confusing to the public, and appeared to attenuate the prevalence of patients having experienced a CAUTI or CLABSI. Although "number of patients catheterized" was not available in the data, "annual admissions" was. Since the proportion of patients admitted annually who are catheterized probably does not vary much from hospital to hospital, we chose the number of admissions as the denominator, treating it as a proxy measurement.
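The proxy rate described above reduces to a one-line calculation; `hai_rate_per_10k` is an illustrative name, and scaling to per 10,000 admissions is our choice for readability.

```r
# Overall infection rate per 10,000 annual admissions, using annual
# admissions as a proxy denominator for "number of patients catheterized".
hai_rate_per_10k <- function(infections, admissions) {
  round(infections / admissions * 10000, 1)
}
```

For example, a hospital with 12 CAUTIs and 30,000 annual admissions would display a rate of 4 per 10,000 admissions.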

Third, we needed a way of ranking hospitals by their likelihood of causing an HAI to allow easy comparisons by public users, so we decided to develop an equation to predict the likelihood of an HAI at each hospital. We did this by developing a linear regression model with hospital-level attributes as independent variables (IVs) and the 2019 CAUTI rate as the dependent variable (DV). We chose CAUTI over CLABSI after observing that the two rates were highly correlated and CAUTI was more prevalent.

**Table 1** describes the candidate IVs for the linear regression model. The table also includes the source of external data that were added to the hospital data. We studied our IVs, and found serious collinearity among several variables, so we used principal component analysis (PCA) to help us make informed choices about parsimony [37]. The data predominantly loaded on three factors (not shown). The first factor included all the size and utilization variables for the hospital; these were summed into a Factor 1 score. The second-factor loadings included the proportion of those aged 65 and older and the non-urban flag (**Table 1**), so those were summed as Factor 2. Proportion non-White was strongly inversely correlated with Factor 2, so it was kept for the model, and county population did not load, so it was removed from the analysis. Factor 3 loadings included teaching status, for-profit status, and Medicare Performance Score (MPS). Rather than create a score, we simply chose to include the variable from Factor 3 that led to the best model fit to represent the factor, which was MPS. Then we finalized our linear regression model, and developed a predicted CAUTI rate (*ŷ*) using our model that included the following IVs: MPS, Factor 1 score, Factor 2 score, and proportion of non-White residents in hospital county.
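The variable-reduction and modeling steps above can be sketched in base R. This is a sketch on synthetic data, not our actual analysis: the column names (`beds`, `admissions`, `icu_beds`, `mps`) and the choice of which variables load on Factor 1 are stand-ins for the real Table 1 variables.

```r
# Sketch: check collinearity with PCA, sum collinear size/utilization
# variables into a factor score, then fit the linear model for the
# 2019 CAUTI rate (synthetic data for illustration only).
set.seed(1)
n  <- 42                                   # hospitals with complete data
hx <- data.frame(beds = rnorm(n), admissions = rnorm(n),
                 icu_beds = rnorm(n), mps = rnorm(n))

pca <- prcomp(hx, scale. = TRUE)           # inspect pca$rotation loadings
hx$factor1 <- rowSums(scale(hx[, c("beds", "admissions", "icu_beds")]))

hx$cauti_rate <- 2 + 0.5 * hx$factor1 + rnorm(n, sd = 0.1)
fit <- lm(cauti_rate ~ factor1 + mps, data = hx)
hx$lethality <- predict(fit)               # y-hat used as the raw score
```

In the real model, Factor 2 (age 65+ and non-urban flag), proportion non-White, and MPS entered the regression alongside the Factor 1 score.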

Next, we used the regression equation to calculate *ŷ* as a "lethality score" for each hospital. Of the 71 hospitals in the dataset, 21 were missing MPS and 8 were missing other data in the model. Therefore, only the 42 with complete data (IVs and DVs) were used to develop the regression model. As a result, the lethality score was nonsensical for some hospitals; where the residual was large, the lethality score was replaced with the 2019 CAUTI rate. If CAUTI data were missing, it was assumed that the hospital had no CAUTI cases, and therefore was scored as 0.
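The fallback rules in this step can be written out explicitly. `finalize_score` is a hypothetical helper, and `resid_cutoff` is an assumed threshold; the chapter does not state what counted as a "large" residual.

```r
# Replace a nonsensical model prediction with the observed 2019 CAUTI
# rate when the residual is large, and score hospitals with missing
# CAUTI data as 0 (assumed to have had no cases).
finalize_score <- function(pred, cauti_2019, resid_cutoff = 1) {
  ifelse(is.na(cauti_2019), 0,
         ifelse(abs(pred - cauti_2019) > resid_cutoff, cauti_2019, pred))
}
```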

Once the lethality score was calculated, we sorted the hospitals by score and divided them into four categories: least probable (color-coded green), somewhat probable (color-coded yellow), more probable (color-coded red), and most probable (color-coded dark gray). Due to missing CAUTI information and many hospitals having zero CAUTI cases, our data were severely skewed, so dividing the hospitals into quartiles of the lethality score was not meaningful. To compensate, we sorted the data by lethality score and placed the



#### **Table 1.**

*Conceptual model specification.*

first 23 hospitals (32%), which included all the hospitals with zero cases, in the least probable category. We placed the next 16 (23%) in somewhat probable, the next 16 (23%) in more probable, and the final 16 (23%) in most probable. We chose this classification for the data display on the dashboard to allow easy comparison between hospitals on the risk of a patient contracting an HAI.
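The positional cut points above (23/16/16/16 of the 71 hospitals) can be applied after sorting; `lethality_category` is an illustrative helper name.

```r
# Assign the four color-coded categories by position in the sorted
# lethality scores: the first 23 hospitals are "least probable" (green),
# then three groups of 16 (yellow, red, dark gray).
lethality_category <- function(score) {
  labs <- c("least probable", "somewhat probable",
            "more probable", "most probable")
  rep(labs, c(23, 16, 16, 16))[rank(score, ties.method = "first")]
}
```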

#### *3.2.3 Choice of software and display*

R is open-source statistical software that allows user-developed "packages" to be added *a la carte* to the main application [38]. RStudio was developed as an integrated development environment (IDE) for R that provides advanced visualization capabilities supporting dashboard development [39]. In RStudio, web applications like dashboards can be placed on a host server and deployed on the internet such that, as users interact with the web front-end, it can query and display data from the back-end hosted on the server.

The package *shiny* (R Shiny) [40] was developed to support dashboarding in RStudio, and can work with other visualization packages depending upon the design goals of the dashboard. In our newly designed dashboard, the *leaflet* package was used for a base map on which we placed the hospital icons (like the original dashboard), and add-ons were made to display other items. These add-ons were adapted from other published code [41]. The *DT* (data table) package, an R wrapper for the DataTables JavaScript library, was used to display stratified ICU rates, and CSS was used for formatting.
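A minimal skeleton of this stack is sketched below, assuming a `hospitals` data frame with `name`, `lat`, `lng`, and `color` columns; these column names and the two sample rows are ours for illustration, not the deployed app's code.

```r
library(shiny)
library(leaflet)

# Hypothetical input: one row per hospital with coordinates and the
# color assigned from its lethality category.
hospitals <- data.frame(name = c("Hospital A", "Hospital B"),
                        lat = c(42.36, 42.34), lng = c(-71.06, -71.10),
                        color = c("green", "red"))

ui <- fluidPage(leafletOutput("map"))

server <- function(input, output, session) {
  output$map <- renderLeaflet({
    leaflet(hospitals) |>
      addTiles() |>                           # base map tiles
      addCircleMarkers(~lng, ~lat, color = ~color,
                       popup = ~name)         # bubble shown on click
  })
}

# shinyApp(ui, server)  # uncomment to run locally
```

The deployed app builds the popup content from the hospital metrics (admissions, ICU beds, overall CAUTI and CLABSI rates) rather than just the name.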

The dashboard we developed was deployed on a server (https://natasha-dukach. shinyapps.io/healthcare-associated-infections-reports/) and code for the dashboard was published (https://github.com/NatashaDukach/HAI-MA-2019). When accessing the link to the dashboard, the user initially sees a map with icons (in the form of dots) on it indicating hospitals. The icons are color-coded according to the lethality score described previously. Clicking on an icon will expand a bubble reporting information about the hospital (**Figure 5**).

As shown in **Figure 5**, like the original dashboard, this one has a map for navigation. Unlike the original, it only has two tabs: "ICU Rate Explorer" (the one shown in **Figure 5**), and "Data Collection", which provides documentation and links to original data and code (see "A" in **Figure 5**). The hospital icons are placed on the map and coded according to our color scheme (see legend in **Figure 5** by "B"). This allows for easy comparison between hospitals. When clicking on an icon for a hospital, a bubble appears that contains the following hospital metrics: Number of admissions, number of ICU beds, overall CAUTI rate, and overall CLABSI rate. There is also a link on the bubble where the user can click to open a new box that provides CAUTI and CLABSI rates stratified by ICU. Future development plans include adding other overall rates (e.g., for SSI), and adding in data from previous years to allow for the evaluation of trends.
