26 Recent Developments in Tropical Cyclone Dynamics, Prediction, and Detection

**1. Introduction**

Modern statistical and dynamical forecast models continue to demonstrate low skill in identifying the onset of rapid intensification (hereafter RI) within tropical cyclones (hereafter TCs). Even though storms that rapidly intensify can cost governments billions of dollars in damage upon landfall (e.g. by destroying property through flooding, as with Katrina in 2005) and RI forecasting is considered one of the top priorities for the National Hurricane Center [1], little progress has been made in improving probabilistic TC RI forecasts. While previous studies have examined the intensification patterns of RI and non-RI TCs [2, 3], technological limitations, coupled with the complexity of these systems, have left gaps in understanding the large-scale structures associated with RI storms [1–4]. These gaps result in poor statistical forecast model accuracy, since such models require prior knowledge of the relevant RI variables. While recent research shows modest improvements in RI forecasts, and global models have steadily improved in their ability to predict the large-scale environmental conditions of TCs [3], forecast skill scores still remain inadequate.

Current statistical forecast models blend thermodynamic and kinematic variables in attempts to increase skill, emphasizing the meteorological processes deemed most crucial to RI prediction [1, 4–6]. Improvements to the Statistical Hurricane Intensity Prediction Scheme Rapid Intensification Index (SHIPS-RII) have been added regularly since its original implementation by the National Hurricane Center (NHC) in 2004 for the North Atlantic [1]. The latest enhanced SHIPS-RII consists of 10 predictors, including previous 12-hour intensity change, vertical shear, divergence at 200 hPa, total precipitable water, GOES-IR imagery, potential intensity, oceanic heat content, maximum sustained wind, and an inner-core dry air predictor [1]. Despite the addition of new predictors, Brier skill scores (BSS) relative to climatology for Atlantic RI forecasts remain below 20% [1]. Additionally, verification of all operational consensus intensity forecast models for the NHC, including their official intensity forecast, showed only limited improvement, as Peirce skill scores remained below 0.2 [1]. Other studies have included predictors that resolve the inner-core environment more effectively, utilizing passive microwave imagery predictors in a probabilistic logistic regression (LR) model. Despite this effort, BSS values only improved to roughly 22% with either simulated real-time LR models or LR models utilizing reanalysis data [4]. Additionally, using a baseline peak wind speed of 25-knot intensity (at all RI thresholds) severely reduces skill to below 15% when compared to a probabilistic LR model utilizing current SHIPS parameters previously developed in [7].

Improving statistical model prediction of the onset of TC RI requires identifying the meteorological characteristics of storm structure that distinguish RI from non-RI TCs with 24 hours of lead time. While research of this nature is not new [2, 3], the approaches have differed (e.g. data selection, data reduction, meteorological variables chosen, compositing approaches). For example, Kaplan and DeMaria [2] and Kaplan et al. [8] noted that RI was more likely to occur for TCs that were situated over regions of higher than average sea surface temperature (SST), strong upper-level divergence, large low- to mid-tropospheric

**2. Data and methodology**

#### **2.1. Dataset description**

While the NHC defines RI as an increase in wind of 30 knots (kt) in 24 hours, several RI definitions are usually considered during the research phase of model development [1, 4]. This study examined three separate definitions of RI (following [1]): the operational definition of a 30kt increase in wind speed in 24 hours and two experimental definitions of 25kt and 40kt increases. All Atlantic tropical and subtropical systems from 1985 to 2009 in the NHC Atlantic best track data (HURDAT [9]) were considered. For each of the three RI definitions, the full database of 298 TC events was divided into RI and non-RI groups, yielding 152 RI and 146 non-RI cases with the 25kt definition, 119 RI and 179 non-RI cases with the 30kt definition, and 46 RI and 252 non-RI cases with the 40kt definition (**Figure 1** breaks these down by Saffir-Simpson scale category). Since a forecast proxy was desired, base-state meteorological fields from the National Centers for Environmental Prediction (NCEP) Global Ensemble Forecast System (GEFS [10]) reforecast database were retained 24 hours prior to the period of greatest intensification for all storms (RI and non-RI). GEFS reforecast data are provided at 1° resolution at 3-hour forecast intervals from 0 to 72 hours.
Three-dimensional base-state meteorological fields at eight vertical levels (1000–100 hPa) were utilized, including geopotential height, temperature, *u* and *v* wind components, and specific humidity. Additionally, single-layer variables were evaluated, including mean sea level pressure (MSLP), skin temperature (a proxy for SST), latent heat flux, sensible heat flux, convective available potential energy (CAPE), convective inhibition (CIN), and vertical velocity at 850 hPa.
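
As a minimal sketch of how the three RI definitions could be applied to a single storm, the example below finds the largest 24-hour wind increase in a wind record and labels the storm under each threshold. The function names, the 6-hourly spacing, the "at least the threshold" comparison, and the wind values are illustrative assumptions, not details taken from the study or from HURDAT.

```python
# Hypothetical helper names; the thresholds follow the three RI definitions in the text.
RI_THRESHOLDS_KT = (25, 30, 40)


def max_24h_change(winds_kt, step_hours=6):
    """Return (largest 24-hour wind increase in kt, index where that window starts).

    winds_kt is a list of maximum sustained winds at a fixed interval
    (6-hourly here, which is an assumption). When the peak change occurs
    more than once, the first occurrence is returned.
    """
    lag = 24 // step_hours
    if len(winds_kt) <= lag:
        raise ValueError("need more than 24 hours of wind records")
    changes = [winds_kt[i + lag] - winds_kt[i] for i in range(len(winds_kt) - lag)]
    peak = max(changes)
    return peak, changes.index(peak)  # list.index gives the first occurrence


def classify_ri(winds_kt, thresholds=RI_THRESHOLDS_KT):
    """Label a storm as RI (True) or non-RI (False) under each threshold."""
    peak, _ = max_24h_change(winds_kt)
    # Assumption: an increase of at least the threshold counts as RI.
    return {threshold: peak >= threshold for threshold in thresholds}


# Synthetic 6-hourly wind record in kt (not real best-track data).
example_winds = [35, 40, 45, 55, 65, 75, 80, 80]
print(max_24h_change(example_winds))
print(classify_ri(example_winds))
```

In this synthetic record the peak 24-hour change occurs twice, and the first instance is the one reported, which mirrors the choice made in the study when multiple peak intensification periods exist.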

**Figure 1.** Distribution of Saffir-Simpson category per TC event type for each RI definition (25kt/24-hours, 30kt/24-hours, and 40kt/24-hours).

As a primary goal was to diagnose RI using TC structure relative to the storm center, storm-centric GEFS reforecast domains were obtained for each cyclone. Storm centers were identified by determining the local minimum in GEFS MSLP nearest the NHC-defined TC center, 24 hours prior to the timestep associated with the greatest intensification. Each variable was retained on a 15° × 11° latitude/longitude grid centered on this storm center. When peak intensification occurred more than once for an individual TC (28 times when using 25kt/24-hours, 13 times for 30kt/24-hours, and once for 40kt/24-hours), the first occurrence was chosen. Thus, the results presented herein deal with the first instance of peak intensification regardless of the frequency of peak intensification for a given TC.
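
The storm-centering step can be illustrated with a short sketch. It assumes a 1° grid stored as a list of rows, approximates "the local minimum nearest the NHC-defined center" by the lowest MSLP within a small search box around a first-guess gridpoint, and interprets the 15° × 11° domain as 11 rows by 15 columns of gridpoints. The function names, the search radius, the row/column orientation, and the synthetic pressure field are all assumptions made for illustration.

```python
def find_storm_center(mslp, guess_row, guess_col, search=3):
    """Return (row, col) of the lowest MSLP within `search` gridpoints of the first guess."""
    n_rows, n_cols = len(mslp), len(mslp[0])
    best = None
    for r in range(max(0, guess_row - search), min(n_rows, guess_row + search + 1)):
        for c in range(max(0, guess_col - search), min(n_cols, guess_col + search + 1)):
            if best is None or mslp[r][c] < mslp[best[0]][best[1]]:
                best = (r, c)
    return best


def extract_window(field, center_row, center_col, half_rows, half_cols):
    """Cut a (2*half_rows + 1) x (2*half_cols + 1) window centered on a gridpoint."""
    r0, r1 = center_row - half_rows, center_row + half_rows + 1
    c0, c1 = center_col - half_cols, center_col + half_cols + 1
    if r0 < 0 or c0 < 0 or r1 > len(field) or c1 > len(field[0]):
        raise ValueError("window extends beyond the available grid")
    return [row[c0:c1] for row in field[r0:r1]]


# Synthetic MSLP field (hPa) with a single minimum at row 10, column 13.
mslp = [[1010 + 0.1 * ((r - 10) ** 2 + (c - 13) ** 2) for c in range(25)]
        for r in range(21)]

# First guess (e.g. the gridpoint nearest a best-track position) is slightly off.
center = find_storm_center(mslp, guess_row=9, guess_col=12)

# Assumed orientation: 11 rows (half-width 5) by 15 columns (half-width 7).
window = extract_window(mslp, center[0], center[1], half_rows=5, half_cols=7)
print(center, len(window), len(window[0]))
```

The same window indices would be reused for every variable and level so that all fields share one storm-relative grid.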

#### **2.2. RPCA**

As the primary goal of this research was the identification of variables and spatial locations most favorable for distinguishing RI and non-RI storms, discriminatory statistical methods were needed. One method, rotated principal component analysis (RPCA), has been shown to be useful in discriminating meteorological environments of different types [11–14]. These studies also used permutation testing to evaluate magnitude differences in diagnostic variables for each environment. Both of these techniques were utilized in the current study so that both spatial configuration and magnitude difference could be assessed.

#### *2.2.1. S-mode RPCA*

The first approach to RPCA, S-mode analysis [13], provided a diagnosis of the spatial relationship among gridpoints for all cases. For S-mode, the similarity matrix is computed on the individual spatial locations and is eigenanalyzed to identify locations that group together. The S-mode rotated principal component (RPC) loadings are maps that show these spatial relationships (known as modes of variability), and the RPC scores reveal the similarity between the individual cases and the resulting S-mode loading maps. To reduce the dimensions of the eigenvector matrix, the RPCs were truncated by evaluating a scree plot and applying a congruence test. A congruence test is a way to measure pattern and magnitude similarity of a dataset, corresponding to the cosine of the angular separation between the loadings, by maximizing the dissimilarity of the two loading patterns [15]. The truncation point was set where the congruence coefficient indicated a strong relationship, taken as any absolute value greater than 0.81. RI and non-RI datasets (consisting only of base-state variables for all 298 cases) were combined, where the analysis of both RI and non-RI event deviations and the loading patterns provided information on how the systems group together (e.g. cooler versus warmer SSTs, upper-level trough/ridge patterns, and the influence of land at the surface 24 hours prior). To demonstrate the lack of linear separability in the resulting RPCs, a pairwise scatterplot of all six PC score vectors was produced (**Figure 2**). There is significant overlap among the RI and non-RI PC scores, which makes separation via classification very difficult and motivates the need to consider additional analysis techniques.

**Figure 2.** Pairwise scatterplots of all six PC score vectors. RI PC scores are red points, while non-RI points are blue. The significant overlap among the groups demonstrates the challenges in linearly separating these types of TCs.
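
The congruence coefficient used in the truncation test is the cosine of the angle between two loading vectors, so it can be sketched in a few lines. The function names and the loading values below are hypothetical; only the 0.81 threshold comes from the text, and how the coefficient is paired with specific loading patterns during truncation is not reproduced here.

```python
import math


def congruence(a, b):
    """Congruence coefficient: cosine of the angle between two loading vectors."""
    if len(a) != len(b):
        raise ValueError("loading vectors must have the same length")
    numerator = sum(x * y for x, y in zip(a, b))
    denominator = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    if denominator == 0:
        raise ValueError("congruence is undefined for an all-zero vector")
    return numerator / denominator


def is_strong_match(a, b, threshold=0.81):
    """True when the absolute congruence exceeds the threshold quoted in the text."""
    return abs(congruence(a, b)) > threshold


# Hypothetical loading vectors at four gridpoints.
loadings_a = [0.9, 0.8, 0.1, -0.2]
loadings_b = [0.8, 0.9, 0.0, -0.1]
print(round(congruence(loadings_a, loadings_b), 3), is_strong_match(loadings_a, loadings_b))
```

Because the absolute value is used, a pattern and its sign-reversed counterpart are treated as equally strong matches.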



#### *2.2.2. T-mode RPCA*

While S-mode helped reveal the difficulty of identifying characteristics that distinguish RI from non-RI storms, its results did not provide the discrimination capability sought in this work. Recent work has shown the value of composite analysis with T-mode RPCA in identifying discriminating characteristics for different meteorological event types [11, 12]. Following the methodology of [11, 12], a T-mode varimax-rotated RPCA [11, 16], conducted simultaneously on all GEFS reforecast fields, was completed on all RI events and all non-RI events separately. T-mode differs from S-mode in that the relationships between events, as opposed to spatial locations, are of interest, and thus the correlation matrix is computed on the event dimension of the data. Following methods established in [11, 12], the resulting uncorrelated eigenvector matrix and associated eigenvalues were reduced to a subset of RPCs for each event type and each RI definition (**Table 1**). As in the S-mode RPCA approach, the truncation point was determined using a scree plot and the congruence test. The resulting RPC loadings have the same dimension as the event dimension, so events were clustered by RPC loading magnitude using hierarchical clustering with Ward's minimum variance method [16]. To assess cluster quality, a cluster verification statistic (silhouette coefficient [17]) was computed that includes two components:

**1.** a measure of intra-cluster spread (cluster cohesion–should be small) and

**2.** a measure of inter-cluster spread (cluster separation–should be large) [11].

In this study, the mean of the silhouette coefficient values for all events considered in the cluster analysis was retained as a measure of cluster analysis performance. Silhouette coefficient values approaching 1 indicate small cluster cohesion (tight clusters) and large cluster separation, while negative values suggest that a particular event was misclustered. The cluster analysis revealed six clusters each for RI and non-RI storm types using the 25kt/24-hours definition, seven non-RI and six RI clusters using the 30kt/24-hours definition, and seven non-RI and five RI clusters using the 40kt/24-hours definition (**Table 2** provides the number of events per cluster, as well as silhouette coefficient values). Events within each cluster were averaged together, yielding map types that retained unique synoptic-scale structures and provided more detailed depictions of RI and non-RI TC environments than simply averaging all events together. The resulting composites allowed for the identification of spatial structure among RI/non-RI events.
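
The silhouette calculation described above can be sketched as follows, combining the mean distance to members of an event's own cluster (cohesion) with the smallest mean distance to any other cluster (separation). The sketch assumes Euclidean distance and assigns 0 to events in single-member clusters; the function name and the two-dimensional points standing in for RPC loadings are hypothetical, and the clustering itself (Ward's method in the study) is taken as given.

```python
import math


def silhouette_values(points, labels):
    """Return one silhouette value per point, using Euclidean distance."""
    clusters = {}
    for index, label in enumerate(labels):
        clusters.setdefault(label, []).append(index)
    if len(clusters) < 2:
        raise ValueError("the silhouette needs at least two clusters")

    values = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            values.append(0.0)  # convention for single-member clusters
            continue
        # Cohesion: mean distance to the other members of the same cluster.
        cohesion = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        # Separation: smallest mean distance to the members of any other cluster.
        separation = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        scale = max(cohesion, separation)
        values.append(0.0 if scale == 0 else (separation - cohesion) / scale)
    return values


# Hypothetical two-dimensional "loadings" for four events.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]

good = silhouette_values(points, [0, 0, 1, 1])  # compact, well-separated clusters
bad = silhouette_values(points, [0, 1, 0, 1])   # events paired with distant neighbors
print(sum(good) / len(good), sum(bad) / len(bad))
```

The mean over all events is the quantity retained in the study as the measure of cluster analysis performance; the deliberately poor labeling shows how misclustered events produce negative values.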


**Table 2.** Silhouette coefficients and number of events per cluster through Ward's method on T-mode RPC loadings.

**Variance explained (%)**

| RPCs | S-mode combined | T-mode RI (25kt/24-hours) | T-mode non-RI (25kt/24-hours) | T-mode RI (30kt/24-hours) | T-mode non-RI (30kt/24-hours) | T-mode RI (40kt/24-hours) | T-mode non-RI (40kt/24-hours) |
|---|---|---|---|---|---|---|---|
| 1 | 24 | 18 | 13 | 14 | 12 | 16 | 11 |
| 2 | 12 | 11 | 12.8 | 11 | 13 | 11 | 14 |
| 3 | 7.4 | 7.6 | 4.8 | 7.7 | 5.0 | 7.3 | 6.3 |
| 4 | 3.6 | 6.6 | 7.7 | 7.2 | 7.8 | 7.0 | 8.4 |
| 5 | 4.5 | 4.7 | 4.7 | 4.9 | 5.8 | 5.4 | 4.8 |
| 6 | 3.3 | – | 4.5 | – | – | – | – |

**Table 1.** Rotated principal components variance explained for S-mode, as well as T-mode, for each event type and each RI definition. Dashes simply mean that this number of RPCs was not retained based on the previously described testing methodologies.

#### **2.3. Permutation testing**

While the composites resulting from the RPCA approach are useful for diagnosing spatial characteristics within RI and non-RI environments, magnitude differences are diagnosed more effectively using hypothesis testing. In this study, permutation tests [16] comparing magnitudes of diagnostic fields in RI and non-RI storms were utilized at each gridpoint from the study domains, yielding a spatial map of significance values associated with each variable tested. The resulting plots provided specific regions in the study domain where statistically significant magnitude differences between RI and non-RI storms existed for individual GEFS reforecast variables. These results provided insight not only into the scope of these magnitude differences but into the spatial locations of the differences, which complement the RPCA results well.
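
A gridpoint-by-gridpoint permutation test of this kind might be sketched as below. The text does not state which test statistic was used, so the absolute difference in group means, the number of permutations, the fixed seed, and all function names are assumptions; the fields are synthetic and the sketch does not address multiple-testing or field-significance questions.

```python
import random


def _mean(values):
    return sum(values) / len(values)


def permutation_pvalue(ri_values, non_ri_values, n_perm=2000, seed=0):
    """Two-sided permutation p-value for a difference in group means."""
    rng = random.Random(seed)
    observed = abs(_mean(ri_values) - _mean(non_ri_values))
    pooled = list(ri_values) + list(non_ri_values)
    n_ri = len(ri_values)
    as_extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # randomly reassign RI / non-RI labels
        diff = abs(_mean(pooled[:n_ri]) - _mean(pooled[n_ri:]))
        if diff >= observed:
            as_extreme += 1
    # Adding 1 to both counts keeps the estimate away from exactly zero.
    return (as_extreme + 1) / (n_perm + 1)


def pvalue_map(ri_fields, non_ri_fields, **kwargs):
    """Apply the test independently at every gridpoint.

    Each argument is a list of 2-D grids (one grid per storm) on a common
    storm-centered domain.
    """
    n_rows, n_cols = len(ri_fields[0]), len(ri_fields[0][0])
    return [
        [
            permutation_pvalue(
                [field[r][c] for field in ri_fields],
                [field[r][c] for field in non_ri_fields],
                **kwargs,
            )
            for c in range(n_cols)
        ]
        for r in range(n_rows)
    ]


# Synthetic 1 x 2 grids for eight RI and eight non-RI storms: the groups differ
# strongly at the first gridpoint and are identical at the second.
ri_fields = [[[10.0 + i, float(i)]] for i in range(8)]
non_ri_fields = [[[float(i), float(i)]] for i in range(8)]
p_map = pvalue_map(ri_fields, non_ri_fields, n_perm=2000, seed=0)
print(p_map)
```

Plotting such a map of p-values for each variable would highlight where in the storm-centered domain the RI and non-RI magnitudes differ.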
