### 2.2. SPEEDY model

The SPEEDY computer code is an AGCM developed to study global-scale dynamics and to test new approaches for numerical weather prediction (NWP). The dynamic variables of the primitive meteorological equations are integrated by the spectral method in the horizontal at each vertical level; more details are given in [3, 21]. The model has a simplified set of physical parameterization schemes similar to those of realistic NWP models. The goal of this model is computational efficiency while maintaining characteristics similar to state-of-the-art AGCMs with complex physics parameterizations [32].

According to Ref. [36], the SPEEDY model simulates the general structure of the global atmospheric circulation (Figure 2), and some aspects of its systematic errors are similar to many errors of operational AGCMs. The package is based on the physical parameterizations adopted in more complex AGCM schemes, such as convection (simplified mass-flux scheme), large-scale condensation, clouds, shortwave radiation (two spectral bands), longwave radiation (four spectral bands), surface fluxes of momentum and energy (aerodynamic formulas), and vertical diffusion. Details of the simplified physical parameterization scheme can be found in Ref. [36].

Figure 2. Schematic for a global atmospheric model. Source: Center for Multiscale Modeling of Atmospheric Processes.

The boundary conditions of the SPEEDY model include topographic height and a land-sea mask, which are constant. Sea surface temperature (SST), sea ice fraction, surface temperature in the top soil layer, moisture in the top soil layer and the root-zone layer, and snow depth are all specified by monthly means. Annual-mean fields specify bare-surface albedos and the fraction of land surface covered by vegetation. The lower boundary conditions, such as SST, are obtained from the ECMWF reanalysis for the period 1981–1990. The incoming solar radiation flux and the boundary conditions are updated daily. The SPEEDY model is a hydrostatic model in sigma coordinates. Ref. [3] also describes the vorticity-divergence transformation scheme.

The SPEEDY model is global with spectral resolution T30L7 (triangular horizontal truncation at wavenumber 30, with 7 vertical levels). The vertical coordinate is sigma ($\sigma = p/p_s$, where $p_s$ is the surface pressure), and the 7 sigma surfaces correspond to the pressure levels 100, 200, 300, 500, 700, 850, and 925 hPa. The horizontal coordinates are latitude and longitude on a regular grid with 96 zonal points (longitude) and 48 meridional points (latitude). The schematic of the global model and its physical packages can be seen in Figure 2. The prognostic variables for the model input and output are absolute temperature (T), surface pressure (ps), zonal wind component (u), meridional wind component (v), and specific humidity (q).
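As a concrete illustration of the state layout just described, the T30L7 prognostic fields can be held in arrays shaped by the 96 × 48 regular grid and the 7 sigma levels. This is a hypothetical sketch, not code from the SPEEDY distribution; the variable names follow the notation above.

```python
import numpy as np

# Hypothetical arrays holding the SPEEDY prognostic state on the regular
# T30L7 grid: 96 longitudes x 48 latitudes x 7 sigma levels.
NLON, NLAT, NLEV = 96, 48, 7

state = {
    "u":  np.zeros((NLEV, NLAT, NLON)),   # zonal wind component
    "v":  np.zeros((NLEV, NLAT, NLON)),   # meridional wind component
    "T":  np.zeros((NLEV, NLAT, NLON)),   # absolute temperature
    "q":  np.zeros((NLEV, NLAT, NLON)),   # specific humidity
    "ps": np.zeros((NLAT, NLON)),         # surface pressure (2-D field)
}

# Approximate pressure levels (hPa) of the 7 sigma surfaces.
plevels = [100, 200, 300, 500, 700, 850, 925]

# Length of the full state vector assembled from all variables.
n_state = sum(a.size for a in state.values())
```

Note that only `ps` is two-dimensional; the other four variables carry all seven levels, so the assembled state vector has 4 × 7 × 48 × 96 + 48 × 96 entries.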

back-propagation supervised algorithm, the adjustments to the weights are conducted by back-propagating the error, and the target output is considered the supervisor. Ref. [14] included brief introductions of MLP and the back-propagation algorithm.

NN applications are generally on function approximation, modeling of nonlinear transfer functions, and pattern classification. Refs. [18, 25] reviewed applications of NN in environmental science, including the atmospheric sciences. They reviewed some NN concepts and some NN applications; these reviews also covered other estimation methods and their applications. Other reviews of NN applications in the atmospheric sciences, looking at the prediction of air quality, surface ozone concentration, dioxide concentrations, severe weather, etc., and pattern-classification applications in remote sensing data to distinguish clouds from ice or snow, were presented by [14]. Refs. [18, 25] also presented applications on the classification of atmospheric circulation patterns, land cover, and convergence lines from radar imagery, and the classification of remote sensing data using NN. Data assimilation was not mentioned in such reviews.

### 2.3. Brief description of the local ensemble transform Kalman filter

The analysis is the best estimate of the state of the system based on the optimizing criteria. The probabilistic state-space formulation, and the requirement to update the information when new observations are encountered, are ideally suited to the Bayesian approach. Here, one attempts to construct the posterior probability density function (pdf) of the state using all available information, including the set of received observations. Since this pdf embodies all available statistical information, it may be considered the complete solution to the estimation problem; sequential Monte Carlo methods provide an efficient and flexible way of approximating this pdf for the optimal filtering problem.

In the field of data assimilation, there are only a few contributions in sequential estimation (EnKF or PF filters). The EnKF was first proposed by [11] and was further developed by [4, 12]. It is related to particle filters [1, 10] in the sense that a particle is identified with an ensemble member. The EnKF is a sequential method: the model is integrated forward in time and, whenever observations are available, they are used to reinitialize the model before the integration continues. The EnKF originated as a version of the extended Kalman filter (EKF) [28]. The classical KF method [27] is optimal, in the sense of minimizing the variance, only for linear systems and Gaussian statistics. Analysis perturbations are added to initialize the ensemble forecasts, and the spread of the ensemble forecasts about their mean provides the estimation error for the analysis. Ref. [35] added Gaussian white noise to run the same forecast for each member of the ensemble in the LETKF. The EnKF is a Monte Carlo integration that governs the evolution of the pdf describing the a priori state, the forecast, and the error statistics. In the analysis step, each ensemble member is updated according to the KF scheme, with the covariance matrix replaced by the sample covariance computed from the ensemble forecasts. Ref. [23] applied the EnKF to an atmospheric system, using an ensemble of model states to represent the statistical model error. The analysis scheme acts directly on the ensemble of model states when observations are assimilated; the analysis ensemble is obtained by assimilation for each member of the reference model. Several methods have been developed to represent the model-error covariance matrix for the analysis within the EnKF approach; the local ensemble transform Kalman filter (LETKF) is one of them. Ref. [26] proposed the LETKF scheme as an efficient upgrade of the local ensemble Kalman filter (LEKF).
The LEKF algorithm establishes a close relationship between local dimensionality, error growth, and the skill of the ensemble in capturing the space of forecast uncertainties formulated with the EnKF scheme (e.g., [45]). In addition, Ref. [29] describes the theoretical foundation of the operational practice of using small ensembles for predicting the evolution of uncertainties in high-dimensional operational NWP models.

The LETKF scheme is a model-independent algorithm to estimate the state of a large spatiotemporal chaotic system [38]. The term "local" refers to an important feature of the scheme: it solves the Kalman filter equations locally in model grid space. It is a kind of ensemble square-root filter [32, 45], in which the analysis ensemble members are constructed as linear combinations of the forecast ensemble members. The ensemble transform matrix, composed of the weights of the linear combinations, is computed for each local subset of the state vector independently, which allows essentially parallel computation. The local subset depends on the error covariance localization [33]. Typically, a local subset of the state vector contains all variables at a grid point. The LETKF scheme first separates the global grid vector into local patch vectors with observations. The basic idea of LETKF is to perform the analysis at each grid point simultaneously, using the state variables and all observations in a region centered at the given grid point. The local strategy gathers groups of neighboring observations around a central point for a given region of the model grid. Each grid point has a local patch; the number of local vectors is the same as the number of global grid points [35].
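The local-patch selection described above can be sketched as follows. This is an illustrative helper only: the grid-index distance metric and the radius are assumptions for the sketch, not the chapter's localization, which follows Ref. [33].

```python
import numpy as np

NLON = 96  # zonal points of the T30 regular grid (periodic in longitude)

def local_patch_obs(center, obs_locations, radius):
    """Return indices of the observations inside the local patch of one
    analysis grid point.

    center        : (lat_index, lon_index) of the analysis grid point
    obs_locations : (n_obs, 2) array of observation grid indices
    radius        : localization radius in grid-index units (assumed metric)
    """
    d = np.abs(obs_locations - np.asarray(center))
    d[:, 1] = np.minimum(d[:, 1], NLON - d[:, 1])  # wrap around longitude
    dist = np.sqrt((d ** 2).sum(axis=1))
    return np.where(dist <= radius)[0]

# Every one of the 48 * 96 grid points gets its own local patch, so the
# per-point analyses are independent and can run in parallel.
```

Because each grid point's patch is built independently, the number of local vectors equals the number of global grid points, as stated above.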

The EnKF algorithm follows the sequential assimilation steps of the classical Kalman filter, but it calculates the error covariance matrices as described below:

Each ensemble member provides a forecast $\left\{\mathbf{x}^{f}\right\}^{(i)}$, $i = 1, 2, \dots, k$, where $k$ is the total number of members at time $t_n$, to estimate the state vector $\mathbf{x}^{f}$ of the reference model. The ensemble is used to calculate the forecast mean $\overline{\mathbf{x}}^{f}$:

$$\overline{\mathbf{x}}^{f} \equiv k^{-1} \sum\_{i=1}^{k} \left\{\mathbf{x}^{f}\right\}^{(i)}.\tag{4}$$

Therefore, the model error covariance matrix:

$$P^{f} = (k-1)^{-1} \sum\_{i=1}^{k} \left( \left\{\mathbf{x}^{f}\right\}^{(i)} - \overline{\mathbf{x}}^{f} \right) \left( \left\{\mathbf{x}^{f}\right\}^{(i)} - \overline{\mathbf{x}}^{f} \right)^{T}. \tag{5}$$

The analysis step determines a state estimate to each ensemble member:

Data Assimilation by Artificial Neural Networks for an Atmospheric General Circulation Model http://dx.doi.org/10.5772/intechopen.70791 273

$$\left\{\mathbf{x}^{a}\right\}^{(i)} = \left\{\mathbf{x}^{f}\right\}^{(i)} + W\_{K} \left[\mathbf{x}^{obs} - H\left(\left\{\mathbf{x}^{f}\right\}^{(i)}\right)\right] \tag{6}$$

$$W\_{K} = P^{f} H^{T} \left[ H P^{f} H^{T} + R \right]^{-1}. \tag{7}$$

The analyses $\left\{\mathbf{x}^{a}\right\}^{(i)}$, $i = 1, 2, \dots, k$, are obtained from Eq. (6) by solving Eq. (7) for $W_{K}$, the optimal weight (i.e., the Kalman gain). The matrix $H$ represents the observation operator, and the covariance matrix $R$ describes the observation error. The analysis step also updates the error covariance matrix $P^{a}$ (Eq. 8)

$$P^{a} = (k-1)^{-1} \sum\_{i=1}^{k} \left( \left\{\mathbf{x}^{a}\right\}^{(i)} - \overline{\mathbf{x}}^{a} \right) \left( \left\{\mathbf{x}^{a}\right\}^{(i)} - \overline{\mathbf{x}}^{a} \right)^{T} \tag{8}$$

with the corresponding ensemble analysis mean:


$$\overline{\mathbf{x}}^{a} \equiv k^{-1} \sum\_{i=1}^{k} \left\{\mathbf{x}^{a}\right\}^{(9)}.\tag{9}$$
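The ensemble update in Eqs. (4)–(9) can be sketched numerically as follows. This is an illustrative implementation with a generic linear observation operator; the perturbed-observation update used here (Gaussian noise added per member, in the spirit of the remark about Ref. [35] above) is one common EnKF variant, and all dimensions and values are invented.

```python
import numpy as np

def enkf_analysis(Xf, H, R, yobs, seed=0):
    """One EnKF analysis step following Eqs. (4)-(9).

    Xf   : (n, k) forecast ensemble, one state column per member
    H    : (m, n) linear observation operator
    R    : (m, m) observation-error covariance matrix
    yobs : (m,)  observation vector
    """
    n, k = Xf.shape
    xf_mean = Xf.mean(axis=1, keepdims=True)              # Eq. (4)
    A = Xf - xf_mean
    Pf = A @ A.T / (k - 1)                                # Eq. (5)
    W = Pf @ H.T @ np.linalg.inv(H @ Pf @ H.T + R)        # Eq. (7), Kalman gain
    rng = np.random.default_rng(seed)
    Xa = np.empty_like(Xf)
    for i in range(k):
        # Eq. (6), with Gaussian noise added to the observations for each
        # member (perturbed-observation variant; an assumption here).
        yi = yobs + rng.multivariate_normal(np.zeros(len(yobs)), R)
        Xa[:, i] = Xf[:, i] + W @ (yi - H @ Xf[:, i])
    xa_mean = Xa.mean(axis=1, keepdims=True)              # Eq. (9)
    Pa = (Xa - xa_mean) @ (Xa - xa_mean).T / (k - 1)      # Eq. (8)
    return Xa, Pa
```

The sample covariances replace the full covariance matrices of the classical KF, which is what makes the scheme affordable for high-dimensional states.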

The LETKF scheme has been applied to the low-dimensional AGCM SPEEDY model [32], a realistic model according to [42]. The LETKF scheme was also employed in the following: the AGCM for the Earth Simulator by [35]; the Japan Meteorological Agency operational global and mesoscale models by [34]; the Regional Ocean Modeling System by [41]; the Geophysical Fluid Dynamics Laboratory (GFDL) global ocean model by [39]; and the GFDL Mars AGCM by [15].

### 3. MLP-DA in assimilation for SPEEDY model

The NN configuration for this experiment is a set of multilayer perceptrons, hereafter referred to as MLP-DA. In the present paper, the NN configuration (number of layers, nodes per layer, activation function, and learning rate parameter) was defined by empirical tests, and we found the following characteristics:


The input vectors represent individual grid points for a single variable, with a corresponding observation value at the model point location. A grid point is considered only where an observation value exists; see Figure 3. In the training algorithm, the MLP-DA computes its output and compares it with the analysis vector of the LETKF results (the target data); the target serves only as the supervisor during training and is not an input node for the MLP generalization. The output vector likewise represents the analysis value for one grid point. Care must be taken in specifying the number of neurons. Too many neurons can lead the NN to memorize the training data (overfitting) instead of extracting the general features that allow generalization. Too few neurons may force the NN to spend too much time trying to find an optimal representation, wasting valuable computation time.

One strategy used to collect data and to accelerate the MLP-DA training was to divide the entire globe into six regions: the Northern Hemisphere (0°–90°N) into three longitudinal regions of 120° each, and the Southern Hemisphere (0°–90°S) into three longitudinal regions of 120° each. This division gives each region the same size, but the number of observations is distinct, as illustrated by Figure 3. This regional division is applied only to the MLP-DA; the LETKF procedures are not modified.
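The six-region partition can be sketched with a small helper. This is hypothetical code; the exact boundary convention (which hemisphere an equatorial point belongs to, where the longitude sectors start) is not specified in the chapter and is assumed here.

```python
def region_index(lat, lon):
    """Map a location to one of six equal regions: hemisphere (N/S)
    times three 120-degree longitude sectors.

    lat in [-90, 90], lon in [0, 360). Returns an integer 0..5.
    The convention lat >= 0 -> Northern and sectors starting at 0
    degrees longitude is an assumption for illustration.
    """
    hemisphere = 0 if lat >= 0 else 1     # 0 = Northern, 1 = Southern
    sector = int(lon % 360) // 120        # 0, 1, or 2
    return hemisphere * 3 + sector
```

Each of the 30 networks described below is then trained only on the grid points whose `region_index` matches its region.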

The MLP-DA scheme was developed with a set of 30 NNs (six regions times five prognostic variables: ps, u, v, T, and q). Each grid point carries the values of all vertical layers of the model. One MLP with the characteristics described above was designed for each meteorological variable of the SPEEDY model and each region. Each MLP has two inputs (the model and observation vectors) and one output neuron, which yields the analysis vector; the training scheme is the back-propagation algorithm.
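One such MLP can be sketched numerically as below. The hidden-layer size, learning rate, and activation are illustrative guesses, not the chapter's empirically tuned configuration, and the toy supervisor stands in for actual LETKF analyses.

```python
import numpy as np

rng = np.random.default_rng(42)

# Tiny MLP: two inputs (model value, observation value), one hidden layer,
# one output neuron (analysis value). Sizes and rate are assumptions.
W1 = rng.normal(scale=0.5, size=(8, 2)); b1 = np.zeros(8)
W2 = rng.normal(scale=0.5, size=(1, 8)); b2 = np.zeros(1)
lr = 0.05

def forward(x):
    h = np.tanh(W1 @ x + b1)
    return (W2 @ h + b2)[0], h

def train_step(x, target):
    """One back-propagation step on the squared error to the target."""
    global W1, b1, W2, b2
    y, h = forward(x)
    err = y - target                      # the target plays the supervisor
    gh = err * W2[0] * (1.0 - h ** 2)     # back-propagate through tanh
    W2 -= lr * err * h[None, :]
    b2 -= lr * np.array([err])
    W1 -= lr * np.outer(gh, x)
    b1 -= lr * gh

# Toy supervisor in place of the LETKF analysis: here the "analysis" is
# simply the average of the model value and the observation value.
for _ in range(3000):
    x = rng.uniform(-1, 1, size=2)
    train_step(x, 0.5 * (x[0] + x[1]))
```

After training, `forward` maps a (model, observation) pair to an analysis estimate, which is the role each of the 30 networks plays for its variable and region.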

The MLP-DA is designed to emulate the global LETKF analysis for the SPEEDY initial condition. The LETKF analysis is the mean field of an ensemble of analyses. Fortran codes for SPEEDY and LETKF [32] were adapted to create the training data set for that period. The upper-level and surface covariance error matrices used to run the LETKF system, as well as the SPEEDY model boundary conditions and physical parameterizations, are the same as those used in Miyoshi's experiments.

Figure 3. Observation localizations in the global area. The dots represent radiosonde stations (about 415) divided into six regions.

The initial process is the implementation of the model, which is assumed to be perfect (initialization = 0); the SPEEDY model (T30L7) was integrated for 1 year of spin-up, i.e., the period required for the model to reach a steady state and produce a simulated atmosphere. The model ran (without interruption) four times per day, from 01 January 1981 until 31 December 1981, and the last result is the initial condition for SPEEDY at 0000 UTC on 01 January 1982. The model fields, the so-called "true" (or control) model, are generated without data assimilation (each 6-hour forecast is the initial condition for the next execution). The "true" model forecasts were collected from executions without DA, four times per day (0000, 0600, 1200, and 1800 UTC); the model ran from 01 January 1982 through 31 December 1984, and the output was collected for each run.
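The free-run cycling over 1982–1984 can be sketched as follows. This is a toy stand-in: `run_model_6h` is a hypothetical placeholder for the real SPEEDY integration, and only the chaining of initial conditions is illustrated.

```python
from datetime import datetime, timedelta

def run_model_6h(state):
    """Hypothetical stand-in for a 6-hour SPEEDY forecast."""
    return state + 1   # a real model would integrate the dynamics here

# Free run without assimilation: each 6-hour forecast becomes the initial
# condition for the next execution, four times per day (00/06/12/18 UTC).
state = 0                              # initial condition from the spin-up
t = datetime(1982, 1, 1, 0)
end = datetime(1984, 12, 31, 18)
truth = []
while t <= end:
    truth.append((t, state))           # collected "true" (control) fields
    state = run_model_6h(state)
    t += timedelta(hours=6)
```

Over the three years (1096 days, 1984 being a leap year) this yields 4384 collected "true" states, one per 6-hour cycle.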

The synthetic observations are generated by reading the "true" SPEEDY model fields and adding random noise to the meteorological variables: surface pressure (ps), zonal wind component (u), meridional wind component (v), absolute temperature (T), and specific humidity (q). The observations are located on model grid points. An observation mask is designed by adding a positive flag to the grid points where observations should be considered; the locations simulate the WMO radiosonde station observations (Figure 3). Except for the ps observations, the observations are upper-level fields with seven levels. Both assimilation schemes, LETKF and MLP-DA, use the same number of observations at the same grid points, i.e., the same observation localization mask.
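A minimal sketch of this observation-generation step for one field follows. It is illustrative only: the noise level `sigma` and the random station placement are assumptions, whereas in the experiment the mask follows the radiosonde locations of Figure 3.

```python
import numpy as np

rng = np.random.default_rng(0)

def synthetic_obs(truth_field, mask, sigma):
    """Synthetic observations: the "true" field plus Gaussian noise,
    defined only at the flagged (mask == True) grid points."""
    noisy = truth_field + rng.normal(scale=sigma, size=truth_field.shape)
    return np.where(mask, noisy, np.nan)

# Example on the 48 x 96 SPEEDY grid with ~415 station locations; the
# random placement and sigma below are invented for illustration.
field = rng.normal(loc=280.0, scale=5.0, size=(48, 96))  # e.g. T in K
mask = np.zeros((48, 96), dtype=bool)
stations = rng.choice(48 * 96, size=415, replace=False)
mask.ravel()[stations] = True
obs = synthetic_obs(field, mask, sigma=1.0)
```

Because both LETKF and MLP-DA read the same `mask`, the two schemes see identical observation counts and locations, which keeps the comparison fair.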
