1. Introduction

For operational weather forecasting systems, one of the challenges is to obtain the most appropriate initial conditions to ensure the best prediction from a physical-mathematical model that represents the evolution of the atmospheric dynamics. By smoothly blending data from observations and model predictions, the assimilation process carries out a set of procedures to determine the best initial condition. Observed atmospheric data are used to create meteorological fields over some spatial and/or temporal domain.

The analysis, i.e., the initial condition, for numerical weather prediction (NWP) is a combination of measurements and model predictions that represents the state of the modeled system as accurately as possible. The analysis is useful in itself as a description of the physical system, but it can also be used as the initial state for the further time evolution of the system [22]. Data assimilation methods have been studied for atmospheric and oceanic prediction, as well as for other fields such as ionospheric and hydrological dynamics. The algorithms that have been applied vary in complexity, optimality, and formulation. The Bayesian scheme [31] uses ensembles of prediction model integrations in which perturbations are added to the initial conditions and the model formulation; the mean of the ensemble forecasts can be interpreted as a probabilistic prediction. The ensemble Kalman filter (EnKF) [11, 23] uses a probability density function associated with the initial condition, as is characteristic of Bayesian approaches [9], and represents the model errors by an ensemble of estimates in state space. The Kalman filter (KF) [27] is a well-established technique for estimating the initial condition of a linear dynamical system. A useful overview of the most common data assimilation methods used in meteorology and oceanography, with detailed mathematical formulations, can be found in texts such as Daley [9] and Kalnay [29].
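To make the idea of combining a measurement with a model prediction concrete, the following minimal sketch shows the Kalman filter analysis step for a single scalar variable that is observed directly (observation operator equal to 1). The function name, argument names, and numerical values are illustrative assumptions and are not taken from the experiments in this paper.

```python
def scalar_analysis(x_f, var_f, y_o, var_o):
    """Combine a forecast and an observation of the same scalar quantity.

    x_f, var_f: forecast value and its error variance (hypothetical inputs).
    y_o, var_o: observed value and its error variance (hypothetical inputs).
    Returns the analysis value and its error variance.
    """
    gain = var_f / (var_f + var_o)   # Kalman gain when the state is observed directly
    x_a = x_f + gain * (y_o - x_f)   # forecast corrected by the weighted innovation
    var_a = (1.0 - gain) * var_f     # analysis variance is never larger than var_f
    return x_a, var_a


# Illustrative numbers only: equal error variances place the analysis
# halfway between the forecast and the observation.
x_a, var_a = scalar_analysis(x_f=280.0, var_f=1.0, y_o=282.0, var_o=1.0)
print(x_a, var_a)
```

The gain weights the observation by the relative size of the forecast error variance, so a very uncertain observation leaves the forecast almost unchanged.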

Modern data assimilation (DA) techniques represent a computational challenge, even with the use of parallel computing on thousands of processors. Operational NWP now uses higher-resolution models, and the number of observations is growing exponentially with the launch of new satellites. The analysis (initial condition) must be computed quickly enough for the models to run and deliver a prediction on time. The computational challenge for DA techniques lies in the millions of equations involved in NWP models.

DA algorithms are constantly updated to improve their performance. One example is the local ensemble Kalman filter (LEKF) [38], a version of the EnKF [11] restricted to small (local) areas. We propose the application of artificial neural networks (NNs) as a DA technique to obtain a good-quality analysis while addressing the computational challenge.

The application of NNs was first suggested as a possible technique for data assimilation by [24, 30, 43]. Research on NNs as a data assimilation method was initiated at INPE (National Institute for Space Research) by Nowosad [37], see also [44, 5]; these studies used one NN over the entire spatial domain. Later, the method was improved by [16, 17], who modified the NN application so that the analysis was obtained at each grid point instead of at all points of the domain at once. They also evaluated the performance of two feedforward NNs (multilayer perceptron and radial basis function) and two recurrent NNs (Elman and Jordan, see the description in [19, 20]) [17]. Ref. [13] applied NNs to emulate the particle filter and variational data assimilation (4D-Var) for the Lorenz chaotic system. In 2012, Furtado [40] used NNs with an ocean model to emulate a variational method called the representer method. The NN technique was successful in all of these experiments, but they used theoretical or low-dimensional models. In 2010, Refs. [6, 7] applied this supervised NN approach to an atmospheric general circulation model (AGCM) to emulate the local ensemble transform Kalman filter (LETKF). That experiment, the first to use a 3D global atmospheric model, is the one described in this paper. Research on the NN methodology continues, see [41, 42], where the method is applied to the FSU (Florida State University) AGCM, again to emulate the LETKF data assimilation method.

In every experiment, NNs were applied to mimic other data assimilation methods and obtain the analyses used to initialize the forecast models. The NNs do not use an estimate of the model error or of the observation error. The main advantage of using NNs is the speed-up of the data assimilation process.

This paper presents an approach based on a set of multilayer perceptron (MLP) NNs (Section 3) employed to emulate the LETKF. The LETKF analysis was used as the reference, see [29, 32] and Section 2.3. More information about the LETKF can be found in [2, 26, 35]. The initial conditions generated by the NNs are applied to a nonlinear dynamical system, the Simplified Parameterizations PrimitivE-Equation Dynamics (SPEEDY) AGCM. The DA method is tested with synthetic conventional data, simulating measurements from surface stations (every 6 hours) and upper-air soundings (every 12 hours). The NN significantly reduces the computational effort compared with the LETKF. The goal of the NN approach is to obtain analyses of similar quality with better computational performance for the prediction process.
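The two observation frequencies mentioned above can be sketched as a simple schedule. This is only an illustration: the alignment of both cycles with 00 UTC, the function name, and the type labels are assumptions, not details stated in the text.

```python
def available_observations(hour):
    """Return the synthetic observation types assumed available at a given hour.

    Assumes (not stated in the text) that both cycles start at 00 UTC:
    surface data every 6 hours, upper-air soundings every 12 hours.
    """
    if not 0 <= hour < 24:
        raise ValueError("hour must be in the range 0..23")
    types = []
    if hour % 6 == 0:
        types.append("surface")      # surface stations report every 6 hours
    if hour % 12 == 0:
        types.append("upper_air")    # soundings are available every 12 hours
    return types


# Observation types for the four 6-hourly assimilation times of one day.
schedule = {h: available_observations(h) for h in range(0, 24, 6)}
print(schedule)
```

Under these assumptions, every second assimilation cycle has both surface and upper-air data, while the cycles in between have surface data only.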

Summarizing, the NN technique uses the function:

$$\mathbf{x}^{a} = F_{\text{NN}}\left(\mathbf{y}^{o}, \mathbf{x}^{f}\right) \tag{1}$$

where $F_{\text{NN}}$ is the data assimilation process, $\mathbf{y}^{o}$ represents the observations, $\mathbf{x}^{f}$ is a model forecast (simulated), and $\mathbf{x}^{a}$ is the analysis field.
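As a rough illustration of Eq. (1), the sketch below evaluates a one-hidden-layer MLP at a single grid point, taking the observation and the forecast as inputs and returning a value that plays the role of the analysis. The layer size, the tanh activation, and all weights are hypothetical placeholders; in practice the weights would be fitted so that the output reproduces a reference analysis.

```python
import math


def mlp_analysis(y_o, x_f, w_hidden, b_hidden, w_out, b_out):
    """Forward pass of a one-hidden-layer MLP at a single grid point.

    y_o, x_f: observation and forecast at the grid point (assumed normalized).
    w_hidden, b_hidden: hidden-layer weights (one row per neuron) and biases.
    w_out, b_out: output-layer weights and bias.
    All weights here are hypothetical; they are not trained values.
    """
    inputs = (y_o, x_f)
    hidden = [
        math.tanh(sum(w * v for w, v in zip(row, inputs)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    # Linear output neuron: weighted sum of hidden activations plus bias.
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


# Two hidden neurons with made-up weights, for illustration only.
w_hidden = [[0.5, 0.5], [0.1, -0.1]]
b_hidden = [0.0, 0.0]
w_out = [1.0, 0.0]
b_out = 0.0
x_a = mlp_analysis(0.2, 0.4, w_hidden, b_hidden, w_out, b_out)
print(x_a)
```

Once the weights are fixed, computing the analysis is a single forward pass per grid point, which is where the speed-up over an ensemble method comes from.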

The observations used in operational data assimilation are conventional and satellite data. The observations include surface and upper-air measurements; here, we simulate observations of one type of measurement, meteorological balloons. The grid of synthetic observations seeks to reproduce the World Meteorological Organization (WMO) network of radiosonde stations.

The experiment was conducted using the SPEEDY model [3, 21], a 3D global atmospheric model with the simplified physics parameterization of [36]. The spatial resolution is T30 L7 for the spectral method explained in Section 2.2. This paper shows that the analysis computed by the NN has a quality similar to that of the analysis produced by the LETKF, at a lower computational cost.
