1. Introduction

Artificial neural networks (ANNs), which are mathematical models for function approximation, classification, pattern recognition, nonlinear control, and related tasks, have been applied successfully to time series analysis and forecasting since the 1980s [2–7], in place of linear models such as the ARIMA models of the 1970s [1]. In [2], Casdagli used a radial basis function network (RBFN), a feed-forward neural network with Gaussian hidden units, to predict chaotic time series such as the Mackey-Glass equation, the Ikeda map, and the Lorenz system in 1989. In [3, 4], Lendasse et al. organized a time series forecasting competition for neural network prediction methods, held since 2004, using a five-block artificial time series named CATS. The goal of the CATS competition was to predict 100 missing values of the time series, arranged in five sets each containing 980 known values followed by 20 successive unknown values (details are given in Section 3.1). There were 24 submissions to the competition, and five kinds of methods were selected by IJCNN2004: filtering techniques, including Bayesian methods, Kalman filters, and so on; recurrent neural networks (RNNs); vector quantization; fuzzy logic; and ensemble methods. As the organizers commented, different prediction precisions were reported even though similar prediction methods were used, owing to the know-how and experience of the authors. The development of time series forecasting with ANNs is therefore still an open problem.

Whether used as classifiers or as function approximators, ANNs owe their advantages to the nonlinear transformations they apply to the input space. Units (or neurons) with nonlinear activation functions, connected to one another, typically produce higher-dimensional output spaces and a variety of feature spaces within the network. Moreover, as a connectionist system, an ANN does not require a fixed mathematical model to be designed for each nonlinear phenomenon; it is enough to adjust the weights of the connections between units. According to the report of the NN3 Artificial Neural Networks and Computational Intelligence Forecasting Competition [5], more than 5000 publications on time series forecasting with ANNs had appeared by 2007.

To find suitable parameters of an ANN, such as the connection weights between neurons, the error back-propagation (BP) algorithm [6] is generally used in training. However, because every sample (a pair of input and output data) is used in the BP method, noisy data influences the optimization of the model, and the robustness of the model to unknown inputs becomes weak. Another problem of ANN models is how to determine the structure of the network, i.e., the number of layers and the number of neurons in each layer. To overcome these problems of BP, Kuremoto et al. [7] adopted a reinforcement learning (RL) method, stochastic gradient ascent (SGA) [8], to adjust the connection weights, and particle swarm optimization (PSO) to find the optimal structure of the ANN. SGA, proposed by Kimura and Kobayashi, improves on Williams' REINFORCE algorithm [9], which uses rewards to modify the stochastic policies (likelihoods). In the SGA learning algorithm, an accumulated modification of the policy called the "eligibility trace" is used to adjust the parameters of the model (see Section 2). In the case of time series forecasting, the reward of the RL system can be defined by a suitable error zone instead of the distance (error) between the output of the model and the teacher data used in the BP learning algorithm; as a result, the sensitivity to noisy data can be reduced, and the robustness to unknown data may be raised. As a deep learning method for time series forecasting, Kuremoto et al. [10] first applied Hinton and Salakhutdinov's deep belief net (DBN), a kind of stacked auto-encoder (SAE) composed of multiple restricted Boltzmann machines (RBMs) [11]. An improved DBN for time series forecasting was proposed in [12], in which the DBN is composed of multiple RBMs and a multilayer perceptron (MLP) [6]. The improved DBN with RBMs and MLP [12] is superior to the conventional DBN [10] for time series forecasting because a continuous output unit is used, whereas the conventional model had a binary-valued unit in its output layer.
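To make the contrast between the two training signals concrete, the following minimal Python sketch compares the squared error used by BP with a zone-type reward. The threshold name `epsilon`, the 0/1 reward values, and the sample numbers are illustrative assumptions, not the exact definitions used in [7, 15, 16].

```python
import numpy as np

def bp_loss(y_pred, y_true):
    """Squared error used by the BP learning algorithm: every sample,
    including a noisy one, pulls the weights in proportion to its error."""
    return 0.5 * (y_pred - y_true) ** 2

def zone_reward(y_pred, y_true, epsilon=0.05):
    """Illustrative RL-style reward: the model is rewarded whenever its
    output falls inside an error zone of width epsilon around the teacher
    data, and receives no reward otherwise."""
    return 1.0 if abs(y_pred - y_true) <= epsilon else 0.0

# A slightly wrong prediction versus an outlier target:
print(bp_loss(0.52, 0.50), zone_reward(0.52, 0.50))   # small loss, reward 1
print(bp_loss(0.52, 1.50), zone_reward(0.52, 1.50))   # large loss, reward 0
```

With such a reward, an isolated outlier simply earns no reward rather than producing a large gradient, which is the intuition behind the improved robustness to noisy data claimed above.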

Just as the RL method SGA has been applied to the MLP, the RBFN, and the self-organized fuzzy neural network (SOFNN) [7], the prediction precision of a DBN trained with SGA may also be raised compared with the BP learning algorithm. Furthermore, the prediction precision can be improved by a hybrid model that first forecasts the future data with the linear ARIMA model and then corrects the forecast with the prediction error estimated by an ANN trained on the error time series [13, 14].
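A minimal sketch of such a hybrid is given below, assuming statsmodels' ARIMA and scikit-learn's MLPRegressor as stand-ins for the linear model and the ANN; the synthetic series, the ARIMA order (2, 1, 1), and the lag length are arbitrary choices for illustration and are not taken from [13, 14].

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA
from sklearn.neural_network import MLPRegressor

# Synthetic series: trend + nonlinearity + noise (illustrative data).
rng = np.random.default_rng(0)
t = np.arange(300)
series = 0.05 * t + np.sin(0.2 * t) + 0.1 * rng.standard_normal(300)

# Step 1: forecast the linear part with ARIMA (order chosen arbitrarily).
arima = ARIMA(series, order=(2, 1, 1)).fit()
residuals = series - arima.fittedvalues          # the error time series

# Step 2: train an ANN (here an MLP) on lagged residuals.
n_lag = 5
X = np.array([residuals[i:i + n_lag] for i in range(len(residuals) - n_lag)])
y = residuals[n_lag:]
mlp = MLPRegressor(hidden_layer_sizes=(16,), max_iter=3000,
                   random_state=0).fit(X, y)

# Step 3: hybrid one-step forecast = ARIMA forecast + predicted error.
next_linear = arima.forecast(steps=1)[0]
next_error = mlp.predict(residuals[-n_lag:].reshape(1, -1))[0]
print(next_linear + next_error)
```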

In this chapter, we concentrate on the DBN composed of multiple RBMs and an MLP, and we show the higher efficiency of the RL learning method SGA for this DBN [15, 16] compared with the conventional BP learning method, using the results of time series forecasting experiments. Several kinds of benchmark data were used in the experiments, including the artificial time series CATS [3], natural phenomenon time series data provided by Aalto University [18], and data from the TSDL [18].

2. The DBN model for time series forecasting

2.1 The structure of the model

A deep belief net (DBN) composed of restricted Boltzmann machines (RBMs) and a multilayer perceptron (MLP) is shown in Figure 1.

Figure 1. The structure of DBN for time series forecasting.

The model of time series forecasting is given as follows:

x_{t+1} = f(x_t, x_{t-1}, \ldots, x_{t-n+1}) \qquad (1)

where t = 1, 2, 3, …, T denotes time, n is the dimensionality of the input of the function f(x), x_t is the time series data, and x_{t+1} is the unknown future value as well as the output of the model.
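As a concrete reading of Eq. (1), the sketch below builds the lagged input vectors (x_t, x_{t-1}, …, x_{t-n+1}) and the corresponding targets x_{t+1} from a one-dimensional series; the function name, the toy sine series, and the lag length n = 4 are illustrative assumptions.

```python
import numpy as np

def make_supervised(series, n):
    """Turn a 1-D time series into (input, target) pairs following Eq. (1):
    each input is the window (x_t, x_{t-1}, ..., x_{t-n+1}) and the target
    is the next value x_{t+1}."""
    X, y = [], []
    for t in range(n - 1, len(series) - 1):
        X.append(series[t - n + 1:t + 1][::-1])  # most recent value first
        y.append(series[t + 1])
    return np.array(X), np.array(y)

series = np.sin(0.1 * np.arange(200))            # toy data (assumption)
X, y = make_supervised(series, n=4)
print(X.shape, y.shape)                          # (196, 4) (196,)
```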

2.2 RBM

A restricted Boltzmann machine (RBM) is a kind of probabilistic generative neural network composed of two layers of units: a visible layer and a hidden layer (see Figure 2).

Figure 2. The structure of RBM.

Units of different layers are connected to each other with weights w_{ij} = w_{ji}, where i = 1, 2, …, n and j = 1, 2, …, m index the units of the visible layer and the hidden layer, respectively. The outputs of the units v_i and h_j are binary, i.e., 0 or 1, except for the initial values of the visible units, which are given by the input data. The probabilities that a hidden unit and a visible unit take the value 1 are given by the following:

p(h_j = 1 \mid \mathbf{v}) = \frac{1}{1 + \exp\left(-b_j - \sum_{i=1}^{n} w_{ji} v_i\right)} \qquad (2)

p(v_i = 1 \mid \mathbf{h}) = \frac{1}{1 + \exp\left(-b_i - \sum_{j=1}^{m} w_{ij} h_j\right)} \qquad (3)
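The following numpy sketch implements Eqs. (2) and (3) directly, assuming a single weight matrix W (so that w_{ij} = w_{ji} is realized by using W in one direction and its transpose in the other) and bias vectors b_h and b_v; the layer sizes and the random initialization are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 6, 4                              # visible / hidden units (assumption)
W = 0.1 * rng.standard_normal((n, m))    # w_ij, shared in both directions
b_v = np.zeros(n)                        # visible biases b_i
b_h = np.zeros(m)                        # hidden biases b_j

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def p_h_given_v(v):
    """Eq. (2): probability that each hidden unit fires, given the visible layer."""
    return sigmoid(b_h + v @ W)

def p_v_given_h(h):
    """Eq. (3): probability that each visible unit fires, given the hidden layer."""
    return sigmoid(b_v + h @ W.T)

v0 = rng.integers(0, 2, size=n).astype(float)    # a binary visible vector
ph = p_h_given_v(v0)
h0 = (rng.random(m) < ph).astype(float)          # sample binary hidden states
print(ph, p_v_given_h(h0))
```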
