### **3.4 The CNN\_MULTH\_10 model**

This CNN model uses a dedicated CNN block for each of the five input attributes in the stock price data. In other words, a separate CNN performs feature extraction for each input variable. We call this a multivariate, multi-headed CNN model. Each sub-CNN model uses two convolutional layers, each with a feature space dimension of 32 and a filter (i.e., kernel) size of 3. The convolutional layers are followed by a max-pooling layer, which reduces the feature map size by a factor of 1/2. Following the computation rule discussed under the CNN\_MULTV\_10 model, the output of the max-pooling layer for each sub-CNN model has shape (3, 32). A flatten operation follows, converting the data into a one-dimensional array of size 96 for each input variable. A concatenation operation then merges the five arrays, each containing 96 values, into a single one-dimensional array of size 96 \* 5 = 480. The output of the concatenation operation is passed successively through two dense layers containing 200 nodes and 100 nodes, respectively. Finally, the output layer, having five nodes, yields the forecasted five values as the daily *open* stock prices for the coming week. The epoch number and the batch size used in training the model are 70 and 16, respectively. **Figure 4** shows the structure and data flow of the CNN\_MULTH\_10 model.
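
The following is a minimal sketch of this multi-headed architecture, assuming the Keras functional API; the ReLU activations are assumptions, as the text does not state the activation functions.

```python
# A minimal sketch of the CNN_MULTH_10 architecture (assumed Keras
# functional API; ReLU activations are assumptions, not from the text).
from tensorflow.keras.layers import (Input, Conv1D, MaxPooling1D,
                                     Flatten, Concatenate, Dense)
from tensorflow.keras.models import Model

inputs, heads = [], []
for _ in range(5):                              # one head per input attribute
    inp = Input(shape=(10, 1))                  # ten days of one variable
    x = Conv1D(32, 3, activation='relu')(inp)   # (10, 1) -> (8, 32)
    x = Conv1D(32, 3, activation='relu')(x)     # (8, 32) -> (6, 32)
    x = MaxPooling1D(pool_size=2)(x)            # (6, 32) -> (3, 32)
    x = Flatten()(x)                            # -> 96 values
    inputs.append(inp)
    heads.append(x)

merged = Concatenate()(heads)                   # 5 * 96 = 480 values
x = Dense(200, activation='relu')(merged)
x = Dense(100, activation='relu')(x)
out = Dense(5)(x)                               # five daily open forecasts

model = Model(inputs=inputs, outputs=out)
model.compile(optimizer='adam', loss='mse')
model.summary()                                 # counts should match Table 4
```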

**Table 4** presents the necessary calculations for finding the number of parameters in the CNN\_MULTH\_10 model. Each of the five convolutional layers, *conv1d\_1*, *conv1d\_3*, *conv1d\_5*, *conv1d\_7*, and *conv1d\_9*, involves 128 parameters. For each of these layers, *k* = 3, *d* = 1, and *f* = 32, and hence the number of trainable parameters is (3 \* 1 + 1) \* 32 = 128. Hence, for these five convolutional layers, the total number of parameters is 128 \* 5 = 640.

### **Figure 4.**

*The schematic architecture of the model CNN\_MULTH\_10.*


### **Table 4.**

*The number of parameters in the CNN\_MULTH\_10 model.*


Next, each of the five convolutional layers *conv1d\_2*, *conv1d\_4*, *conv1d\_6*, *conv1d\_8*, and *conv1d\_10* involves 3104 parameters. Each layer of this group has *k* = 3, *d* = 32, and *f* = 32; hence the number of trainable parameters for each layer is (3 \* 32 + 1) \* 32 = 3104. Therefore, for these five convolutional layers, the total number of parameters is 3104 \* 5 = 15,520. Using (2), the dense layers *dense\_1*, *dense\_2*, and *dense\_3* involve 96,200, 20,100, and 505 parameters, respectively. Hence, the model includes 132,965 parameters in total.
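
As a quick sanity check, these counts can be reproduced from the Conv1D formula (*k* \* *d* + 1) \* *f* and the dense-layer formula in (2):

```python
# Reproducing the parameter counts in Table 4 from the formulas in the text.
conv_first  = (3 * 1 + 1) * 32          # 128 per first conv layer in a head
conv_second = (3 * 32 + 1) * 32         # 3104 per second conv layer in a head
dense = [(480 + 1) * 200,               # dense_1: 96,200
         (200 + 1) * 100,               # dense_2: 20,100
         (100 + 1) * 5]                 # dense_3: 505
print(5 * conv_first + 5 * conv_second + sum(dense))   # 132965
```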

### **3.5 The LSTM\_UNIV\_5 model**

This model is based on a univariate input: the *open* values of the last week's stock price records. The model predicts the next five values in sequence as the daily *open* index for the coming week. The input has a shape of (5, 1), indicating that the previous week's five daily *open* index values are passed as the input. An LSTM block having 200 nodes receives the data from the input layer; the number of nodes in the LSTM layer is determined using *grid-search*. The output of the LSTM block is passed on to a fully connected layer (also known as a dense layer) of 100 nodes. Finally, the output layer containing five nodes receives the output of the dense layer and produces the five future values of *open* for the coming week. In training the model, 20 epochs and a batch size of 16 are used. **Figure 5** presents the structure and data flow of the model.
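
A minimal sketch following the prose description (an LSTM block of 200 nodes, a dense layer of 100 nodes, and a five-node output layer) is given below; the activation choices are assumptions, and **Table 5** may list further dense layers not described in the prose.

```python
# A minimal sketch of LSTM_UNIV_5 (activations are assumptions).
from tensorflow.keras.layers import Input, LSTM, Dense
from tensorflow.keras.models import Model

inp = Input(shape=(5, 1))            # last week's five daily open values
x = LSTM(200)(inp)                   # 200 nodes, chosen via grid search
x = Dense(100, activation='relu')(x)
out = Dense(5)(x)                    # five forecasted open values

model = Model(inp, out)
model.compile(optimizer='adam', loss='mse')
# model.fit(X_train, y_train, epochs=20, batch_size=16)  # per the text
```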

As we did in the case of the CNN models, we now compute the number of parameters involved in the LSTM model. The input layers do not have any parameters, as their role is just to receive and forward the data. An LSTM network has four gates, each having the same number of parameters. These four gates are known as the (i) *forget gate*, (ii) *input gate*, (iii) *input modulation gate*, and (iv) *output gate*. The number of parameters (*n1*) in each of the gates of an LSTM network is computed using (3), where *x* denotes the number of LSTM units, and *y* is the input dimension (i.e., the number of features in the input data).

$$n\_1 = (x + y) \* x + x \tag{3}$$

### **Figure 5.**

*The schematic architecture of the model LSTM\_UNIV\_5.*

Hence, the total number of parameters in an LSTM layer will be given by 4 \* *n1*. The number of parameters (*n2*) in a dense layer of an LSTM network is computed using (4), where *pprev* and *pcurr* are the number of nodes in the previous layer and the current layer, respectively. The bias parameter of each node in the current layer is represented by the last term on the right-hand side of (4).

$$n\_2 = \left(p\_{prev} \* p\_{curr} + p\_{curr}\right) \tag{4}$$

The computation of the number of parameters associated with the model LSTM\_UNIV\_5 is depicted in **Table 5**. In **Table 5**, the number of parameters in the LSTM layer is computed as follows: 4 \* [(200 + 1) \* 200 + 200] = 161,600. The number of parameters in the dense layer *dense\_4* is computed as (200 \* 100 + 100) = 20,100. Similarly, the parameters in the dense layers *dense\_5* and *dense\_6* are computed. The total number of parameters in the LSTM\_UNIV\_5 model is found to be 182,235.
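
Expressed in code, (3) and (4) reproduce the LSTM-layer and *dense\_4* counts quoted above; a minimal helper sketch:

```python
# Parameter-count helpers following (3) and (4) from the text.
def lstm_params(x, y):
    """x = number of LSTM units, y = input feature dimension."""
    return 4 * ((x + y) * x + x)      # four gates, each with n1 parameters

def dense_params(p_prev, p_curr):
    return p_prev * p_curr + p_curr   # weights plus one bias per node

print(lstm_params(200, 1))            # 161600 (the LSTM layer)
print(dense_params(200, 100))         # 20100 (dense_4)
```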

### **3.6 The LSTM\_UNIV\_10 model**

This univariate model uses the last two weeks' *open* index values as input and yields the daily forecasted *open* values for the coming week. The values of the parameters and hyperparameters are the same as in the LSTM\_UNIV\_5 model; only the input data shape differs, being (10, 1). **Figure 6** presents the architecture of this model.

**Table 6** presents the computation of the number of parameters involved in the model LSTM\_UNIV\_10. Since the number of parameters in an LSTM layer depends only on the number of features in the input data and the node count in the LSTM layer, and not on the number of time steps in the input sequence, the model LSTM\_UNIV\_10 has an identical number of parameters in its LSTM layer as the model LSTM\_UNIV\_5. Since both models have the same number of dense layers with the same architecture, the total number of parameters for the two models is the same.

### **3.7 The LSTM\_UNIV\_ED\_10 model**

This LSTM model has an encoding and decoding capability and is based on the input of the *open* values of the stock price records of the last two weeks. The model consists of two LSTM blocks: one performs the encoding operation, while the other does the decoding. The encoder LSTM block consists of 200 nodes (determined using the *grid-search* procedure).


### **Table 5.**

*The number of parameters in the LSTM\_UNIV\_5 model.*


### **Figure 6.**

*The schematic architecture of the model LSTM\_UNIV\_10.*


### **Table 6.**

*The number of parameters in the model LSTM\_UNIV\_10.*

The input data shape to the encoder LSTM is (10, 1). The encoding layer yields a one-dimensional vector of size 200, each value corresponding to the feature extracted by a node in the LSTM layer from the ten input values received from the input layer. The input data features are extracted once for each time step of the output sequence (there are five time steps, one for each of the five forecasted *open* values). Hence, the output of the repeat vector layer has shape (5, 200), signifying that, in total, 200 features are extracted from the input for each of the five time steps of the model's output (i.e., forecasted) sequence. The second LSTM block decodes the encoded features using 200 nodes.

The decoded result is passed on to a dense layer. The dense layer learns from the decoded values and predicts the five future values of the target variable (i.e., *open*) for the coming week through five nodes in the output layer. However, the forecasted values are not produced in a single time step; the forecasts for the five days are made in five rounds. The round-wise forecasting is done using the *TimeDistributed* wrapper, which synchronizes the decoder LSTM block, the fully connected block, and the output layer in every round. The number of epochs and the batch size used in training the model are 70 and 16, respectively. **Figure 7** presents the structure and the data flow of the LSTM\_UNIV\_ED\_10 model.
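
Under the same assumptions as the earlier sketches, a minimal version of the encoder-decoder model is given below; the single-node per-step output layer is an inference from the parameter counts in **Table 7**, not stated in the prose.

```python
# A minimal sketch of LSTM_UNIV_ED_10 (layer names and activations are
# assumptions; the per-step Dense(1) output is inferred from Table 7).
from tensorflow.keras.layers import (Input, LSTM, RepeatVector,
                                     TimeDistributed, Dense)
from tensorflow.keras.models import Model

inp = Input(shape=(10, 1))               # two weeks of daily open values
x = LSTM(200)(inp)                       # encoder: 200-value feature vector
x = RepeatVector(5)(x)                   # one copy per output time step: (5, 200)
x = LSTM(200, return_sequences=True)(x)  # decoder: one output per time step
x = TimeDistributed(Dense(100, activation='relu'))(x)
out = TimeDistributed(Dense(1))(x)       # one forecasted open value per step

model = Model(inp, out)
model.compile(optimizer='adam', loss='mse')
model.summary()                          # counts should match Table 7
```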

### **Figure 7.**

*The schematic architecture of the model LSTM\_UNIV\_ED\_10.*

The computation of the number of parameters in the LSTM\_UNIV\_ED\_10 model is shown in **Table 7**. The input layer and the repeat vector layer do not involve any learning, and hence these layers have no parameters. On the other hand, the two LSTM layers, *lstm\_3* and *lstm\_4*, and the two dense layers, *time\_distributed\_3* and *time\_distributed\_4*, involve learning. The number of parameters in the *lstm\_3* layer is computed as 4 \* [(200 + 1) \* 200 + 200] = 161,600. The number of parameters in the *lstm\_4* layer is computed as 4 \* [(200 + 200) \* 200 + 200] = 320,800, since the decoder receives the 200 features produced by the encoder as its input. The computations of the dense layers' parameters are identical to those in the models discussed earlier. The total number of parameters in this model turns out to be 502,601.
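
Plugging these values into (3) and (4) reproduces the total. The 20,100 count for *time\_distributed\_3* assumes the same 200-to-100 dense layer as in the earlier models, and the 101 count for *time\_distributed\_4* (a per-step dense layer with a single output node) follows from the stated total:

```python
# Reproducing the Table 7 totals from (3) and (4).
lstm_3 = 4 * ((200 + 1) * 200 + 200)    # 161,600 (encoder; 1 input feature)
lstm_4 = 4 * ((200 + 200) * 200 + 200)  # 320,800 (decoder; 200 input features)
td_3   = 200 * 100 + 100                # 20,100 (assumed 200-to-100 dense)
td_4   = 100 * 1 + 1                    # 101 (inferred single output node)
print(lstm_3 + lstm_4 + td_3 + td_4)    # 502601
```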
