*Deep Learning for Subtyping and Prediction of Diseases: Long-Short Term Memory DOI: http://dx.doi.org/10.5772/intechopen.96180*

inherent to time lags. In other words, with *peephole connections* the information conveyed by the time intervals between sub-patterns of sequences is included in the recurrent network. Thus, *peephole connections* concatenate the previous cell state (*C<sub>t-1</sub>*) to the forget, input, and output gates. That is, the expressions of these gates with peephole connections become:

$$
\begin{aligned}
f_t &= \sigma\left(W_f\,[p_t, h_{t-1}, C_{t-1}] + b_f\right)\\
i_t &= \sigma\left(W_i\,[p_t, h_{t-1}, C_{t-1}] + b_i\right)\\
o_t &= \sigma\left(W_o\,[p_t, h_{t-1}, C_{t-1}] + b_o\right)
\end{aligned}
\qquad (22)
$$

This configuration was offered to improve the ability of LSTMs to count and measure time distances between rare events [21].

*4.3.2 Gated recurrent units*

The gated recurrent unit (GRU) is a simplified version of the standard LSTM, designed to have more persistent memory and thereby make long-range dependencies easier for RNNs to capture. The GRU, a gated RNN, was introduced by [22]. In a GRU, the forget and input gates are merged into a single gate named the "update gate". Moreover, the cell state and hidden state are also merged. Therefore, the GRU has fewer parameters than the LSTM and has been shown to outperform it on some tasks that require capturing long-term dependencies.

*4.3.3 Multiplicative LSTMs (mLSTMs)*

This configuration of LSTM was introduced by Krause et al. [23]. The architecture, intended for sequence modeling, combines the LSTM and multiplicative RNN architectures. The mLSTM is characterized by its ability to have different recurrent transition functions for each possible input, which makes it more expressive for autoregressive density estimation. Krause et al. concluded that the mLSTM outperforms the standard LSTM and its deep variants on a range of character-level language modeling tasks.

*4.3.4 LSTM with attention*

The core idea behind the LSTM with attention is to free the encoder-decoder architecture from a fixed-length internal representation. This is one of the most transformative innovations in sequence modeling, since it allows the network to draw on the information in all hidden nodes of the LSTM. The LSTM with attention was introduced by Wu et al. [24] in Google's Neural Machine Translation system, for bridging the gap between human and machine translation. It consists of a deep LSTM network with 8 encoder and 8 decoder layers, using attention and residual connections. Most likely, this type of LSTM continues to power Google Translate to this day.

**4.4 Examples**

Two examples using MATLAB for LSTM will be given in this chapter.

*Example* **1**: This example shows how to forecast time series data for COVID-19 in the USA using a long short-term memory (LSTM) network. The variable used in the training data is the rate (number of positive tests/number of tests) for each day between 01/22/2020 and 12/22/2020. The data set was taken from the publicly available website https://covidtracking.com/data/national; the data are updated each day between about 6 pm and 7:30 pm Eastern Time. The initiative relies upon publicly available data from multiple sources. States in the USA are not consistent in how and when they release and update their data, and some may even retroactively change the numbers they report. This can affect the predictions presented in these data visualizations (**Figure 11a-d**). The steps for Example 1 are summarized in **Table 1** (MATLAB 2020b) and the results are illustrated in **Figure 11a-d**. The LSTM network was trained on the first 90% of the sequence and tested on the last 10%; the results therefore show predictions of positive cases for the last 38 days.

This example trains an LSTM network to forecast the number of positive tests given the number of cases in previous days. The training data contain a single time series, with time steps corresponding to days and values corresponding to the number of cases. To make predictions on a new sequence, reset the network state using the "*resetState*" command in MATLAB. Resetting the network state prevents previous predictions from affecting the predictions on the new data. Reset the network state, and then initialize it by predicting on the training data (MATLAB, 2020b). The solid red line in **Figure 11a** and **c** indicates the number of cases predicted for the last 38 days.

#### **Figure 11.**

*Total daily number of positive COVID-19 tests and the rate (positive tests/number of tests) in the USA. (a) Plot of the training time series of the number of positive COVID-19 tests with the forecasted values; (b) comparison of the forecasted number of positive tests with the test data set. This graph shows the total daily number of virus tests conducted in each state and, of those tests, how many were positive each day. (c) Plot of the training time series of the rate of positive COVID-19 tests; (d) comparison of the forecasted rates with the rates in the test data set. The blue trend line shows the actual number of positive cases and the red trend line shows the number predicted for the last 38 days.*
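As an illustrative sketch (not the chapter's MATLAB code), the forecasting workflow of Example 1 — a chronological 90%/10% split, standardization with training statistics, and closed-loop prediction where each forecast is fed back as the next input — can be written in Python with NumPy. The simple one-step autoregressive predictor below is an assumed stand-in for the trained LSTM, and all variable names are illustrative:

```python
import numpy as np

def forecast_demo(series, train_frac=0.9):
    """Chronological split plus standardized closed-loop forecasting.

    `series` is a 1-D array of daily counts. The AR(1)-style model below
    is a placeholder standing in for the trained LSTM of Example 1.
    """
    n = len(series)
    n_train = int(np.floor(train_frac * n))        # first 90% for training
    train, test = series[:n_train], series[n_train:]

    # Standardize using training statistics only, as the MATLAB example does.
    mu, sigma = train.mean(), train.std()
    train_std = (train - mu) / sigma

    # Fit a one-step linear autoregression: x_t ~ a * x_{t-1} + b.
    a, b = np.polyfit(train_std[:-1], train_std[1:], 1)

    # Closed-loop forecasting: each prediction becomes the next input,
    # mirroring predictAndUpdateState in the MATLAB workflow.
    preds = []
    x = train_std[-1]
    for _ in range(len(test)):
        x = a * x + b
        preds.append(x)

    return np.array(preds) * sigma + mu            # undo standardization

rng = np.random.default_rng(0)
daily = np.cumsum(rng.poisson(5, 365)).astype(float)  # synthetic "cases"
forecast = forecast_demo(daily)
```

With 365 days of data, the held-out forecast covers the last 37 days, analogous to the 38-day horizon in Figure 11.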


#### **Table 1.**

*MATLAB codes and specification of codes for Example 1\*.*

*Example* **2**: Data for Example 2 come from SEER 2017 for different age groups. SEER collects cancer incidence data from population-based cancer registries covering the U.S. population. The SEER registries collect data on patient demographics, primary tumor site, tumor morphology, stage at diagnosis, and the first course of treatment, and they follow up with patients for vital status. The example given in this chapter uses the cancer type and non-cancer causes of death identified from the survey (https://seer.cancer.gov/data/). The steps for Example 2 are summarized in **Table 2** (MATLAB 2020b) and, because of space limitations, only the cloud of cancer from

#### **Table 2.**

*Example 2. MATLAB codes and descriptions of the codes.*

**a. Data preparation**

To import the text data as strings, specify the text type to be 'string'.

```matlab
filename = "filename"
data = readtable(filename,'TextType','string');
```

**b. Partition data**

Partition the data into a training partition and a held-out partition for validation and testing. Specify the holdout percentage to be 30%.

```matlab
crossval = cvpartition(DataName.ClassVariableName,'HoldOut',0.30);
dataTrain = DataName(training(crossval),:);
dataValidation = DataName(test(crossval),:);
```

**c. Extract the text data**

Extract the text data and labels from the partitioned tables. Here ClassVariableName is the column name for the group variable (for example, age group) and TextVariableName is the column name for the text variable (in this example, the name of the cancer type).

```matlab
textDataTrain = dataTrain.TextVariableName;
textDataValidation = dataValidation.TextVariableName;
YTrain = dataTrain.ClassVariableName;
YValidation = dataValidation.ClassVariableName;
```

**d. Create a word cloud for the text**

To check that you have imported the data correctly, visualize the training text data using a word cloud.

```matlab
figure
wordcloud(textDataTrain);
```

**e. Preprocess text data**

Preprocess the training data and the validation data using the preprocessText function.

```matlab
documentsTrain = preprocessText(textDataTrain);
documentsValidation = preprocessText(textDataValidation);
```

**f. Convert documents to sequences**

Use a word encoding to convert the documents into sequences of numeric indices.

```matlab
enc = wordEncoding(documentsTrain);
sequenceLength = 10;
XTrain = doc2sequence(enc,documentsTrain,'Length',sequenceLength);
XTrain(1:5)
XValidation = doc2sequence(enc,documentsValidation,'Length',sequenceLength);
```

**g. Create and train LSTM**

Initialize the embedding weights and define the LSTM network architecture.

```matlab
inputSize = 1;
embeddingDimension = 30;
numHiddenUnits = 200;
numWords = enc.NumWords;
numClasses = numel(categories(YTrain));
layers = [ ...
    sequenceInputLayer(inputSize)
    wordEmbeddingLayer(embeddingDimension,numWords)
    lstmLayer(numHiddenUnits,'OutputMode','last')
    fullyConnectedLayer(numClasses)
    softmaxLayer
    classificationLayer]
```

**h. Specify training options**

MiniBatchSize: classifies data using mini-batches of size 30.

```matlab
options = trainingOptions('adam', ...
    'MiniBatchSize',30, ...
```
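The encoding step of Example 2 (a word encoding followed by conversion to fixed-length sequences of length 10, as doc2sequence performs in Table 2) can be sketched in plain Python. The tokenization and left-padding choices here are illustrative assumptions, not MATLAB's exact behavior, and the sample documents are made up:

```python
def word_encoding(documents):
    """Map each unique word to a 1-based integer index, in order of first
    appearance -- a stand-in for MATLAB's wordEncoding."""
    vocab = {}
    for doc in documents:
        for word in doc.lower().split():
            vocab.setdefault(word, len(vocab) + 1)
    return vocab

def doc2sequence(vocab, documents, length=10, pad_value=0):
    """Convert documents to fixed-length integer sequences, truncating long
    documents and left-padding short ones (an illustrative convention)."""
    sequences = []
    for doc in documents:
        idx = [vocab[w] for w in doc.lower().split() if w in vocab]
        idx = idx[-length:]                                  # truncate
        sequences.append([pad_value] * (length - len(idx)) + idx)
    return sequences

# Hypothetical documents standing in for the SEER cancer-type text column.
docs = ["lung cancer", "breast cancer", "non cancer cause of death"]
enc = word_encoding(docs)
XTrain = doc2sequence(enc, docs, length=10)
```

Each resulting row has exactly `length` entries, so the sequences can be fed to a sequence-input layer in fixed-size mini-batches.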
