**2. Theoretical and mathematical background**

#### **2.1 Mathematical foundation**

In signal processing [17], several mathematical tools, such as the sampling frequency, Nyquist filtering, the Fourier series and Fourier transform, the *Z*-transform, and the pole-zero plot, are used for processing signals.

The reduction of a continuous-time signal to a discrete-time signal is known as sampling, and the sampling frequency is the number of samples per second collected from a continuous signal to create a discrete (digital) signal. The sampling process has many applications: it is used in music recording to ensure sound quality, in converting analog data to discrete data, and in speech recognition systems, radar and radio navigation, sensor data evaluation, modulation and demodulation, and pattern recognition systems.

#### *2.1.1 Sampling frequency or sampling rate*

The sampling frequency [18] or sampling rate $f_s$ is defined as the average number of samples acquired in 1 second, that is, $f_s = 1/T$, where $T$ is the sampling period; it is measured in samples per second or hertz. The sampling theorem gives the lowest sampling frequency at which a continuous-time signal must be uniformly sampled so that the original signal can be fully recovered (reconstructed) from those samples alone.

If a continuous-time signal contains no frequency components above $W$ Hz (where $W$ is called the bandwidth), then it is completely determined by uniform samples taken at a rate of $f_s$ samples per second, provided that $f_s \geq 2W$ or, in terms of the sampling period, $T \leq \frac{1}{2W}$ [19]. Here $2W$ is termed the Nyquist rate.
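As a minimal NumPy sketch of the sampling theorem (the 5 Hz test signal and the two sampling rates below are our own illustrative choices), sampling above the Nyquist rate preserves the oscillation, while sampling at the signal frequency itself aliases it down to an apparently constant signal:

```python
import numpy as np

W = 5.0               # signal bandwidth in Hz (a single 5 Hz sine)
nyquist_rate = 2 * W  # minimum sampling rate for perfect reconstruction

def sample_sine(f_signal, f_s, duration=1.0):
    """Return sample times and values of sin(2*pi*f_signal*t) at rate f_s."""
    t = np.arange(0.0, duration, 1.0 / f_s)
    return t, np.sin(2 * np.pi * f_signal * t)

# f_s = 50 Hz > 2W: the samples capture the oscillation.
t_ok, x_ok = sample_sine(W, 50.0)

# f_s = 5 Hz < 2W: aliasing; every sample lands at the same phase,
# so the sampled signal is indistinguishable from a constant.
t_bad, x_bad = sample_sine(W, 5.0)

print(np.ptp(x_ok) > 1.0)            # True: well-sampled signal swings over its full range
print(np.allclose(x_bad, x_bad[0]))  # True: undersampled signal looks like DC
```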

#### *2.1.2 Nyquist filter*

A Nyquist filter is an electrical filter used, for example, in television receivers to equalize the low- and high-frequency components of the video-frequency (VF) signal. More generally, Nyquist filters play an essential role in generating bandlimited pulses in wired and wireless communication systems while ensuring minimal intersymbol interference; their principal application is as pulse-shaping filters. Nyquist filters are a form of multirate finite impulse response (FIR) filter, also known as *M*th-band filters.

The impulse response $h(n)$ of a Nyquist filter satisfies the following equation:

$$h(Mn+k) = \begin{cases} c & n = 0 \\ 0 & \text{otherwise} \end{cases} \tag{1}$$

where, *c* and *k* are constants.

The *z*-transform $H(z)$ of a Nyquist filter satisfies the following equation:

$$\sum_{k=0}^{M-1} H(zW^k) = Mc = 1 \tag{2}$$

where $W = e^{-j\frac{2\pi}{M}}$ and $c = \frac{1}{M}$.

Because the frequency response of $H(zW^k)$ is a shifted version of the frequency response of $H(z)$, the frequency responses of all $M$ uniformly shifted versions of $H(z)$ add up to a constant.
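Properties (1) and (2) can be checked numerically; the sketch below is our own illustration for $M = 2$ (a half-band filter), built from a windowed ideal low-pass impulse response, with the filter length and Hamming window chosen arbitrarily:

```python
import numpy as np

M = 2
N = 31                           # odd length, center tap at index 15
n = np.arange(N) - (N - 1) // 2  # symmetric index ..., -2, -1, 0, 1, 2, ...

# Ideal half-band low-pass: h[n] = sin(pi*n/2)/(pi*n) with h[0] = 1/2,
# tapered by a Hamming window to make a practical FIR filter.
h = np.sinc(n / M) / M * np.hamming(N)

# Property (1): every M-th coefficient is zero except the center tap c = 1/M.
print(np.allclose(h[n % M == 0], np.where(n[n % M == 0] == 0, 1 / M, 0.0)))

# Property (2): the M uniformly shifted frequency responses sum to M*c = 1.
w = np.linspace(0, 2 * np.pi, 256, endpoint=False)
H = lambda om: np.sum(h * np.exp(-1j * np.outer(om, n)), axis=1)
total = sum(H(w - 2 * np.pi * k / M) for k in range(M))
print(np.allclose(total, 1.0))
```

Both checks print `True`: the window multiplies the exact zeros at even taps by finite values, so the *M*th-band structure survives windowing.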

#### *2.1.3 Fourier series and Fourier transform*

The Fourier series represents a periodic function as a weighted sum of harmonically related sinusoids. It is an infinite series, composed of sines and cosines, that can be used to solve several forms of differential equations, and it is valuable for evaluating periodic functions precisely because it is itself periodic. The Fourier series is widely utilized in telecommunication systems, for example in the modulation and demodulation of voice signals.

The Fourier transform is a technique for converting time-domain signals into frequency-domain signals. In image processing, it decomposes an image into its sine and cosine components: the input image is the spatial-domain representation, and the output of the transformation is the image in the Fourier (frequency) domain. The Fourier transform is utilized in electrical circuit design, solving differential equations, signal processing and analysis, image processing, and filtering, among other applications.

The Fourier transform is a mathematical approach for converting a function of time, $x(t)$, into a function of frequency, $X(\omega)$. It has a lot in common with the Fourier series: the Fourier transform of a function can be obtained as a limiting case of the Fourier series as the period $T \to \infty$.

A periodic signal $x(t)$ can be represented by the Fourier series synthesis equation:

$$x(t) = \sum_{n=-\infty}^{\infty} c_n e^{jn\omega_0 t} \tag{3}$$

where $c_n$ is given by the Fourier series analysis equation:

$$c_n = \frac{1}{T} \int_{T} x(t) e^{-jn\omega_0 t} \, dt \tag{4}$$

For a discrete-time sequence $x[n]$, the Fourier transform takes the form:

$$X(e^{j\omega}) = \sum_{n=-\infty}^{\infty} x[n] e^{-j\omega n} \tag{5}$$

As $T \to \infty$, the fundamental frequency $\omega_0 = \frac{2\pi}{T}$ decreases dramatically, and the quantity $n\omega_0$ becomes a continuous quantity that may take on any value (because $n$ ranges from $-\infty$ to $\infty$). We therefore introduce the continuous variable $\omega = n\omega_0$ and set $X(\omega) = Tc_n$. Substituting these values into the previous equation yields the analysis equation of the Fourier transform, also called the forward Fourier transform.

The analysis equation of forward Fourier transform is:

$$X(\omega) = \int_{-\infty}^{+\infty} x(t) e^{-j\omega t} \, dt \tag{6}$$

On the other hand, the synthesis equation of inverse Fourier transform is:

$$x(t) = \frac{1}{2\pi} \int_{-\infty}^{+\infty} X(\omega) e^{j\omega t} \, d\omega \tag{7}$$
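Equations (6) and (7) can be checked numerically; the sketch below is our own illustration using the Gaussian $x(t) = e^{-t^2}$, whose transform has the closed form $X(\omega) = \sqrt{\pi}\,e^{-\omega^2/4}$, with the integrals approximated by Riemann sums over truncated intervals:

```python
import numpy as np

t = np.linspace(-10, 10, 4001)
dt = t[1] - t[0]
x = np.exp(-t ** 2)

# Forward transform, eq. (6): X(w) = integral of x(t) e^{-jwt} dt,
# approximated by a Riemann sum (the integrand vanishes at the endpoints).
w = np.linspace(-5, 5, 101)
X = np.array([np.sum(x * np.exp(-1j * wi * t)) * dt for wi in w])
print(np.allclose(X.real, np.sqrt(np.pi) * np.exp(-w ** 2 / 4), atol=1e-6))

# Inverse transform, eq. (7): x(t) = (1/(2*pi)) integral of X(w) e^{jwt} dw,
# recovering the original Gaussian at a few sample instants.
w_full = np.linspace(-30, 30, 6001)
dw = w_full[1] - w_full[0]
X_full = np.sqrt(np.pi) * np.exp(-w_full ** 2 / 4)
t_chk = t[::400]
x_rec = np.array([np.sum(X_full * np.exp(1j * w_full * ti)) * dw
                  for ti in t_chk]) / (2 * np.pi)
print(np.allclose(x_rec.real, np.exp(-t_chk ** 2), atol=1e-6))
```

Both checks print `True`, confirming that the forward and inverse transforms are consistent with each other.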

#### *2.1.4 Z-transform*

The *Z*-transform is a useful mathematical tool for converting difference equations into algebraic equations. It is used to convert a discrete-time-domain signal into a discrete frequency-domain representation. It has a broad range of statistical and digital signal processing applications and is mostly used to process and evaluate digital data.

The bilateral *z*-transform of a discrete-time signal *x*(*n*) is given as:

*Deep Learning Algorithms for Efficient Analysis of ECG Signals to Detect Heart Disorders DOI: http://dx.doi.org/10.5772/intechopen.103075*

$$Z.T[x(n)] = X(z) = \sum_{n=-\infty}^{\infty} x(n) z^{-n} \tag{8}$$

The unilateral *z*-transform of a discrete-time signal *x*(*n*) is represented by the following equation:

$$Z.T[x(n)] = X(z) = \sum_{n=0}^{\infty} x(n) z^{-n} \tag{9}$$

The Fourier transform and the *z*-transform are closely related. If we substitute $z = e^{j\omega}$, the *z*-transform becomes the Fourier transform; in other words, the Fourier transform is simply $X(z)$ evaluated on the unit circle $|z| = 1$, where $z = e^{j\omega}$. More generally, expressing $z$ in polar form gives $z = re^{j\omega}$.

A system's Fourier transform and *z*-transform can be written as:

$$H(\omega) = \sum_{k=0}^{M} b_k e^{-j\omega k} \tag{10}$$

$$H(z) = \sum_{k=0}^{M} b_k z^{-k} \tag{11}$$

$$H(\omega) = H(e^{j\omega}) = H(z)\big|_{z=e^{j\omega}} \tag{12}$$
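Equation (12) can be verified directly; the sketch below is our own illustration with arbitrary FIR coefficients $b_k$, comparing the Fourier transform of equation (10) against the *z*-transform polynomial of equation (11) evaluated on the unit circle:

```python
import numpy as np

b = np.array([0.5, 1.0, 0.5])  # illustrative coefficients b_0, b_1, b_2
w = np.linspace(0, np.pi, 64)

# Eq. (10): H(w) = sum_k b_k e^{-jwk}
H_fourier = np.array([np.sum(b * np.exp(-1j * wi * np.arange(len(b)))) for wi in w])

# Eq. (11) evaluated at z = e^{jw}: H(z) = sum_k b_k z^{-k}
z = np.exp(1j * w)
H_z = np.array([np.sum(b * zi ** -np.arange(len(b))) for zi in z])

print(np.allclose(H_fourier, H_z))  # True: eq. (12) holds
```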

#### *2.1.5 Pole-zero plot*

The pole-zero plot is a valuable tool for relating a system's frequency-domain and *z*-domain representations. It is a graphical depiction of a rational transfer function in the complex plane that helps communicate system attributes.

The underlying rational transfer function can be expressed as:

$$H(z) = \frac{B(z)}{A(z)} = \frac{\sum_{k=0}^{M} b_k z^{-k}}{1 + \sum_{k=1}^{N} a_k z^{-k}} \tag{13}$$

where the numerator and denominator are both polynomials in $z^{-1}$. The zeros of $H(z)$ are the values of $z$ for which $H(z) = 0$, while the poles of $H(z)$ are the values of $z$ for which $H(z)$ is infinite. $M$ and $N$ are the orders of the numerator and denominator polynomials, respectively, and $b_k$ and $a_k$ are the $k$th coefficients of the numerator and denominator polynomials.
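The poles and zeros of equation (13) can be computed as polynomial roots; the second-order transfer function below is our own illustrative example, not one from the text:

```python
import numpy as np

# Illustrative H(z) = (1 - z^{-2}) / (1 - 0.9 z^{-1} + 0.81 z^{-2})
b = [1.0, 0.0, -1.0]   # numerator coefficients b_0, b_1, b_2
a = [1.0, -0.9, 0.81]  # denominator coefficients 1, a_1, a_2

zeros = np.roots(b)    # values of z where H(z) = 0
poles = np.roots(a)    # values of z where H(z) blows up

print(np.allclose(np.sort(zeros.real), [-1.0, 1.0]))  # zeros at z = +/-1
print(np.allclose(np.abs(poles), 0.9))                # complex pole pair of radius 0.9
```

Since all poles lie inside the unit circle (radius 0.9 < 1), this illustrative system is stable, which is exactly the kind of attribute a pole-zero plot makes visible.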

#### **2.2 ECG signal**

The electrocardiogram (ECG) signal is a representation of the electrical impulses of the heart as seen from strategic points on the human body; it can be visually depicted as a quasi-periodic voltage signal. A standard surface ECG refers to a 12-lead ECG recorded while the patient is lying down: electrodes (sticky patches) are placed on the body surface, typically over the chest and limbs, and the electrode wires are linked to a 12-lead ECG machine that records data from 12 distinct locations on the body's surface. The aggregate amplitude of the heart's electrical potential is then monitored and recorded over a period of time from those distinct angles ("leads").

The graphical representation of the heart's electrical activity is formed by analyzing numerous electrodes, as in **Figure 1(a)**. There are three types of leads: limb, augmented limb, and precordial (chest). Three limb leads and three augmented limb leads are arranged in the coronal plane like the spokes of a wheel, and six precordial or chest leads are arranged in the perpendicular transverse plane. In three-dimensional space, each of the 12 ECG leads represents a distinct direction of cardiac activation. The conventional ECG leads are denoted lead I, II, III, aVR, aVL, aVF, V1, V2, V3, V4, V5, and V6. The limb leads are I, II, III, aVR, aVL, and aVF, whereas the precordial leads are V1, V2, V3, V4, V5, and V6.

**Figure 2.** *PQRST waveform [22].*

The 12-lead ECG typically uses 10 electrodes attached to the body, each monitoring a distinct electrical potential difference. The 10 electrodes are RA, RL, LA, LL, V1, V2, V3, V4, V5, and V6, each with a different placement, as shown in **Figure 1(b)**. RA is placed on the right arm and, similarly, LA on the left arm. RL is placed at the lower end of the inner portion of the calf muscle on the right leg, and LL in the same standard position on the left leg. V1 is placed in the fourth intercostal space (between ribs 4 and 5) immediately right of the sternum, and V2 in the fourth intercostal space immediately left of the sternum. V3 is placed between leads V2 and V4, where V4 lies in the fifth intercostal space (between ribs 5 and 6) in the midclavicular line. V5 and V6 are placed in the left anterior axillary line and the midaxillary line, respectively. The leads derived from the limb electrodes are the limb leads I, II, and III. Lead I refers to the voltage difference between LA and RA, that is, Lead I = LA − RA. Similarly, Lead II denotes the voltage difference between LL and RA, that is, Lead II = LL − RA, and Lead III denotes the voltage difference between LL and LA, that is, Lead III = LL − LA.
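The limb-lead arithmetic above can be sketched in a few lines; the instantaneous electrode potentials below are hypothetical values for illustration only:

```python
# Hypothetical instantaneous electrode potentials, in millivolts.
LA, RA, LL = 0.3, -0.2, 0.7  # left arm, right arm, left leg

lead_I = LA - RA    # Lead I   = LA - RA
lead_II = LL - RA   # Lead II  = LL - RA
lead_III = LL - LA  # Lead III = LL - LA

# Einthoven's law: Lead I + Lead III = Lead II holds by construction,
# since (LA - RA) + (LL - LA) = LL - RA.
print(abs((lead_I + lead_III) - lead_II) < 1e-12)  # True
```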

Lastly, the PQRST complex, shown in **Figure 2**, makes up one cycle of the ECG. The P wave is produced by the sinoatrial node, the heart's pacemaker, and indicates atrial depolarization. The QRS wave is generated via the atrioventricular node: ventricular depolarization is represented by the QRS complex, while ventricular repolarization is indicated by the T wave.

#### **2.3 Deep learning**

#### *2.3.1 Artificial neural network*

In biology, neural networks make up the structure of animal brains, which is where the phrase "artificial neural network" comes from; such networks are widely used in deep learning algorithms. An artificial neural network (ANN) [23] generally consists of three kinds of layers: the input layer, hidden layers, and the output layer. The hidden layers sit between the input and output layers and execute all the calculations needed to find hidden features and patterns. A shallow neural network has only one hidden layer, whereas a deep neural network has multiple hidden layers; increasing the number of hidden layers makes the network deeper. Generally, each node in one layer is linked to every node in the next layer. This architecture is demonstrated in **Figure 3**.

**Figure 3.** *Architecture of a general ANN [24].*
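The layered computation described above can be sketched as a single forward pass; the layer sizes, random weights, and ReLU activation below are our own illustrative assumptions, not taken from [23]:

```python
import numpy as np

rng = np.random.default_rng(0)

n_in, n_hidden, n_out = 4, 8, 2  # illustrative layer sizes
W1, b1 = rng.normal(size=(n_hidden, n_in)), np.zeros(n_hidden)
W2, b2 = rng.normal(size=(n_out, n_hidden)), np.zeros(n_out)

def forward(x):
    """Input layer -> one hidden layer (ReLU) -> output layer."""
    h = np.maximum(0.0, W1 @ x + b1)  # hidden activations
    return W2 @ h + b2                # output layer (no activation)

y = forward(rng.normal(size=n_in))
print(y.shape == (n_out,))            # True: one value per output node
```

Stacking more `W, b` pairs between the input and output would make this a deep network in the sense described above.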

#### *2.3.2 Convolutional neural network*

Based on the concept of ANNs, the convolutional neural network (CNN) [25] was formulated: a deep learning method that can take an image as input and learn filters that extract essential features from those images. The brain is the source of inspiration for convolutional neural networks. A CNN performs a linear mathematical operation known as convolution in several hidden layers between the input and output layers. The general mathematical expression of the convolution operation is given in the following equation:

$$Y = W \* X + b \tag{14}$$

where $W$ and $X$ represent the filter and the input, respectively, $b$ represents the bias matrix, and $*$ denotes the convolution operation between the matrices $W$ and $X$.

CNNs have the benefit of being able to construct an internal representation of a two-dimensional image. This enables the model to learn position and scale in different data formats, which is essential when working with images.
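Equation (14) can be sketched for a single 2-D filter; the implementation below (our own illustration) uses cross-correlation, as most deep learning frameworks do, and the input image and filter values are arbitrary:

```python
import numpy as np

def conv2d(X, W, b):
    """Slide filter W over input X and add bias b: Y = W * X + b (valid mode)."""
    kh, kw = W.shape
    out_h, out_w = X.shape[0] - kh + 1, X.shape[1] - kw + 1
    Y = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            Y[i, j] = np.sum(X[i:i + kh, j:j + kw] * W) + b
    return Y

X = np.arange(16.0).reshape(4, 4)         # 4x4 input "image" 0..15
W = np.array([[1.0, 0.0], [0.0, -1.0]])   # 2x2 difference-like filter
Y = conv2d(X, W, b=0.0)

print(Y.shape == (3, 3))   # True: a valid convolution shrinks 4x4 to 3x3
print(np.allclose(Y, -5.0))  # each window gives X[i,j] - X[i+1,j+1] = -5
```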

#### *2.3.3 Recurrent neural network*

A recurrent neural network (RNN) [26] is a form of artificial neural network designed to operate on time series, analyzing temporal and sequential data. It is one of the algorithms responsible for the incredible advances in deep learning over the last few years. An RNN can handle inputs and outputs of varying lengths. RNNs use the idea of "memory" to store the states or information of earlier inputs in order to generate the sequence's next output; that is, they have the ability to store or memorize historical information.

Long short-term memory (LSTM) [27] is a type of recurrent neural network. LSTM networks are well suited to classifying, processing, and generating predictions based on time series data, since there may be delays of undetermined duration between critical occurrences in a time series. LSTMs were designed to address the exploding and vanishing gradient problems that can occur while training standard RNNs.

LSTM uses the concept of gates. It has three gates: the input gate, the forget gate, and the output gate. The input gate determines what new information will be stored in the cell state; the forget gate determines what information to discard from the cell state; and the output gate produces the LSTM block's final output. The gate outputs pass through sigmoid activation functions, which yield a value between 0 and 1 that is usually rounded to either 0 or 1 depending on a predetermined threshold: "0" indicates that the gate blocks everything, and "1" indicates that the gate lets everything pass through. The LSTM gates are described by the following equations:

$$\begin{aligned} i_t &= \sigma(w_i[h_{t-1}, x_t] + b_i) \\ f_t &= \sigma(w_f[h_{t-1}, x_t] + b_f) \\ o_t &= \sigma(w_o[h_{t-1}, x_t] + b_o) \end{aligned} \tag{15}$$

where $i_t$, $f_t$, and $o_t$ represent the input, forget, and output gates, respectively; $w_x$ and $b_x$ represent the weights and biases of gate $x$; $x_t$ is the input at the current timestamp; and $\sigma$ is the sigmoid function. Lastly, $h_{t-1}$ denotes the output of the LSTM block at timestamp $t-1$.

The cell state, candidate cell state, and final output equations are given as follows:

$$\begin{aligned} \bar{c}_t &= \tanh(w_c[h_{t-1}, x_t] + b_c) \\ c_t &= f_t * c_{t-1} + i_t * \bar{c}_t \\ h_t &= o_t * \tanh(c_t) \end{aligned} \tag{16}$$

where $c_t$ and $\bar{c}_t$ represent the cell state and the candidate cell state at timestamp $t$; the rest of the notation follows the previous equations.
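Equations (15) and (16) can be sketched as a single LSTM step in NumPy; the sizes, random weights, and zero biases below are our own illustrative assumptions, with names following the equations above:

```python
import numpy as np

rng = np.random.default_rng(1)
n_x, n_h = 3, 4      # illustrative input and hidden sizes
conc = n_h + n_x     # size of the concatenation [h_{t-1}, x_t]

sigma = lambda z: 1.0 / (1.0 + np.exp(-z))  # sigmoid
w_i, w_f, w_o, w_c = (rng.normal(scale=0.1, size=(n_h, conc)) for _ in range(4))
b_i = b_f = b_o = b_c = np.zeros(n_h)

def lstm_step(h_prev, c_prev, x_t):
    hx = np.concatenate([h_prev, x_t])  # [h_{t-1}, x_t]
    i_t = sigma(w_i @ hx + b_i)         # input gate
    f_t = sigma(w_f @ hx + b_f)         # forget gate
    o_t = sigma(w_o @ hx + b_o)         # output gate
    c_bar = np.tanh(w_c @ hx + b_c)     # candidate cell state
    c_t = f_t * c_prev + i_t * c_bar    # new cell state, eq. (16)
    h_t = o_t * np.tanh(c_t)            # block output
    return h_t, c_t

h, c = np.zeros(n_h), np.zeros(n_h)
for x in rng.normal(size=(5, n_x)):     # run five timestamps
    h, c = lstm_step(h, c, x)

print(h.shape == (4,) and c.shape == (4,))  # True
print(bool(np.all(np.abs(h) < 1.0)))        # True: o_t < 1 and |tanh| < 1 bound h_t
```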

The architecture of LSTM at any timestamp *t* is shown in **Figure 4**.

Bidirectional LSTMs (BI-LSTMs) [29] are a kind of LSTM that can be used to increase model performance on sequence classification problems. A bidirectional LSTM allows a neural network to store sequence information in both the backward (future to past) and forward (past to future) directions. BI-LSTMs are typically used for sequence-to-sequence tasks; text classification, speech recognition, and forecasting models can all benefit from this type of network. **Figure 5** shows the architecture of a BI-LSTM.

**Figure 4.** *Graphical representation of LSTM unit [28].*

**Figure 5.** *Graphical representation of bi-directional LSTM unit [30].*

### **3. Problem statement**

Before the invention of computer-aided diagnosis (CAD), diagnosis was performed manually, and manual diagnostic procedures were time-consuming and less accurate; they could also introduce errors into the calculation of computational and statistical features. To counteract these faults, deep learning has been introduced into diagnosis. CAD has heightened the diagnostic performance of non-expert radiologists, and regardless of radiologist expertise, its fundamental benefits are a minimal false-negative rate and enhanced sensitivity. CAD technologies are faster, more dependable, and more accurate, and they also help improve the calculation of computational and statistical features [31]. In this regard, this study focuses on some of these valuable technologies and works toward a practical solution.
