#### **5.2 Preprocessing**

During preprocessing, a sixth-order bandpass IIR filter with a lower 3-dB frequency of 20 Hz, an upper 3-dB frequency of 600 Hz, and a sample rate of 2000 Hz was used to remove low- and high-frequency noise. After filtering, the signal was segmented. Segmenting the PCG signals into heart cycles and marking the starting instant of each cycle are essential to generate the epochs of interest for machine learning.
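The filtering step above can be sketched in a few lines with SciPy. This is a minimal illustration, not the authors' code: a Butterworth design is assumed (the chapter does not name the IIR family), and the zero-phase `sosfiltfilt` application is a common choice rather than a stated one.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

fs = 2000  # Hz, the sampling rate stated in the text

# butter() doubles the order for bandpass designs, so N=3 yields the
# sixth-order bandpass IIR filter described above.
sos = butter(3, [20, 600], btype="bandpass", fs=fs, output="sos")

def preprocess(pcg: np.ndarray) -> np.ndarray:
    """Zero-phase bandpass filtering of a raw PCG recording."""
    return sosfiltfilt(sos, pcg)

# One second of a 100 Hz tone (in-band) corrupted by 5 Hz baseline drift
t = np.arange(fs) / fs
raw = np.sin(2 * np.pi * 100 * t) + 0.5 * np.sin(2 * np.pi * 5 * t)
filtered = preprocess(raw)
```

After filtering, the 5 Hz drift component is attenuated by roughly 70 dB while the in-band 100 Hz component passes essentially unchanged.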

An automatic routine was used to identify the S1 peaks of the PCG signal. One complete PCG cycle runs from one S1 peak to the next; each cycle was segmented with a time offset so that the beginning of S1 and the end of S2 are captured, as shown in **Figure 17**. In this study, the MIT-BIH benchmark dataset was used and randomly partitioned into two subsets: the first for training and validation (80% of the data) and the second for testing (20% of the data). The ML models were trained on a two-class problem (normal vs. abnormal). For evaluation, a confusion matrix and standard statistical evaluation parameters were calculated for each algorithm in every fold, and after five-fold cross-validation these parameters were used to evaluate the performance of the algorithms.
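The peak-to-peak segmentation with a time offset can be sketched as follows. Peak picking on the absolute amplitude is a simplification of the automatic S1 detection described in the text, and the 0.3 s minimum peak spacing and 0.05 s offset are illustrative values, not the chapter's.

```python
import numpy as np
from scipy.signal import find_peaks

def segment_cycles(pcg: np.ndarray, fs: int = 2000, offset_s: float = 0.05):
    """Cut a PCG signal into heart cycles, each running from one S1 peak
    to the next, padded by a small time offset on both sides."""
    envelope = np.abs(pcg)
    peaks, _ = find_peaks(envelope, distance=int(0.3 * fs),
                          height=0.5 * envelope.max())
    off = int(offset_s * fs)
    return [pcg[max(s - off, 0):e + off]
            for s, e in zip(peaks[:-1], peaks[1:])]

# Synthetic example: four S1-like bursts one second apart -> three cycles
fs = 2000
t = np.arange(4 * fs) / fs
pcg = np.zeros_like(t)
for centre in (0.5, 1.5, 2.5, 3.5):
    pcg += np.exp(-0.5 * ((t - centre) / 0.01) ** 2)
cycles = segment_cycles(pcg, fs)
```

Each returned cycle covers one full S1-to-S1 interval plus the offset on either end, so consecutive segments overlap slightly, matching the overlaid segments shown in Figure 17.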

For feature extraction, 27 features encompassing t-domain, f-domain, and mel-frequency cepstral coefficient (MFCC) features were extracted for each heart sound cycle. The t-domain features were the mean, median, standard deviation, twenty-fifth percentile, seventy-fifth percentile, interquartile range, mean absolute deviation, skewness, kurtosis, and Shannon's entropy of the signal, while the f-domain features were the spectral entropy, the signal magnitude at the maximum frequency, the maximum frequency in the power spectrum, and the ratio of signal energy between the maximum frequency range and the overall signal. The remaining features were mel-frequency cepstral features.
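The ten t-domain statistics listed above are straightforward to compute per cycle; a minimal NumPy/SciPy sketch follows. The 50-bin amplitude histogram used for Shannon's entropy is an assumption, since the chapter does not specify the discretization.

```python
import numpy as np
from scipy.stats import skew, kurtosis

def time_domain_features(cycle: np.ndarray) -> dict:
    """The ten t-domain features of one heart sound cycle.
    Shannon's entropy uses a 50-bin amplitude histogram (an assumed
    discretization; the text does not specify one)."""
    p25, p75 = np.percentile(cycle, [25, 75])
    hist, _ = np.histogram(cycle, bins=50)
    p = hist / hist.sum()
    p = p[p > 0]  # drop empty bins before taking the log
    return {
        "mean": np.mean(cycle),
        "median": np.median(cycle),
        "std": np.std(cycle),
        "p25": p25,
        "p75": p75,
        "iqr": p75 - p25,
        "mad": np.mean(np.abs(cycle - np.mean(cycle))),
        "skewness": skew(cycle),
        "kurtosis": kurtosis(cycle),
        "shannon_entropy": -np.sum(p * np.log2(p)),
    }
```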


*Machine Learning in Wearable Biomedical Systems DOI: http://dx.doi.org/10.5772/intechopen.93228*



*Sports Science and Human Health - Different Approaches*

**Figure 15.**
*System overall with modified stethoscope chest.*

**Figure 16.**
*Blocks of the machine learning-based abnormality detection algorithm.*


The analyses were carried out using the Statistics and Machine Learning Toolbox in MATLAB and the NumPy (v1.13.3), Matplotlib (v3.0.2), PyBrain (v0.31), and Scikit-learn (v0.20) libraries in Python.






**Figure 17.**
*Normal and abnormal HS: (a and d) detection of peaks; (b and e) overlaid segments; (c and f) average of the segments.*


#### **5.3 Feature reduction**

Neighborhood component analysis (NCA) is a non-parametric, embedded feature selection method that seeks the feature subset giving the maximum prediction accuracy of the classification algorithms. NCA was applied using the built-in functions of the Statistics and Machine Learning Toolbox™ in MATLAB. It identified 15 of the 27 features as most important: kurtosis, the maximum frequency value, and all the MFCC features. In addition, the hyperparameters of the trained models were tuned, and the best performing algorithms were optimized before calculating the performance measures. Statistical measures were then calculated for the testing dataset (20% of the whole database).
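MATLAB's `fscnca` returns one weight per feature, which makes the top-15 selection direct. scikit-learn's `NeighborhoodComponentsAnalysis` learns a linear transform instead, so the sketch below ranks features by the L2 norm of their columns in that transform; this ranking heuristic, and the random placeholder data, are stand-ins rather than the chapter's procedure.

```python
import numpy as np
from sklearn.neighbors import NeighborhoodComponentsAnalysis

# Hypothetical feature matrix: 200 heart cycles x 27 features, binary labels
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 27))
y = (X[:, 3] + X[:, 10] > 0).astype(int)  # two informative features

# Rank features by the norm of their columns in the learned NCA transform
nca = NeighborhoodComponentsAnalysis(n_components=10, random_state=0)
nca.fit(X, y)
weights = np.linalg.norm(nca.components_, axis=0)  # one score per feature
top15 = np.sort(np.argsort(weights)[::-1][:15])    # keep the 15 best
X_reduced = X[:, top15]
```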

#### **5.4 Results**

Twenty-two different algorithms [three decision trees, two discriminant analyses, six support vector machines (SVMs), six k-nearest neighbors (KNNs), and five ensemble classifiers] were trained on the training dataset with all 27 features. The validation accuracies and their corresponding performance measures are listed in **Table 6**.
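The train-and-compare loop can be sketched with scikit-learn's cross-validation utilities. The models below are rough stand-ins for a few of the 22 MATLAB presets; the exact configurations behind names like "Fine Tree" are approximations, and the random data is a placeholder for the 27-feature training set.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.tree import DecisionTreeClassifier
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import BaggingClassifier

# Approximate scikit-learn counterparts of a few MATLAB presets
models = {
    "Fine Tree": DecisionTreeClassifier(max_leaf_nodes=100),
    "Linear Discriminant": LinearDiscriminantAnalysis(),
    "SVM (RBF)": SVC(kernel="rbf"),
    "KNN (k=5)": KNeighborsClassifier(n_neighbors=5),
    "Bagged Trees": BaggingClassifier(n_estimators=30),
}

# Placeholder data standing in for the 27-feature training set
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 27))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

# Five-fold cross-validation, as in the text
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
accuracy = {name: cross_val_score(m, X, y, cv=cv).mean()
            for name, m in models.items()}
```

The resulting `accuracy` dictionary plays the role of the validation-accuracy column of Table 6; per-fold confusion matrices and the other statistical measures can be collected the same way with `cross_val_predict`.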

It is evident from **Table 6** that the best validation accuracy was achieved by the "Fine Tree" classifier. Moreover, the accuracy of classifying the normal class is higher than that of the abnormal class because of the imbalanced dataset. Therefore, to reduce potential over-fitting, the number of features used in the training process was reduced. The models were re-trained with the reduced feature set (15 features), and all evaluation parameters were recalculated. **Table 7** summarizes the evaluation measures for identifying the best algorithm with feature selection.

Comparing **Table 7** with **Table 6**, the overall accuracy was reduced, and the accuracies of classifying both normal and abnormal were reduced as well, even though the same algorithms performed best after feature reduction. It can therefore be said that the feature set used for classification is already optimized and cannot be reduced further.

To improve the best performing algorithms further, their hyperparameters were optimized, and it was observed that the performance of the ensemble algorithm can be improved. Two important parameters were optimized for the ensemble algorithm: "Distance" and "Number of neighbors."
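"Distance" and "Number of neighbors" map onto scikit-learn's `metric` and `n_neighbors` parameters. A hedged sketch of this tuning step is shown below using a plain KNN and `GridSearchCV`; the chapter attributes these parameters to the ensemble algorithm (presumably a KNN-based ensemble such as subspace KNN), and both the candidate values and the placeholder data here are illustrative.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

# Placeholder training data (300 cycles x 15 retained features)
rng = np.random.default_rng(2)
X = rng.normal(size=(300, 15))
y = (X[:, 0] > 0).astype(int)

# "Distance" -> metric, "Number of neighbors" -> n_neighbors;
# candidate values are illustrative, not taken from the chapter.
param_grid = {
    "metric": ["euclidean", "manhattan", "cosine"],
    "n_neighbors": [1, 3, 5, 7, 9],
}
search = GridSearchCV(KNeighborsClassifier(), param_grid, cv=5)
search.fit(X, y)
best = search.best_params_  # e.g. {"metric": ..., "n_neighbors": ...}
```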


#### **Table 6.**

*Performance measures of three best performing algorithms for full-feature set.*



**6. Discussion**

This chapter summarizes the findings of four different applications of wearable devices that tackle four critical clinical problems. The smart wearable devices reported here can help patients in different settings to manage their diseases, which will reduce the need for frequent hospital visits and can elevate their living standard. The results of each case study can be summarized as follows.

The FSR-based smart insole can acquire high-quality vGRF for different gait cycles. The flexible piezoelectric sensors performed poorly in calibration because of their sensitivity to 3D forces, which requires special force calibration machines to control the applied force in the x, y, or z direction; therefore, piezoelectric sensors cannot be used as a substitute for FSRs in the smart insole application.

For blood pressure estimation from the PPG signal, the features selected using ReliefF combined with Gaussian process regression (GPR) performed best in estimating systolic blood pressure (SBP), while correlation-based feature selection (CFS) with GPR performed best for diastolic blood pressure (DBP). This optimized approach can estimate SBP and DBP with RMSEs of 6.74 and 3.57, respectively.

The extended modified B-distribution (EMBD) showed the best performance in classifying ST elevation and T-wave inversion in the heart attack detection case study using ECG signals. The variance of the results showed that the variation across iterations was lowest for the EMBD (**Table 3**); thus, the EMBD was the most robust for heart attack detection with noisy ECG data.

Heart sound signals can be accurately classified using the "Fine Tree" classifier, which outperformed the other 22 evaluated algorithms [three decision trees, two discriminant analyses, six support vector machines (SVMs), six k-nearest neighbors (KNNs), and five ensemble classifiers]. Feature reduction did not improve the performance, while the best performing algorithms can be improved further by optimizing their hyperparameters.

**7. Conclusion**

This chapter discusses in detail, with case studies, the different opportunities available in the design and development of wearable medical devices that can help in real-time healthcare monitoring. It has covered not only the characterization of sensors for a specific wearable device but also how a reliable dataset can be used to develop a trained model for medical diagnosis. It has also shown how to design and develop a complete wearable device with real-time monitoring and alarming in case of emergency. These smart wearable solutions, if properly designed and deployed, can help millions of users take advantage of wearable technologies and thereby monitor their health status in different settings.

**Acknowledgements**

This work was made possible by NPRP12S-0227-190164 from the Qatar National Research Fund, a member of Qatar Foundation, Doha, Qatar. The statements made herein are solely the responsibility of the authors.

#### **Table 7.**

*Performance matrix for the three best performing algorithms on reduced feature set.*
