**3. Research methodology**

In this section, we describe the tools and methods used in this work: the sensors, the datasets, the algorithms, and the floor plan of the apartment that serves as the senior's smart house.

#### **3.1 Implementation procedure, software, and hardware**

Since the research concerns seniors, each dataset was reduced to the records that apply to seniors. The age of the participants was the main criterion for extracting records. Some activities were also removed because we considered them unnecessary for seniors. Of the 56,786 records in the SisFall dataset, only 14,000 belong to seniors. Of the 62,259 records in the MobiAct dataset, only 16,598 belong to seniors. Of the 7352 records in the Ucihar dataset, only 2018 belong to seniors. After extracting these records, seven classifier algorithms were trained on all three datasets, and the resulting models were then validated.

#### *Autonomous Update of a Dataset for Anomaly Detection Services in Elderly Care Smart House DOI: http://dx.doi.org/10.5772/intechopen.103953*
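The age-based extraction described above can be illustrated with a minimal sketch. It assumes each record carries the participant's age and an activity label; the `Record` fields, the sample values, and the excluded activity name are hypothetical, and the 60-year cut-off is taken from the age preference stated in Section 3.3. It is not the exact procedure applied to the three datasets.

```python
from dataclasses import dataclass

# Assumed cut-off: the text prefers participants aged about 60 and above.
SENIOR_MIN_AGE = 60


@dataclass
class Record:
    """Hypothetical record layout; sensor readings are omitted for brevity."""
    subject_id: str
    age: int
    activity: str


def select_senior_records(records, min_age=SENIOR_MIN_AGE,
                          excluded_activities=frozenset()):
    """Keep records of participants aged min_age or older,
    dropping activities considered unnecessary for seniors."""
    return [
        r for r in records
        if r.age >= min_age and r.activity not in excluded_activities
    ]


# Illustrative data only; these are not values from any of the datasets.
records = [
    Record("S01", 25, "walking"),
    Record("S02", 67, "walking"),
    Record("S03", 71, "jumping"),
    Record("S04", 62, "fall_forward"),
]

seniors = select_senior_records(records, excluded_activities={"jumping"})
print([r.subject_id for r in seniors])
```

Keeping the age threshold and the excluded activities as parameters makes it easy to repeat the extraction with a different definition of "senior" for each dataset.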

The classification covered two separate tasks. The first task was fall detection, for which the MobiAct and SisFall datasets were used. A subsection of each dataset was used in the experiment. The performance of the algorithms was compared on these subsets of the two datasets, and the best algorithm was identified. The second task was to identify whether the activity of daily living (ADL) being performed was Laying or not. Only the Ucihar dataset was used in this task. The eight classifiers were again used to detect whether Laying was the activity performed. If Laying was identified as the current ADL, this task was then linked to other external tasks that report the location where Laying was occurring.
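The second task reduces a multi-class activity label to a binary "Laying or not" target, and the classifiers are then compared by how often they predict it correctly. The sketch below shows that relabeling and a plain accuracy computation. The function names, the activity strings, and the example predictions are assumptions made for illustration; they are not outputs of the classifiers used in the experiment.

```python
def to_laying_labels(activities, target="LAYING"):
    """Map each activity name to 1 if it is the target activity, else 0."""
    return [1 if activity == target else 0 for activity in activities]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on empty input")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Illustrative labels only; the predictions are hypothetical.
activities = ["WALKING", "LAYING", "SITTING", "LAYING", "STANDING"]
y_true = to_laying_labels(activities)
y_pred = [0, 1, 0, 0, 0]  # one Laying sample is missed

acc = accuracy(y_true, y_pred)
print(y_true, acc)
```

Because Laying is usually a minority class, accuracy alone can look high even when many Laying samples are missed, so a per-class measure may be worth reporting alongside it.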

The experiment was conducted in Python 3. The computer system had the following specification: Intel(R) Core(TM) i7-4510U processor at 2.00 GHz with 2 cores, 4 GB of DDR3 1600 MHz RAM, and 64-bit Windows.

#### **3.2 Sensors**

The sensors we use to collect data are a gyroscope and an accelerometer, each measuring along the X, Y, and Z axes. These are now readily available in mobile phones and smartwatches, which makes such devices usable for human activity recognition. Sensing with cameras and environment sensors has the advantage that it does not require excessive preparation and arrangement on the part of senior citizens. However, the sense of being under surveillance may be discomforting for most seniors. This discomfort led us to use an accelerometer and a gyroscope rather than cameras. Therefore, in this work the gyroscope and accelerometer embedded in smart devices are utilized. The sensor data can be analyzed and processed in separate locations. The data we collected from the sensors is not labeled. Therefore, to enable labeling, we mapped our data to data labeled by previous researchers. The labeled data is obtained from publicly available datasets.

#### **3.3 Datasets**

Several datasets exist for human activity recognition (HAR) and fall classification. These can be used to classify test data as either a fall or an ADL. In our research, we use three publicly available datasets: SisFall, MobiAct, and Ucihar.

The SisFall dataset was generated in [11, 21]; most of its subjects are between 20 and 47 years old. The primary aim of that research was to create a prototype dataset for fall detection. The second dataset is MobiAct [12, 22]. In the MobiAct dataset, the subjects are aged from 40 to 47 and also perform both falls and ADLs. The SisFall and MobiAct datasets are used primarily for the exploration of fall detection. The third dataset is Ucihar [13, 23]. This dataset explores the identification of ADLs; it does not cover fall detection. We use it for ADL identification where appropriate, since we require a labeled dataset that defines ADLs. Our research focuses on senior citizens, so we prefer data from people aged about 60 years and above. However, such data is not readily available, so we infer it from the various datasets. **Table 2** shows the composition of the datasets used in the experiment.


#### **Table 2.**

*Composition of the selected datasets used in the experiment.*

The experimental devices must be compatible with the mobile devices that sensed the data. For fall detection, we classify falls as dangerous activities. Therefore, a warning message needs to be sent when a fall occurs, unlike when an ADL occurs. The data has been labeled in the above public datasets (SisFall, MobiAct, and Ucihar). We use the accelerometer, which records the acceleration of the device along each axis, and from these readings we can tell the presence or absence of a fall. The gyroscope is used for rotational movements, which are another parameter in detecting a fall by the senior citizen. Accuracy is used as the measure of how well the system performs. We use selected sections of the three sample datasets that are most appropriate for our requirements.
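To make the roles of the two sensors concrete, the following sketch combines the accelerometer and gyroscope readings into magnitudes and raises a warning when both are large. This is a simple threshold baseline shown only for illustration; it is not the trained classifiers used in this work. The threshold values, the units (g and degrees per second), and the function names are all hypothetical assumptions.

```python
import math

# Hypothetical thresholds; real values would have to be tuned on labeled data.
ACC_THRESHOLD_G = 2.5        # acceleration magnitude, in g
GYRO_THRESHOLD_DPS = 200.0   # angular velocity magnitude, in degrees/second


def magnitude(x, y, z):
    """Euclidean norm of a three-axis reading."""
    return math.sqrt(x * x + y * y + z * z)


def looks_like_fall(acc_xyz, gyro_xyz,
                    acc_threshold=ACC_THRESHOLD_G,
                    gyro_threshold=GYRO_THRESHOLD_DPS):
    """Flag a sample when both the acceleration and the rotation are large."""
    return (magnitude(*acc_xyz) >= acc_threshold
            and magnitude(*gyro_xyz) >= gyro_threshold)


def warning_for(sample_is_fall):
    """Return a warning message for a fall, or None for an ordinary ADL."""
    return "WARNING: possible fall detected" if sample_is_fall else None


# Illustrative samples only: (ax, ay, az) in g and (gx, gy, gz) in deg/s.
standing = looks_like_fall((0.0, 0.0, 1.0), (1.0, 2.0, 0.5))
impact = looks_like_fall((2.0, 1.5, 2.0), (150.0, 180.0, 90.0))

print(standing, warning_for(standing))
print(impact, warning_for(impact))
```

Requiring both magnitudes to exceed their thresholds reflects the idea that rotation is a second parameter alongside acceleration; a learned classifier can replace `looks_like_fall` without changing how the warning is issued.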

After preprocessing the raw data, training and test datasets with a smaller number of records are derived. Both MobiAct and SisFall have nine data fields in total, as shown in **Figures 2** and **3**, respectively.

Ucihar is used for the activity classifier. Unlike the above datasets, the original Ucihar dataset has a total of 548 columns, whereas **Figure 4** shows only the 14 columns with the highest importance for classification.
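Reducing the columns to the most important ones can be sketched as a ranking step. The example assumes that an importance score per column is already available, for instance from a tree-based classifier; how the scores were obtained is not specified here, and the feature names and scores below are hypothetical.

```python
def top_k_features(importances, k=14):
    """Return the names of the k features with the highest importance.

    Ties are broken alphabetically so that the result is deterministic.
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    ranked = sorted(importances.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:k]]


# Illustrative scores only; these are not values from the Ucihar dataset.
importances = {
    "feat_a": 0.05,
    "feat_b": 0.30,
    "feat_c": 0.10,
    "feat_d": 0.30,
}

selected = top_k_features(importances, k=3)
print(selected)
```

With the default `k=14`, the same function would keep the 14 highest-ranked columns out of the full set.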

The sample datasets contain various scenarios of data records. However, in this research, only the scenarios best suited to our purpose were used. The

