#### **6.2. Experimental setup**

Temporal Clustering for Behavior Variation and Anomaly Detection from Data Acquired…

http://dx.doi.org/10.5772/intechopen.75203

The main goal of our experiments was to show that HMM models can be used efficiently for behavioral pattern recognition, behavior change detection, and anomaly detection. In order to achieve this goal we faced several challenges: identifying an adequate model evaluation (selection) measure, identifying the optimal number of behavioral states for each care recipient and each activity, and finally characterizing the identified behaviors (clusters or behavioral patterns). Since HMM models cannot implicitly learn the optimal number of hidden states, we built HMM models with a varying number of clusters (in the range 2–10) for each care recipient and each activity. Additionally, since there is no consensus on evaluating cluster models in an unsupervised setting, each model was evaluated with the log likelihood, BIC, and AIC measures. With this setting we conducted 810 experiments in total (3 care recipients × 10 activities × 9 variations of state numbers × 3 evaluation measures). Each experiment lasted 15–24 s (including learning and evaluation). Since HMM is one of the most scalable algorithms in the probabilistic graphical models family (it is frequently used for signal processing and speech recognition), it allows adaptation to much larger series as City4Age streaming data arrives.

After building the models, we applied them to the activity measure time series of each citizen and each activity. In this way we labeled each time point with a cluster (behavioral pattern or state) assignment. When scoring the HMM models, the probabilities that a time point originates from each cluster distribution are identified, and the largest probabilities are stored for anomaly detection purposes. The experimental setup is implemented in Python: the hmmlearn library is used for building the HMM models, while Pandas DataFrames are used for data manipulation. All experiments were conducted on a testing cloud comprising three servers, each with a quad-core Intel Xeon class CPU, with 8 GB of RAM combined for data storage processes and up to 252 GB of RAM combined at disposal for data analytics and applicative processes.
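The labeling and scoring step just described can be sketched as follows; a minimal illustration that assumes the per-time-point state posteriors are already available (e.g. from hmmlearn's `predict_proba`), with the anomaly threshold chosen purely for illustration:

```python
import numpy as np

def label_and_score(posteriors, threshold=0.6):
    """Assign each time point to its most probable behavioral state
    and flag points whose best cluster probability is low.

    posteriors : array of shape (n_timepoints, n_states) whose rows
                 sum to 1 (state membership probabilities).
    """
    labels = posteriors.argmax(axis=1)     # behavioral pattern per time point
    best_probs = posteriors.max(axis=1)    # stored for anomaly detection
    anomalies = np.where(best_probs < threshold)[0]
    return labels, best_probs, anomalies

# Toy posteriors for 4 time points and 3 behavioral states.
post = np.array([[0.9, 0.05, 0.05],
                 [0.4, 0.35, 0.25],   # no dominant state -> anomaly candidate
                 [0.1, 0.8, 0.1],
                 [0.2, 0.1, 0.7]])
labels, probs, anomalies = label_and_score(post)
```

A time point whose best-state probability is small fits no learned behavioral pattern well, which is exactly the signal kept for the anomaly detection stage.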

#### **6.3. Results and discussion**

In this section we analyze and discuss the experimental results from the aspects of identification of an adequate model selector, behavioral pattern recognition, behavioral change (transition) recognition, and anomaly detection.

#### *6.3.1. Identification of adequate model selector*

Since there is no consensus about the best HMM model selection and evaluation metric in an unsupervised setting, our first objective was to identify a well-suited metric for the data at hand. A good metric should enable automated identification of parsimonious solutions: ones with high performance but as low complexity as possible. For that purpose, we inspected the general behavior of AIC and BIC over all experiments (care recipients and activity measures) and correlated these values with the log likelihood performances. Log likelihood measures how probable the model is given the series data. It is intuitively clear that models with the maximum possible log likelihood are desired; however, in general, likelihood monotonically increases with model complexity. This means that a larger number of clusters will almost always be preferred by the log likelihood criterion. The aim of the first analysis was to inspect how the AIC and BIC measures capture the degree of change (slope) in the likelihood values of the model. Distributions of the average values of log likelihood, AIC, and BIC over different model complexities (numbers of states) are shown in **Figure 6**.
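For concreteness, both criteria can be computed from a model's log likelihood and its number of free parameters. The sketch below assumes a Gaussian HMM with k states over a one-dimensional activity series (k − 1 free start probabilities, k(k − 1) free transition entries, and a mean and variance per state); the likelihood values are invented for illustration:

```python
import math

def hmm_free_params(k):
    # start probs + free transition entries + Gaussian mean/variance per state
    return (k - 1) + k * (k - 1) + 2 * k

def aic(log_likelihood, k):
    return 2 * hmm_free_params(k) - 2 * log_likelihood

def bic(log_likelihood, k, n_points):
    return hmm_free_params(k) * math.log(n_points) - 2 * log_likelihood

# Likelihood typically keeps growing with k, but the complexity penalty
# can still make the more complex model the worse choice:
print(aic(-1200.0, 3), aic(-1195.0, 4))  # -> 2428.0 2436.0: k=3 wins
```

Because BIC's penalty scales with log of the series length while AIC's does not, BIC punishes extra states harder, which matches the behavior observed below.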

The X-axis shows the number of clusters and the Y-axis the average AIC, BIC, and log likelihood values (over all experiments), respectively. It can be seen in the figure that the AIC values adequately track the steep growth of the log likelihood curve: the average AIC indicates better model performance while the log likelihood increases in large steps.

The optimal number of clusters (on average over all experiments) according to the AIC measure is 5, where an "elbow" in the AIC curve is detected. This point corresponds to the transition from higher growth of log likelihood (for 2–5 clusters) to lower growth (for 6–10 clusters). On the other side, the BIC model selector ignores the steep increase of log likelihood and identifies three as the optimal number of clusters.
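The elbow selection just described can be automated in several ways; one possible sketch (a maximum-curvature heuristic used purely for illustration, not necessarily the chapter's exact procedure) picks the k where the decreasing AIC curve bends most sharply:

```python
def elbow(ks, scores):
    """Return the k at the 'elbow' of a decreasing score curve,
    taken here as the point of maximum discrete curvature
    (largest second difference)."""
    best_k, best_curv = ks[1], float("-inf")
    for i in range(1, len(ks) - 1):
        curv = (scores[i - 1] - scores[i]) - (scores[i] - scores[i + 1])
        if curv > best_curv:
            best_k, best_curv = ks[i], curv
    return best_k

# Synthetic AIC curve: steep improvement up to k=5, nearly flat afterwards.
ks = list(range(2, 11))
aics = [980, 900, 830, 770, 760, 755, 752, 750, 749]
```

On this synthetic curve the heuristic returns k = 5, i.e. the point where the large per-state improvements stop.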


| Geriatric sub-factor | Activity | Measure unit |
|---|---|---|
| **Walking** | WALK\_STEPS | # of steps |
| | WALK\_DISTANCE | meters |
| **Quality of sleep** | SLEEP\_LIGHT\_TIME | seconds |
| | SLEEP\_DEEP\_TIME | seconds |
| | SLEEP\_AWAKE\_TIME | seconds |
| | SLEEP\_WAKEUP\_NUM | # of wake-ups |
| | SLEEP\_TOSLEEP\_TIME | seconds |
| **Physical activity** | PHYSICALACTIVITY\_SOFT\_TIME | seconds |
| | PHYSICALACTIVITY\_MODERATE\_TIME | seconds |
| | PHYSICALACTIVITY\_INTENSE\_TIME | seconds |
| | PHYSICALACTIVITY\_CALORIES | # of calories |


**Table 1.** Observed activity measurements.

Recent Applications in Data Clustering


**Figure 6.** Distribution of average values of log likelihood, AIC, and BIC over different model complexities.

After this point, the BIC curve grows super-linearly, meaning that it does not prefer models with more than 2 or 3 clusters. Deeper inspection of the AIC, BIC, and log likelihood curves for each care recipient and each activity showed behavior consistent with that described in **Figure 6**. Thus we took AIC as the measure of choice for HMM model selection and for identification of the optimal number of behavioral states for each care recipient and each activity.

However, it is very important to emphasize that the insights presented above cannot be considered conclusive and cannot be generalized over all problems. This is because cluster performance depends on the data distributions, which differ for each dataset, but also on the context of the analyses.
