**4. Results**

The first step was to choose performance measures that would address two sets of findings from our earlier research. First, our survey respondents showed no consensus regarding the relative impact of over or under reporting seizures. Second, our interviews with clinicians indicated that most patients and caregivers report seizures themselves without the help of seizure detection devices [3, 5]. It was, therefore, important for us to choose performance metrics that would both quantify over and under reporting and support comparison between seizure reporting systems and patient self-reporting rates from the literature [9].

To address these requirements, we evaluated each system in terms of three statistics: precision, recall, and F-score. Recall or sensitivity is the fraction of all seizures that were detected. High recall values reflect a low chance of under reporting or missing a seizure. Missed seizure events are problematic as untreated seizures can have serious long-term health consequences.

Recall = true positives / (true positives + false negatives)  (2)

Precision is the fraction of detected events that are true seizures. High precision values reflect a low chance of over reporting seizures or triggering false alarms. Low false alarm rates are important to avoid changing already effective medication.

Precision = true positives / (true positives + false positives)  (3)

The F-score balances over and under reporting and is expressed as:

F = (2 × precision × recall) / (precision + recall)  (4)
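These three statistics can be computed directly from counts of true positives, false positives, and false negatives. The sketch below uses illustrative counts rather than values from any study in this review:

```python
def recall(tp, fn):
    # Eq. (2): fraction of all seizures that were detected
    return tp / (tp + fn)

def precision(tp, fp):
    # Eq. (3): fraction of detections that were true seizures
    return tp / (tp + fp)

def f_score(p, r):
    # Eq. (4): harmonic mean that balances over and under reporting
    return 2 * p * r / (p + r)

# Illustrative counts: 45 detected seizures, 5 missed, 10 false alarms
p = precision(45, 10)   # ≈ 0.818
r = recall(45, 5)       # 0.9
print(round(f_score(p, r), 3))  # → 0.857
```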

In practice, notable inconsistencies between studies required making several assumptions. Many systems did not report precision and recall directly; in some cases, these rates had to be calculated from information in the papers. Next, several studies presented statistics in terms of only those patients with seizures (PWS) [38–40] while other studies reported statistics for all patients in a study [41–43]. Including all patients meant that some patients without seizures might also contribute false positives. To address this discrepancy, we recomputed precision to include only those false positives from patients with seizures. For example, Poh et al. [41] reported performance for all patients, and precision subsequently increased 24.54% when calculated among only those patients with seizures.
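The effect of this recomputation can be illustrated with a small sketch. The per-patient counts below are hypothetical (not taken from [41]); the point is only that discarding false positives contributed by seizure-free patients raises precision:

```python
# Hypothetical per-patient detection counts: tp = true positives, fp = false positives
patients = [
    {"tp": 10, "fp": 3},   # patient with seizures
    {"tp": 4,  "fp": 1},   # patient with seizures
    {"tp": 0,  "fp": 6},   # patient without seizures: contributes only false alarms
]

def precision_all(patients):
    # Precision over every patient in the study
    tp = sum(p["tp"] for p in patients)
    fp = sum(p["fp"] for p in patients)
    return tp / (tp + fp)

def precision_pws(patients):
    # Precision over patients with seizures only (here, detected-seizure counts
    # stand in for seizure occurrence, a simplification for this sketch)
    pws = [p for p in patients if p["tp"] > 0]
    tp = sum(p["tp"] for p in pws)
    fp = sum(p["fp"] for p in pws)
    return tp / (tp + fp)

print(round(precision_all(patients), 3))  # → 0.583 (all patients)
print(round(precision_pws(patients), 3))  # → 0.778 (PWS only)
```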

Next, we calculated patient self-reporting performance based on previous studies [18]. In this case, we assumed perfect self-reporting precision. Blum et al. [7, 9] evaluated seizure awareness among 31 patients with partial and generalized type epilepsies and observed that patients never falsely reported seizures. Next, we calculated recall based on observations from a similar study from Hoppe et al. [9] in which 91 patients with focal type epilepsies failed to report 32.0% of seizures during the day and 85.8% of seizures while asleep at night. This resulted in a precision of 100% for both day and night time reporting, recall values of 68.0 and 14.5%, and F-scores of 0.81 and 0.25 for day and night time reporting, respectively.
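Plugging these reporting rates into Eq. (4) reproduces the stated F-scores:

```python
def f_score(precision, recall):
    # Eq. (4)
    return 2 * precision * recall / (precision + recall)

# Perfect precision assumed per Blum et al. [7, 9]; recall values from the text
day_f   = f_score(1.0, 0.680)   # 32.0% of daytime seizures unreported
night_f = f_score(1.0, 0.145)   # most nighttime seizures unreported
print(round(day_f, 2), round(night_f, 2))  # → 0.81 0.25
```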



76 Seizures



#### **4.1. Part 1: self-reporting types, priorities, and characteristics**

This section summarizes our key research findings. **Figure 2** presents the type, priority, and characteristics of important information that clinicians need patients to report along with notable perceived patient self-reporting challenges and agreement between participants.

#### *4.1.1. Self-reporting types*

The first step for our analysis was establishing the types of patient self-reported data that clinicians need from patients. The bottom row of **Figure 2** shows a sorted list of clinical information needs, ordered from highest to lowest priority.

**Figure 2.** "Top 20" types, priorities, and characteristics of neurocognitive self-reporting needs (top row) and specific self-reporting challenges (sorted from greatest to least importance) (bottom row).

#### *4.1.2. Self-reporting priorities*

Next, we investigated the priority of the patient self-reported data that clinicians need from patients. The "top 20" highest priority symptoms and triggers are shown in **Figure 2**.


Self-Reporting Technologies for Supporting Epilepsy Treatment

http://dx.doi.org/10.5772/intechopen.70283



#### *4.1.3. Self-reporting characteristics*

The online survey established a consensus regarding several important self-reporting characteristics. The top row of **Figure 2** shows clinician perceptions regarding the "top 20" highest ranked symptoms and triggers in terms of availability, reliability, difficulty, and desired frequency, while the bottom row shows the same characteristics categorized as "unavailable", "difficult for patients to collect", or "unreliable", respectively.

#### *4.1.4. Self-reporting challenges*

Next, we identified the pair of symptoms and triggers with the highest number of critical clinical responses. The most frequent clinician survey responses are shown in **Table 1**.

The first row in **Table 1** highlights patient reporting challenges associated with information access. This includes the symptom or trigger with the greatest number of "unavailable" and "difficult" responses. The second row, further accounts for problems associated with data collection performance. This includes the symptom or trigger with the greatest number of "unavailable", "difficult", and "unreliable" responses, respectively. The results highlight "suicide attempts" and "seizure onset time at night" as two important unmet clinical needs.

#### *4.1.5. Self-reporting themes*

Mental health and sleep-related symptoms and triggers each appeared among the "top 20" highest ranked symptoms and triggers. Icons above the bar graphs in **Figure 2** denote mental health-related symptoms and triggers such as "depression symptoms" with red circles and sleep-related symptoms and triggers such as "impaired sleep and daytime alertness" and "impaired sleep quality" with blue diamonds.

#### **4.2. Part 2: seizure reporting technology review capabilities**

This section summarizes our research findings and highlights how inaccurate patient and caregiver seizure reporting impacts clinical decision-making for prescribing and adjusting antiepileptic drugs (AEDs). Here our key findings were that limited technologies exist for supporting the process of characterizing patient seizure type, and while most seizure detection devices are more accurate than patients for nighttime reporting, these devices must be made more accurate to be beneficial for daytime use.

| Response categories | Symptom or trigger |
| --- | --- |
| Unavailable + difficult | Suicide attempts |
| Unavailable + difficult + unreliable | Seizure onset time at night |

**Table 1.** Most frequent clinician reported survey responses.

The results in **Figure 3** provide a comparison of seizure detection device and patient self-reporting capabilities on an F-score axis between 0 and 1. The results also account for discrepancies in study population size by computing performance for only those patients with seizures (PWS) as opposed to all patients that participated in each study. The following subsections describe inertial systems, video systems, and multimodal systems.

#### *4.2.1. Inertial systems*




**Figure 3.** Seizure reporting performance comparison: multiple types of non-EEG seizure detection systems are compared against patient self-reporting on a continuous F-score scale from 0.0 to 1.0, read left to right, where 0.0 is the worst and 1.0 is the best performance. Each seizure detection system is represented as a circle for a given class of technology. The circle texture indicates the time of day that the system was evaluated, and the diameter reflects the relative number of patients that had at least one seizure during each study. Self-reporting performance is shown using vertical lines: daytime performance as a vertical white line with a black border and nighttime performance as a solid black line.

Inertial systems utilize one or more wrist- and/or chest-worn motion sensors [36, 44] and detect seizure-like convulsions as intense, repetitive limb and torso movements, with F-scores ranging from 0.133 to 0.990. These systems offer the benefit of being able to measure motion under blankets for nighttime use [36] and typically measure limb motion using an accelerometer [42] and/or gyroscope [45]. The two highest performing research systems in our review were from Schulc et al. [45] and Dalton et al. [46]. Schulc et al. [45] instrumented patients with a single sensor on the forearm (98.00% precision, 100.00% recall) while Dalton et al. [46] instrumented patients with a pair of wrist-worn sensors (84.0% precision, 91% recall). The highest performing commercial product is Epi-care Free, a single wrist sensor with similar performance (81.95% precision, 89.74% recall) [43]. High false positive rates remain a challenge as rhythmic activities such as brushing teeth [42, 43] and exercise [41] are often responsible for triggering false alarms.
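The detection principle behind these inertial systems can be sketched as a moving-window energy threshold over accelerometer magnitude. The window length and threshold below are arbitrary illustrations, not parameters from any reviewed system:

```python
import math

def detect_convulsion(samples, window=50, threshold=2.5):
    """Flag windows whose mean acceleration magnitude (in g) exceeds a
    threshold: a crude stand-in for the repetitive-movement classifiers
    used by wrist-worn systems such as [42, 45]."""
    flags = []
    for start in range(0, len(samples) - window + 1, window):
        chunk = samples[start:start + window]
        mags = [math.sqrt(x * x + y * y + z * z) for (x, y, z) in chunk]
        flags.append(sum(mags) / len(mags) > threshold)
    return flags

# Simulated data: 50 quiet samples (~1 g gravity), then 50 high-intensity samples
quiet = [(0.0, 0.0, 1.0)] * 50
shaking = [(3.0, 0.0, 1.0)] * 50
print(detect_convulsion(quiet + shaking))  # → [False, True]
```

A real system would add duration and rhythmicity checks to reject brief, non-repetitive movements, which is exactly where brushing teeth and exercise cause false alarms.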


| Systems | F-score | Precision | Recall | PWS | Modality |
| --- | --- | --- | --- | --- | --- |
| Schulc [45] | 0.990 | 0.980 | 1.000 | 3 | Inertial |
| Cuppens [66] | 0.964 | 0.931 | 1.000 | 5 | Video |
| Lu [67] | 0.933 | 0.933 | 0.933 | 5 | Video |
| Cattani [76] | 0.921 | 0.932 | 0.910 | 1 | Video |
| Karayiannis [48] | 0.900 | 0.900 | 0.900 | 54 | Video |
| Dalton [46] | 0.874 | 0.840 | 0.910 | 5 | Inertial |
| Beniczky [43] | 0.854 | 0.814 | 0.897 | 20 | Inertial |
| Kramer [72] | 0.811 | 0.714 | 0.938 | 15 | Inertial |
| Cuppens [77] | 0.797 | 0.850 | 0.750 | 3 | Video |
| Nijsen [85] | 0.788 | 0.650 | 1.000 | 7 | Inertial |
| Van de Vel, Emfit [65] | 0.780 | 0.780 | 0.780 | 1 | Pressure |
| Cuppens [73] | 0.737 | 0.600 | 0.952 | 7 | Inertial |


#### *4.2.2. Video systems*

Marker and markerless video systems have been developed for detecting and classifying a range of seizure types [39], with F-scores between 0.201 and 0.964. These systems had lower overall performance than other alternatives such as inertial systems, but modern computer vision techniques are making these systems increasingly flexible and attractive for long-term use.

Markerless video systems can be trained to reliably detect patient seizure movement without the need to wear sensors on the body. For example, while prior systems were restricted to specific settings such as Neonatal Intensive Care Units [47, 48], more recent systems such as the one from Cuppens et al. [77] use image features that are more robust to lighting and viewpoint changes and thus applicable to different bedrooms.

Marker-based video systems, by contrast, require patients to wear active or passive markers for measuring patient motion but provide among the few examples of systems that also classify types of detected seizures [35, 38, 51]. Rémi et al. [35] used an infrared camera and retroreflective markers to track and classify different types of patient limb movements during seizures. The video recordings were analyzed to track the position of each marker over time. The relative movement of these markers between video frames was then used to discriminate between motor characteristics during different types of convulsive seizures.

#### *4.2.3. Multimodal systems*

Multimodal systems utilize inputs from multiple types of sensors thereby improving seizure detection performance with F-scores ranging from 0.083 to 0.560. Poh et al. [41] showed that electrodermal activity (EDA), in conjunction with an accelerometer, could detect seizures better than using accelerometry alone [41]. EDA measures autonomic arousal and could play a role in detecting seizures with subtle motor movement. In addition, future research may highlight differences between EDA responses on both wrists and legs for differentiating generalized and partial seizures, and for characterizing seizure laterality [52].
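A simple way to see why a second modality helps is conjunctive fusion: requiring both an accelerometer flag and an EDA flag before raising an alarm suppresses false positives from rhythmic everyday motion. This is an illustrative sketch, not the actual classifier used by Poh et al. [41]:

```python
def fuse(accel_flags, eda_flags):
    # Alarm only when both modalities agree within the same time window
    return [a and e for a, e in zip(accel_flags, eda_flags)]

# Window-by-window detector outputs (True = event flagged)
accel = [True,  True, False, True]   # e.g., tooth brushing trips window 0
eda   = [False, True, False, True]   # autonomic arousal absent in window 0
print(fuse(accel, eda))  # → [False, True, False, True]
```

The trade-off is that conjunctive fusion can lower recall when one modality misses a true seizure, which is why learned classifiers over the joint feature set are preferred in practice.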

The MP5 system [54, 55] consisted of an under-mattress microphone and accelerometer according to Ref. [56], although performance was comparatively poor (average F-score = 0.234). More recently, Pavlova et al. showed that respiration can complement video EEG during seizure diagnosis [57]. Heart rate variability [58], EDA, and respiration may enable systems to recognize life-threatening postictal depression following seizures [59, 60].

#### *4.2.4. Audio, ECG, EMG, pressure systems*


van Elmpt et al. [62] used ECG measurements for detecting the onset of heart rate changes associated with seizures and achieved performance competitive with inertial sensors (F-score = 0.391). Heart rate was observed to increase (tachycardia) at the onset of seizures and decrease following seizures (postictal bradycardia). Muscle-activated sensors have been used to detect seizures [63]; however, no further efforts have been made, perhaps due to adhesive EMG sensors being cumbersome to wear for long periods of time.

Mattress pressure pads have achieved mid-level performance for generalized tonic clonic (GTC) seizures [64, 65] with F-scores ranging from 0.580 to 0.780. These sensors offer the added benefits of not requiring patients to wear sensors and of increased privacy over having a camera installed in the bedroom. Most mattress systems, however, report false positive rates that are notably higher than inertial and video-based systems [66], due to pillows dampening pressure readings or the patient sitting up in bed [64].

#### *4.2.5. Seizure reporting comparison between devices and patient self-report*

**Table 2** presents a set of statistics for comparing each system to patient [9] seizure reporting performance. Each row contains an F-score along with precision, recall, the number of patients with seizures, and the modality or type of system, and rows are sorted by descending F-score for reference. Next, **Table 3** presents statistics for comparing performance between each type of system. Each row includes the mean, standard deviation, minimum, and maximum values together with two sets of p-values from a one-sided t-test. The p-values report the likelihood that each type of system would achieve a higher average F-score than that of patient self-reporting [9]. It should be noted that the t-test could not be computed for EMG and ECG as we only evaluated a single system from each of these categories.
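The comparison can be sketched as a one-sample, one-sided t-test of each modality's F-scores against a patient self-reporting baseline. The code below computes only the t-statistic (a library routine such as SciPy's `scipy.stats.ttest_1samp` with `alternative='greater'` would also supply the p-value); the inertial F-scores are taken from Table 2, and 0.25 is the nighttime self-reporting F-score from [9] used here as an illustrative baseline:

```python
import math

def t_statistic(scores, baseline):
    """One-sample t-statistic for H1: mean(scores) > baseline."""
    n = len(scores)
    mean = sum(scores) / n
    var = sum((s - mean) ** 2 for s in scores) / (n - 1)  # sample variance
    return (mean - baseline) / math.sqrt(var / n)

# Inertial F-scores from Table 2; 0.25 = nighttime self-reporting F-score [9]
inertial = [0.990, 0.874, 0.854, 0.811, 0.788, 0.737]
t = t_statistic(inertial, 0.25)
print(round(t, 2))  # a large positive t favors the inertial systems
```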




**Table 2.** System and patient self-reporting performance comparison.

**Table 3.** System F-score and p-value statistics by modality.

The resulting tables can then be used to more closely examine system performance with respect to under and over reporting. High-recall systems with low precision [41, 61] seldom miss seizures, addressing the concern of under reporting, yet tend to overcompensate and over report seizures due to false alarms. High-precision systems with low recall [54, 62] have the opposite problem and address the concern of over reporting seizures at the risk of missing seizures. High F-score systems [45, 66, 67] have high precision and recall values and therefore perform well without over or under reporting.

**5. Discussion**

The multiphase structure of our study was instrumental in translating our interviews, literature review, and expert panel findings into effective online survey questions. The key findings included the types, priorities, and characteristics of self-reported data that clinicians need from patients as shown in **Figure 2**.

The remainder of this section highlights notable patient self-reporting challenges as well as subsequent feedback after sharing these findings with clinicians.

#### **5.1. Part 1: self-reporting types, priorities, and characteristics**

#### *5.1.1. Self-reporting availability*

Many symptoms and triggers were reported as "useful" but "unavailable" as shown in orange in **Figure 2**. "Academic decline" was said to be unavailable (five out of six respondents). These findings highlight the need for patient data that may already be collected but unavailable to clinicians. Improved interoperability between electronic health records (EHRs) and electronic grading systems could alert clinicians to changes in patient grades during appointments.

#### *5.1.2. Self-reporting difficulty*

There were several symptoms and triggers that were reported as "difficult" for patients to report as shown in yellow in **Figure 2**. Notably, "seizure onset at night" and "excessive sleep movements" were said to be "difficult" to report among most epilepsy specialists (five out of six respondents). These findings highlight the inherent difficulty of patient data collection while sleeping or unconscious. Introducing automated wrist-worn devices such as the Empatica E4 [68] and ActiGraph Link [69] could stand to increase patient self-reporting performance by detecting events such as seizures and unusual sleep movements, while also reducing patient and caregiver data collection burden.

#### *5.1.3. Self-reporting reliability*

Next, many symptoms and triggers were reported as being useful but unreliable when self-reported as shown in gray in **Figure 2**. All epilepsy specialists (six out of six respondents) agreed that patient and caregiver reports of patient "memory impairment" were "unreliable".