**4.1. Intersubject classification**

**Figures 3** and **4** show the accuracy versus the number of removed features obtained with the two methods, random forest and SVM‐RFE, respectively. The accuracy was computed with a LOO cross‐validation strategy over a total of 52 feature vectors, which represent the ensemble averages of the positive and negative affective valence responses of all volunteers investigated (26 for each class). Each feature vector is composed of *M* = 756 elements (see section 3.2). With random forest, the system achieves a global accuracy of 79% once roughly 500 irrelevant features have been removed from the input feature set. With SVM‐RFE, the system reaches a peak accuracy of 100% after removing 680 features, leaving fewer than 100 features as relevant ones.
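The accuracy-versus-removed-features curves described above can be sketched as a backward elimination loop evaluated with LOO cross‐validation. The sketch below is only an illustration under stated assumptions: it uses a nearest-centroid classifier and a univariate separation score as stand-ins for random forest and SVM‐RFE, it runs on synthetic data (52 vectors, 26 per class, 40 features instead of 756), and it ranks features on the full data set for brevity, which makes the estimate optimistic. All function names are hypothetical. Note that with 52 vectors every LOO accuracy is a multiple of 1/52.

```python
import random
from statistics import mean, pstdev


def loo_accuracy(X, y, keep):
    """Leave-one-out accuracy of a nearest-centroid classifier on columns `keep`."""
    labels = sorted(set(y))
    hits = 0
    for i in range(len(X)):
        # Train on every vector except the i-th one.
        centroids = {}
        for label in labels:
            rows = [x for k, (x, c) in enumerate(zip(X, y)) if k != i and c == label]
            centroids[label] = [mean(r[j] for r in rows) for j in keep]
        held_out = [X[i][j] for j in keep]
        predicted = min(
            labels,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(held_out, centroids[c])),
        )
        hits += predicted == y[i]
    return hits / len(X)


def separation(X, y, j):
    """Univariate class separation of feature j (stand-in ranking criterion)."""
    a = [x[j] for x, c in zip(X, y) if c == 0]
    b = [x[j] for x, c in zip(X, y) if c == 1]
    return abs(mean(a) - mean(b)) / (pstdev(a) + pstdev(b) + 1e-12)


def elimination_curve(X, y, step):
    """Return (number of removed features, LOO accuracy) after each removal round."""
    n_features = len(X[0])
    keep = list(range(n_features))
    curve = []
    while True:
        curve.append((n_features - len(keep), loo_accuracy(X, y, keep)))
        if len(keep) <= step:
            return curve
        # Drop the `step` lowest-ranked features, as a wrapper method would.
        keep.sort(key=lambda j: separation(X, y, j), reverse=True)
        keep = keep[:-step]


# Synthetic data: only the first 4 of 40 features carry class information.
rng = random.Random(0)
y = [0] * 26 + [1] * 26
X = [[rng.gauss(3.0 * c if j < 4 else 0.0, 1.0) for j in range(40)] for c in y]

curve = elimination_curve(X, y, step=4)
for removed, acc in curve:
    print(f"{removed:2d} removed -> LOO accuracy {acc:.3f}")
```

In a stricter protocol the ranking step would be repeated inside each LOO fold, so that the held-out vector never influences which features are kept.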

**Figure 3.** Intersubject accuracy obtained by the implemented random forest. Features extracted from ensemble‐average signals are computed over at least 30 single trials.

**Figure 4.** Intersubject accuracy obtained by the implemented SVM‐RFE. Features extracted from ensemble‐average signals are computed over at least 30 single trials.

**Figure 5.** Intrasubject percentiles of accuracy values versus the number of features removed using random forest. The last point corresponds to 36 selected features.


34 Emotion and Attention Recognition Based on Biological Signals and Images


**Figure 6.** Intrasubject percentiles of accuracy values versus the number of features removed using SVM‐RFE. The last point corresponds to 36 selected features.

**Tables 1** and **2** describe the spatial and temporal locations of the relevant features when the classifiers are fed the data set formed by these 52 feature vectors. Concerning spatial location, both random forest and the SVM algorithm consistently place the relevant features in frontal regions of the brain, although the SVM also keeps a significant number from centroparietal regions. This corroborates other studies that have pointed out the particular contribution of frontal regions during affective processing [46, 47]. Concerning location in time, with random forest most of the features display *medium* and *long* latencies, while with the SVM the most relevant time interval corresponds to *medium* latencies. Hence, in contrast to random forest, the SVM selects a larger number of features from early poststimulus time intervals. These results also match previous brain studies reported in the literature, in which ensembles of averaged signals were used as well [20, 48]. Note that although the two methods hardly agree on the time intervals where relevant features show up, both highlight frontal areas as relevant spatial locations for affective valence processing.

| Scalp region | Beta | Alpha | Theta | Delta | Total |
|---|---|---|---|---|---|
| Frontal | 7 | 2 | 4 | 5 | 18 |
| Central‐temporal | 6 | 0 | 3 | 0 | 9 |
| Parietooccipital | 5 | 2 | 0 | 2 | 9 |

| Time interval | Beta | Alpha | Theta | Delta | Total |
|---|---|---|---|---|---|
| Short | 0 | 1 | 0 | 0 | 1 |
| Medium | 6 | 1 | 0 | 2 | 9 |
| Long I | 12 | 0 | 1 | 3 | 16 |
| Long II | 0 | 2 | 6 | 2 | 10 |

Upper table: Space location (EEG channels): frontal (Fp1, Fpz, Fp2, F7, F3, Fz, F4, F8), central‐temporal (T7, C3, Cz, C4, T8) and parietal‐occipital (P7, P3, Pz, P4, P8, O1, Oz, O2). Lower table: Time location (time intervals): short (*i =* {1,2}), medium (*i =* {3,4}), long I (*i =* {5,6}) and long II (*i =* {7,8,9}).

**Table 1.** Distribution of the 36 selected features within each band by random forest (intersubject classification).

## **4.2. Intrasubject classification**

**Figures 5** and **6** show the global accuracy, computed by averaging the individual accuracy values of all participants, when the classifiers were trained on only one subject's data. For intrasubject classification, features were extracted from single‐trial signals as described above (Section 3). The training set for each subject is made up of a total of 65–72 single trials covering both classes of emotions, and a LOO cross‐validation strategy is applied as well.

As in the intersubject analysis, SVM‐RFE yielded better accuracy rates than random forest when features are extracted from single trials. SVM‐RFE reaches mean values close to the maximal accuracy, and up to 100% for some subjects. Random forest, in contrast, keeps an 80% accuracy as the upper limit.
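The aggregation behind the intrasubject curves (a global accuracy obtained by averaging per-subject LOO accuracies, plus percentiles across subjects) can be sketched as follows. This is a minimal illustration: the per-subject counts are made-up placeholders, not results from the study, and `summarize` is a hypothetical helper. The quartile convention used here (`method="inclusive"`) is also an assumption, since the text does not state how the percentiles were computed.

```python
from statistics import mean, quantiles


def summarize(accuracies):
    """Mean and quartiles of per-subject accuracies."""
    p25, p50, p75 = quantiles(accuracies, n=4, method="inclusive")
    return {
        "mean": mean(accuracies),
        "p25": p25,
        "p50": p50,
        "p75": p75,
        "max": max(accuracies),
    }


# Hypothetical (correctly classified trials, total trials) per subject.
# Each subject contributes 65-72 single trials, so one trial changes that
# subject's LOO accuracy by roughly 1.4-1.5 percentage points.
per_subject = [(33, 66), (54, 72), (65, 65)]
accuracies = [correct / total for correct, total in per_subject]

summary = summarize(accuracies)
print(summary)
```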

Upper table: Space location (EEG channels): frontal (Fp1, Fpz, Fp2, F7, F3, Fz, F4, F8), central‐temporal (T7, C3, Cz, C4, T8) and parietal‐occipital (P7, P3, Pz, P4, P8, O1, Oz, O2). Lower table: Time location (time intervals): short (*i =* {1,2}), medium (*i =* {3,4}), long I (*i =* {5,6}) and long II (*i =* {7,8,9}).

**Table 2.** Distribution of the 36 selected features within each band by SVM‐RFE (intersubject classification).
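A distribution table of this kind can be produced by decoding each selected feature index into its channel, band and time interval and then tallying by scalp region and latency class. The sketch below assumes a feature layout that the text does not specify: the 21 channels, 4 bands and 9 time intervals listed in the table notes multiply to 756, which is consistent with *M* = 756, but the ordering used here (channel-major, then band, then interval) is purely hypothetical, as are the function names and the example indices.

```python
from collections import Counter

CHANNELS = [
    "Fp1", "Fpz", "Fp2", "F7", "F3", "Fz", "F4", "F8",   # frontal
    "T7", "C3", "Cz", "C4", "T8",                         # central-temporal
    "P7", "P3", "Pz", "P4", "P8", "O1", "Oz", "O2",       # parietal-occipital
]
BANDS = ["Beta", "Alpha", "Theta", "Delta"]
N_INTERVALS = 9


def region(channel):
    """Scalp region of a channel, following the grouping in the table notes."""
    position = CHANNELS.index(channel)
    if position < 8:
        return "Frontal"
    if position < 13:
        return "Central-temporal"
    return "Parietal-occipital"


def latency(interval):
    """Latency class of a time interval i in 1..9."""
    if interval <= 2:
        return "Short"
    if interval <= 4:
        return "Medium"
    if interval <= 6:
        return "Long I"
    return "Long II"


def decode(index):
    """Map a flat feature index to (channel, band, interval); ASSUMED ordering."""
    channel, rest = divmod(index, len(BANDS) * N_INTERVALS)
    band, interval = divmod(rest, N_INTERVALS)
    return CHANNELS[channel], BANDS[band], interval + 1


def tabulate(selected):
    """Count selected features per (region, band) and per (latency, band)."""
    by_region, by_latency = Counter(), Counter()
    for index in selected:
        channel, band, interval = decode(index)
        by_region[(region(channel), band)] += 1
        by_latency[(latency(interval), band)] += 1
    return by_region, by_latency


# Three arbitrary indices, for illustration only.
by_region, by_latency = tabulate([0, 38, 755])
print(by_region)
print(by_latency)
```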

A comparison of the outcomes of the individual training sessions, with respect to the 36 features that remain, reveals a large interindividual variability. All training sessions encompassed an equal number of iterations. For each feature, the number of subjects in which it was selected was then counted. **Figure 7** displays this comparison. It shows, for example, that 220 features never survived in any individual and thus may be considered completely irrelevant. Remarkably, only a few features appear consistently as relevant in at least six out of 26 subjects, confirming the high interindividual heterogeneity, independently of the method applied for selecting features. A similar conclusion was drawn in Ref. [15], in that case by using a feature selection block before performing classification. However, note that a comparable accuracy value is achieved whether decision making is based on a set of 52 feature vectors (ensemble averages over trials and subjects) or on training classifiers individually with 65–72 feature vectors for each subject.

**Figure 7.** Within the 36 features selected from each individual training, the histogram counts the number of times a feature was selected using both wrapper methods.
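The counting procedure described above can be sketched in a few lines: for each feature, count the subjects in which it survived, then build a histogram over those counts, including the features that were never selected. The data below are a tiny made-up example (6 features, 3 subjects), not the study's 756 features and 26 subjects, and the function names are hypothetical.

```python
from collections import Counter


def selection_counts(selected_per_subject):
    """Number of subjects in which each feature was selected."""
    counts = Counter()
    for selected in selected_per_subject:
        counts.update(set(selected))  # a feature counts once per subject
    return counts


def selection_histogram(counts, n_features):
    """How many features were selected in exactly k subjects (k = 0 included)."""
    histogram = Counter(counts.values())
    histogram[0] = n_features - len(counts)
    return dict(sorted(histogram.items()))


def consistent_features(counts, min_subjects):
    """Features selected in at least `min_subjects` subjects."""
    return sorted(f for f, k in counts.items() if k >= min_subjects)


# Toy example: feature indices kept for each of three subjects.
selected_per_subject = [{0, 1}, {0, 2}, {0, 1}]
counts = selection_counts(selected_per_subject)
histogram = selection_histogram(counts, n_features=6)
stable = consistent_features(counts, min_subjects=2)

print(histogram)
print(stable)
```

Dividing each count by the number of subjects gives a relevance value between 0 (never selected) and 1 (selected in every subject), which is one plausible way to normalize such counts for a color map.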

**Figure 8.** Spatial location of feature relevance in each frequency band obtained from counting the contribution from all subjects within intrasubject classifications (left column: random forest, right column: SVM). The relevance is represented by a color map, where blue represents the least relevant features (nonselected features) and red represents the most relevant ones (selected as relevant by all subjects).

Affective Valence Detection from EEG Signals Using Wrapper Methods http://dx.doi.org/10.5772/66667 39


**Figure 9.** Relevance of the features selected for different latencies and spatial locations (left column: random forest, right column: SVM) following from counting the contribution of all subjects within an intrasubject classification. Feature importance is visualized by a normalized color map, where blue represents the least relevant features (nonselected features) and red represents the most relevant ones (selected as relevant by all subjects).

**Figures 8** and **9** show the relevance of the features chosen by the two methods in topographical maps. As can be seen from **Figure 8**, on average, both algorithms place the most relevant features in the frontal region, in agreement with the intersubject results. Similarly, both identify relevant features mostly in the beta bands. According to **Figure 9**, both algorithms place important features with short latencies in the frontal areas of the brain. For *medium* and *long* latencies, both algorithms again identify important features in frontal areas, though their importance is more pronounced with random forest.

Although intersubject and intrasubject methodologies show a similar performance, they have different application scenarios. Intersubject classification is mostly suitable for offline applications, as well as for brain studies, where it can complement statistical methods. For instance, in Ref. [49], an SVM‐RFE scheme was applied exclusively to identify scalp spectral dynamics linked with affective valence processing and to compare them with standard statistical results (*t*‐test). In that work, a different technique for feature extraction was developed, whose goal was to create a volume of features by means of wavelet filtering. In this way, a high‐dimensional data set was represented along three dimensions: frequency (resolution: 1 Hz), time (resolution: 1 ms) and topographical location (21 EEG channels).

Due to the biological variability observed, intrasubject studies cannot easily be generalized across a cohort of subjects. Intrasubject approaches might therefore be interesting for personalized studies where subjects need to be followed up over a couple of sessions, such as in rehabilitation therapy, or for neurofeedback‐based applications. An example of an intrasubject study is reported in Ref. [50], where the neuroticism trait is analyzed using EEG to check the influence of individual differences on emotional processing and the susceptibility of each brain region. In that work an SVM was used as well, although from a different standpoint, since it was applied to subject identification tasks from single trials.
