clinical diagnosis of schizophrenia. These findings indicate that the detection of emotional salience in voices differentiates the negative and positive symptoms of neuropsychiatric disorders at the preattentive level.

Emotional prosody has also been examined in individuals with congenital amusia, a neurodevelopmental disorder characterized by tone deafness [32]. Lu et al. [32] presented emotional words spoken with either declarative or question intonation to amusic individuals and healthy controls. The N1 was reduced and the N2 was increased in the incongruent condition. The N1 modulation was intact in amusics, whereas the N2 change was reduced, suggesting impaired conflict processing in amusia. The authors argued that the impaired discrimination of speech intonation among amusic individuals may arise from an inability to access information extracted at early processing stages.
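
As a rough illustration of how such window-based ERP comparisons can be quantified, the sketch below computes mean amplitudes in typical N1 and N2 latency windows for congruent versus incongruent trials. The sampling rate, window boundaries, and simulated single-channel epochs are illustrative assumptions rather than values from Lu et al. [32].

```python
import numpy as np

# Minimal sketch: compare mean ERP amplitude in typical N1 and N2 windows
# between congruent and incongruent intonation trials. Epoch arrays, channel
# choice, and window boundaries are illustrative assumptions, not values from [32].
fs = 500                                   # sampling rate (Hz), assumed
times = np.arange(-0.2, 0.8, 1 / fs)       # epoch from -200 to 800 ms

rng = np.random.default_rng(0)
# Simulated single-channel epochs: (n_trials, n_samples), in microvolts
congruent = rng.normal(0.0, 1.0, (60, times.size))
incongruent = rng.normal(0.0, 1.0, (60, times.size))

def mean_amplitude(epochs, tmin, tmax):
    """Average voltage across trials within a latency window (seconds)."""
    mask = (times >= tmin) & (times <= tmax)
    return epochs[:, mask].mean()

for name, (tmin, tmax) in {"N1": (0.08, 0.14), "N2": (0.20, 0.35)}.items():
    diff = mean_amplitude(incongruent, tmin, tmax) - mean_amplitude(congruent, tmin, tmax)
    print(f"{name} window {tmin*1000:.0f}-{tmax*1000:.0f} ms: "
          f"incongruent - congruent = {diff:.2f} µV")
```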

**7. Applications and future directions**

One application of these studies is to build artificial intelligence systems that decode the brain signals contributing to socio-emotional understanding. Most studies use acted (posed) vocal expressions as test materials, produced by professional actors, public speakers, or amateurs to portray an intended emotion. In real-life communication, speakers may also adopt such emotional poses to achieve particular communicative goals, and some research questions, for example the cultural display of vocal expression, may be specifically suited to posed stimuli. Nevertheless, research using naturalistic, ecological, and observation-based stimuli is strongly recommended. A future direction is therefore to examine how the brain differentiates "real" from "fake" vocal expressions by examining the neurophysiological responses.
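
As a minimal sketch of this decoding idea, the example below trains a cross-validated classifier to separate two classes of trials (e.g., spontaneous versus posed expressions) from EEG-derived features. The feature matrix, labels, and model choice are assumptions made for illustration, not a published pipeline.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Minimal sketch of the decoding idea: classify spontaneous ("real") vs. posed
# ("fake") vocal expressions from listeners' EEG features. The feature matrix is
# simulated; in practice it could hold single-trial ERP amplitudes or band power.
rng = np.random.default_rng(1)
n_trials, n_features = 200, 32            # e.g., 32 electrode-by-window features
X = rng.normal(size=(n_trials, n_features))
y = rng.integers(0, 2, n_trials)          # 0 = posed, 1 = spontaneous (simulated labels)

clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
scores = cross_val_score(clf, X, y, cv=5)  # 5-fold cross-validated accuracy
print(f"Decoding accuracy: {scores.mean():.2f} ± {scores.std():.2f}")
```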

Another implication of using EEG signals to study vocal emotion decoding is to test the effectiveness of the speech-coding strategies used in hearing devices when listeners with hearing loss distinguish emotions through the prosodic features of speech [33, 34]. In Agrawal et al. [33], statements simulated with different speech-coding strategies modulated the P200 for happy expressions and elicited early (0–400 ms) and late (600–1200 ms) gamma-band power increases for vocal expressions of happiness, anger, and neutrality. In Agrawal et al. [34], the P200 was differentiated by simulation strategy for all emotion types and was larger for happiness than for other emotions across speech-coding strategies. These studies emphasize the importance of vocoded simulations for better understanding the prosodic cues that cochlear implant users may rely on to decode emotion in the voice. Future studies should also draw on the merits of multimodal recording and synchronization of neurophysiological and peripheral physiological responses during the decoding of vocal expressions, including eye movements, pupil dilation, and heart rate, to understand how different systems support the understanding of social and emotional information in speech and vocalizations.
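
To illustrate the kind of vocoded simulation these studies rely on, the sketch below implements a simple noise-excited channel vocoder: the signal is split into frequency bands, each band's amplitude envelope is extracted and used to modulate band-limited noise, and the bands are re-summed. The channel count, band edges, and test signal are illustrative assumptions, not the coding strategies evaluated in [33, 34].

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

# Minimal noise-vocoder sketch of the kind used to simulate cochlear-implant
# processing of emotional prosody. Channel count and band edges are illustrative
# assumptions, not the strategies tested in [33, 34].
def noise_vocode(signal, fs, band_edges):
    """Replace each analysis band with envelope-modulated noise and re-sum."""
    rng = np.random.default_rng(2)
    out = np.zeros_like(signal)
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        band = sosfiltfilt(sos, signal)
        envelope = np.abs(hilbert(band))              # amplitude envelope of the band
        carrier = sosfiltfilt(sos, rng.normal(size=signal.size))
        out += envelope * carrier                     # envelope-modulated noise band
    return out

fs = 16000
t = np.arange(0, 1.0, 1 / fs)
speech_like = np.sin(2 * np.pi * 220 * t) * (1 + 0.5 * np.sin(2 * np.pi * 3 * t))
edges = np.geomspace(100, 7000, 9)                    # 8 logarithmically spaced channels
vocoded = noise_vocode(speech_like, fs, edges)
print(vocoded.shape)
```

Reducing the number of channels coarsens the spectral detail that survives the simulation, which is one way such vocoded materials can probe which prosodic cues remain available for emotion recognition.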

**Acknowledgements**

Special thanks to Professor Dr. Marc D. Pell, who leads the Neuropragmatics and Emotion Lab in the School of Communication Sciences and Disorders, and to the Faculty of Medicine, McGill University, for the McLaughlin Scholarship and McGill MedStar award granted to the author.
