### **2.2 Mixed and augmented reality applications with binaural technology**

Audio Mixed Reality (AMR) applications aim to recreate new auditory spaces for listeners by balancing the proportion of real and virtual elements. Audio Augmented Reality (AAR) applications, in turn, aim to give listeners an experience of acoustic transparency, as if there were no headset, so that virtual sounds are interleaved with an unaltered reality [24]. Drawing upon Milgram and Kishino's [25] "virtuality continuum" of visual displays, McGill et al. [26] define AAR as "auditory headset experiences intended to [...] exploit spatial congruence with real-world elements." From this perspective, AAR sits at the edge of AMR, which encapsulates "any auditory VR and AR experiences." These definitions mirror the recording esthetics continuum from "attempting realism" to "creating virtual worlds" produced through different sound capture systems and mixing approaches [27]. While mixing for stereo recordings differs from mixing for AMR and AAR applications, we applied our knowledge of sound capture systems to best meet the cultural expectations and genre conventions of the performance contexts. Specifically, we primarily used microphone arrays that captured the acoustic environment for our five AAR case studies, versus close mono microphones that focused on the instruments' direct sound for our three AMR case studies.

<sup>4</sup> Cardassi first tested a bone-conduction headphone in Fall 2017 for the recording of *Ramos* (Redshift, 2019) with Pras as music producer and sound engineer. While she could only use it for the recording of a few pieces in Rolston Hall at the Banff Centre, she enjoyed preparing for the sessions with it at home. This was confirmed through personal email communication on March 9, 2021.

To enhance listeners' perception of auditory spaciousness through headphone monitoring, König [28] conceptualized one of the first four-channel headphones, which positioned an additional speaker driver near the tragus to diffuse reverberation and thus allow for a more accurate spatial image with less sound pressure level on the ear axis. Further developments intending to simulate surround and multichannel loudspeaker systems have led to the design of multi-driver headphones that position multiple speaker drivers within the ear cup, employing the shape of the listener's ear and pinna to influence the filtering of high frequencies as they enter the ear canal [19]. Meanwhile, most of today's AAR and AMR headphone applications use binaural filtering with head-related transfer functions (HRTF) that enable listeners to externalize sound sources while wearing regular headphones. Theoretically, delivering accurate intelligibility, localization, and externalization of sound sources through headphones requires the binaural rendering of sound sources via individualized HRTFs transmitted through high-fidelity open-back headphones [29]. Nevertheless, according to a review of sound externalization studies, adding reverberation-related cues and/or dynamic binaural rendering that matches listeners' self-initiated head movements facilitates the localization and externalization of binaural cues [30], which may compensate for the use of non-individualized HRTFs and closed-back headphones. Whereas dynamic binaural ensures the success of AAR applications for users who move a lot in the real-world environment, such as orchestra conductors, we suggest that static binaural may be more relevant for AMR applications where most of the binaural cues are out of sight, such as recording sessions with musicians performing in separate rooms. In this view, static binaural might still provide users with better source intelligibility and a more spatial experience compared to stereo systems, since there is less masking among sources, even though the localization accuracy and externalization of binaural cues remain compromised, for example, by front-back confusions. Moreover, one study showed that "short training periods involving active learning and feedback" facilitate listeners' ability to externalize sources while using binaural systems with non-individualized HRTFs [31]. In this chapter, we present the concept of two distinct dynamic binaural AAR setups and one static binaural AMR setup that involved a short training tutorial for listeners.
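To make the distinction between static and dynamic binaural rendering more concrete, the following minimal Python sketch may help. It is an illustration only and not the rendering chain used in our case studies. The function names `toy_hrir_pair` and `render_binaural` are hypothetical, and the impulse responses are a deliberately crude spherical-head model (an interaural time difference from the Woodworth approximation plus an assumed broadband level difference of up to 6 dB, with an assumed head radius of 8.75 cm), not measured HRTFs. Because this toy model contains no pinna filtering, a source at 30° and a source at 150° produce identical ear signals, which is one way to picture the front-back confusions mentioned above.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s
HEAD_RADIUS = 0.0875    # m; assumed average head radius (not from the chapter)


def toy_hrir_pair(azimuth_deg, fs=48000, length=64):
    """Return (left, right) toy impulse responses for one source azimuth.

    Positive azimuth means the source is on the listener's right.
    Only a time delay and a level drop at the far ear are modelled;
    this is NOT a measured or individualized HRTF.
    """
    az = np.radians(azimuth_deg)
    # Fold the azimuth onto a lateral angle in [-90, 90] degrees.
    # Front and back positions collapse onto the same cues.
    lateral = np.arcsin(np.sin(az))
    # Woodworth approximation of the interaural time difference.
    itd = HEAD_RADIUS / SPEED_OF_SOUND * (abs(lateral) + np.sin(abs(lateral)))
    delay = int(round(itd * fs))
    # Assumed broadband level difference: up to 6 dB quieter at the far ear.
    far_gain = 10.0 ** (-6.0 * np.sin(abs(lateral)) / 20.0)
    near = np.zeros(length)
    near[0] = 1.0
    far = np.zeros(length)
    far[delay] = far_gain
    # Source on the right: the left ear is the far ear.
    return (far, near) if lateral >= 0 else (near, far)


def render_binaural(mono, source_az_deg, head_yaw_deg=0.0, fs=48000):
    """Convolve a mono signal with the toy impulse-response pair.

    Static rendering leaves head_yaw_deg at 0, so the source turns with
    the head. Dynamic rendering passes the tracked head yaw, so the
    source stays fixed in the room while the head rotates.
    """
    relative_az = source_az_deg - head_yaw_deg
    left_ir, right_ir = toy_hrir_pair(relative_az, fs)
    return np.stack([np.convolve(mono, left_ir), np.convolve(mono, right_ir)])


if __name__ == "__main__":
    click = np.zeros(128)
    click[0] = 1.0
    static = render_binaural(click, 30.0)                      # head tracking ignored
    dynamic = render_binaural(click, 30.0, head_yaw_deg=30.0)  # listener now faces the source
    print("static  L/R onset samples:", np.argmax(np.abs(static), axis=1))
    print("dynamic L/R onset samples:", np.argmax(np.abs(dynamic), axis=1))
```

In a practical system, the toy impulse responses would be replaced by measured HRIRs selected or interpolated for the relative azimuth and elevation, and the filters would be updated block by block as the head tracker reports new orientations.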

Besides the popularity of noise-cancelling headphones that filter out the real acoustic environment so that listeners can focus on music or other virtual elements [26], AAR and AMR microphone-hear-through devices are primarily developed for single users' experiences in non-musical applications, for example, audio gaming [32]; street navigation [33]; and soundwalks that immerse listeners in sonic art compositions [34]. Only a few collaborative AAR experiences have been tested [35], for example, a four-player interactive audio experience [36]; a two-player audio game called *eidola multiplayer* [37]; and creative artworks dedicated to multiple users, such as *Listen* for museum visits [38] or *SoundDelta*, devoted to large public outdoor events [39]. Also, to our knowledge, very few AAR musical applications besides Copper and Martin's ATH [2] have been designed. For instance, a Master's thesis showed that members of a rock band preferred performing with AAR dynamic setups compared to mono and stereo headphones [40]; a study with methodological shortcomings tested AAR dynamic in-ear monitors for members of an acoustic ensemble [41]; and the *Architexture Series* brought new music composers, sound engineers, and architects together to collaborate on site reconstruction [42, 43]. Our eight music performance case studies, therefore, contribute to AAR and AMR research by assessing two AAR setups that aim at overcoming performers' social interaction challenges when wearing headphones, and one AMR setup that aims at enhancing social interactions among performers when they are remotely located.

### **2.3 Binaural music production**

Sound engineers increasingly use binaural technology in the recording studio, in parallel with the development of new plugins and devices that enable listeners' sound externalization on headphones with and without the tracking of their head movements, for example, the binaural simulation of surround sound mixes in control rooms that do not have a 5.1 speaker system [44]. Although binaural audio is optimized for headphone listening, which is the primary music listening mode of our time, so far only a few binaural music productions have been released on the market. For instance, Williams and Reiser walked us through the binaural capture and rendering processes of sources for the production of "*GoGo Penguin [untitled]*" (Blue Note Records 2020), which was released in stereo and not yet in binaural.<sup>5</sup> They used three *Neumann KU 100* dummy heads to *overdrive* space in the main live room and the drum room, and to immerse listeners within the piano sound. At the mixing stage, they also used *dearVR* plugins to externalize specific sources. They underlined that binaural production techniques are the best fit to convey virtuosic performances of high-level musicians in contemporary jazz and classical music because the recording of their performances requires little signal processing in terms of equalization and dynamic range compression. Indeed, extensive signal processing does not work well with binaural rendering, and equalization and compression should only be used for creative purposes since there are fewer source-masking effects than in stereo [17]. We thus assessed our three binaural solutions in professional-level performance contexts whose esthetics did not require much signal processing, with five out of the eight case studies primarily involving classical and jazz musicians.

### *Binaural Headphone Monitoring to Enhance Musicians' Immersion in Performance DOI: http://dx.doi.org/10.5772/intechopen.104845*

Whereas binaural has not yet succeeded commercially as a release format, more and more European public radio stations offer binaural programs, for example, *Hyperradio* on Radio France, which primarily broadcasts audio plays and electronic music live shows. To broadcast classical orchestral recordings for *BBC Proms* on BBC Radio 3, Parnell and Pike [45] reported on using IRCAM's *Panoramix* to enhance the positioning and ambiance of the auditory scene captured with a Schoeps ORTF-3D microphone array that features two coincident layers of four microphones. Results from their audience study showed that binaural mixes were rated as "more enjoyable" by 79% of respondents, whilst 75% said that the experience was "somewhat" or "absolutely" like being there in person. These findings contrast with previous research that found that, overall, the stereo listening experience was preferred to binaural for a range of musical genres [46]. Also, the outcomes of a study about binaural mixing for hip-hop production suggest that listeners can be disoriented by this unfamiliar immersive format [47]. In particular, the main sources of the beat seem more effective when not externalized. We used this knowledge to capture and mix sound sources in the performers' binaural headphones for our eight case studies.
