**6. Performance analysis**

This section presents an objective evaluation of the performance of the proposed Bilateral Ambisonics reproduction approach, and compares it to that of the Basic Ambisonics+MagLS reproduction method.

A binaural signal for a sound-field composed of a single plane-wave of unit amplitude, as presented in **Figure 5**, is computed, and the Normalized Mean Square Error (NMSE) for the left ear is evaluated as:

$$\epsilon^L(f) = 10 \log_{10} \frac{\left| p_{\rm ref}^L(f) - p^L(f) \right|^2}{\left| p_{\rm ref}^L(f) \right|^2},\tag{17}$$

where $p_{\rm ref}^L$ is the reference high-order binaural signal computed using Eq. (3) with *N* = 40, and $p^L$ is the binaural signal computed using Eq. (3) or (14). Although the error ratio inside the logarithm is real and non-negative, the NMSE is sensitive to both the magnitude and the phase errors in the binaural signal. The NMSE is averaged over 434 plane-waves with incidence angles distributed nearly uniformly over the sphere, using the Lebedev sampling scheme of order 17 [59].

#### **Figure 8.**

*NMSE of binaural signals computed for sound-fields composed of a single plane-wave, averaged over 434 plane-wave directions (distributed according to a Lebedev grid), with HRTF of KEMAR. The NMSE is computed using Eq. (17), with Basic Ambisonics reproduction (solid lines), with Basic Ambisonics reproduction with MagLS [31] (dashed lines), and with Bilateral Ambisonics reproduction (dotted lines), with N = 1, 4, and a high-order reference with N = 40.*

#### *Binaural Reproduction Based on Bilateral Ambisonics DOI: http://dx.doi.org/10.5772/intechopen.100402*
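The phase sensitivity of Eq. (17) can be illustrated with a minimal sketch for a single frequency bin. The function name `nmse_db` and the test values are hypothetical and are not taken from the simulations reported here.

```python
import cmath
import math


def nmse_db(p_ref, p):
    """NMSE of Eq. (17) in dB for one frequency bin (complex inputs).

    Assumes p_ref is non-zero and p differs from p_ref, so that the
    logarithm is defined.
    """
    return 10.0 * math.log10(abs(p_ref - p) ** 2 / abs(p_ref) ** 2)


# Hypothetical reference value at a single frequency.
p_ref = 1.0 + 0.0j

# Case 1: 10% magnitude error, correct phase.
p_mag_err = 0.9 * p_ref

# Case 2: correct magnitude, 90 degree phase error.
p_phase_err = p_ref * cmath.exp(1j * math.pi / 2)

nmse_mag = nmse_db(p_ref, p_mag_err)      # close to -20 dB
nmse_phase = nmse_db(p_ref, p_phase_err)  # close to +3 dB
```

In this toy case a signal with a perfect magnitude but a wrong phase produces a much larger NMSE than a signal with a small magnitude error, which is the behavior the measure is chosen for.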

**Figure 8** shows this averaged NMSE. For the MagLS approach, a cutoff frequency of 2 kHz was used, as indicated by the increased error above this frequency, where the phase is completely inaccurate. The figure demonstrates the improved accuracy of the Bilateral Ambisonics reproduction compared to the Basic Ambisonics reproduction methods: at high frequencies, up to about 5 kHz for *N* = 1 and 15 kHz for *N* = 4, the errors are lower by 10–20 dB.

#### **Figure 9.**

*ITDs, ILDs and their corresponding errors as a function of azimuth angle for binaural signals computed for sound-fields composed of a single plane-wave from 180 directions on the left horizontal plane (the right side is symmetrical), with HRTF of KEMAR. The signals were computed using Basic Ambisonics reproduction with and without MagLS, and Bilateral Ambisonics reproduction.*

Two important spatial cues for sound source localization are the Interaural Time Difference (ITD) and the Interaural Level Difference (ILD). Both were shown to be affected by the truncation error due to low-order reproduction [29]. **Figure 9** shows the ITDs, ILDs and their corresponding errors relative to a high-order reference (*N* = 40). The ITDs and ILDs were computed for binaural signals of a single plane-wave sound-field with incidence angles across the left horizontal half-plane (*θ* = 90°, 0° ≤ *ϕ* ≤ 180°) with 1° resolution, and with a KEMAR HRTF. The ITDs were estimated using the onset detection method, applied to a 2 kHz low-pass filtered version of the signals [54]. The ILDs were calculated and averaged across 18 auditory filter bands as [5]:

$$\text{ILD}(f_c, \Omega) = 10 \log_{10} \frac{\int C(f, f_c) \left|p^{L}(f)\right|^{2} \, df}{\int C(f, f_c) \left|p^{R}(f)\right|^{2} \, df},\tag{18}$$

$$\text{ILD}_{av}(\Omega) = \frac{1}{18} \sum_{f_c} \text{ILD}(f_c, \Omega),\tag{19}$$

where *C* is a Gammatone filter with center frequency $f_c$, as implemented in the Auditory Toolbox [60]. The integral is evaluated between 1.5 kHz and 16 kHz, and $f_c$ is restricted accordingly. This computation facilitates a perceptually motivated smoothing of the ILD across frequencies, which is required for appropriate comparison between ILDs.
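The band averaging of Eqs. (18) and (19) can be sketched as follows. This is only an illustration under stated assumptions: the weighting `gammatone_weight` is a common analytic approximation of a fourth-order gammatone squared-magnitude response and is not the Auditory Toolbox implementation [60]; the log-spaced center frequencies, the frequency grid and the test signals are hypothetical.

```python
import math


def erb_hz(fc):
    """Equivalent rectangular bandwidth (Glasberg-Moore formula), in Hz."""
    return 24.7 * (4.37 * fc / 1000.0 + 1.0)


def gammatone_weight(f, fc, order=4):
    """Approximate squared-magnitude gammatone response at f (assumed stand-in for C)."""
    b = 1.019 * erb_hz(fc)
    return (1.0 + ((f - fc) / b) ** 2) ** (-order)


def band_ild_db(freqs, p_left, p_right, fc):
    """Eq. (18): ILD in one band; sums replace the integrals on a uniform grid."""
    num = sum(gammatone_weight(f, fc) * abs(pl) ** 2 for f, pl in zip(freqs, p_left))
    den = sum(gammatone_weight(f, fc) * abs(pr) ** 2 for f, pr in zip(freqs, p_right))
    return 10.0 * math.log10(num / den)


def average_ild_db(freqs, p_left, p_right, centre_freqs):
    """Eq. (19): mean of the band ILDs over all center frequencies."""
    ilds = [band_ild_db(freqs, p_left, p_right, fc) for fc in centre_freqs]
    return sum(ilds) / len(ilds)


# Hypothetical setup: uniform grid from 1.5 kHz to 16 kHz and 18 log-spaced bands.
freqs = [1500.0 + 50.0 * i for i in range(291)]
centre_freqs = [1500.0 * (16000.0 / 1500.0) ** (i / 17.0) for i in range(18)]

# Synthetic signals: the left ear has twice the amplitude of the right ear.
p_right = [1.0 + 0.0j] * len(freqs)
p_left = [2.0 + 0.0j] * len(freqs)

ild_av = average_ild_db(freqs, p_left, p_right, centre_freqs)  # about 6.02 dB
```

Because the frequency grid is uniform, the grid spacing cancels in the ratio of Eq. (18), so plain sums are sufficient in this sketch.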

Comparing the ITD errors with the Just Noticeable Difference (JND) values reported by Andreopoulou and Katz [54] (40 μs for the frontal directions and about 100 μs for the lateral directions) reveals the main advantage of the Bilateral Ambisonics approach: the phase information is preserved, and the ITD errors are below the JND even at *N* = 1.
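A threshold-based onset detector of the kind referred to above can be sketched as below. The −30 dB threshold, the sign convention (right onset minus left onset) and the synthetic impulse responses are assumptions made for illustration; the 2 kHz low-pass filtering is assumed to have been applied to the signals beforehand and is not implemented here.

```python
def onset_index(signal, threshold_db=-30.0):
    """Index of the first sample whose magnitude reaches the threshold relative to the peak."""
    peak = max(abs(s) for s in signal)
    if peak == 0.0:
        raise ValueError("signal is all zeros")
    level = peak * 10.0 ** (threshold_db / 20.0)
    for i, s in enumerate(signal):
        if abs(s) >= level:
            return i
    raise ValueError("no onset found")


def itd_seconds(left, right, fs, threshold_db=-30.0):
    """ITD as the difference between right and left onset times, in seconds."""
    delay = onset_index(right, threshold_db) - onset_index(left, threshold_db)
    return delay / fs


# Synthetic (already low-pass filtered) impulse responses at a hypothetical 48 kHz rate.
fs = 48000
left = [0.0] * 64
right = [0.0] * 64
left[10] = 1.0    # the sound reaches the left ear first
right[34] = 0.8   # and the right ear 24 samples later

itd = itd_seconds(left, right, fs)   # 24 / 48000 s, i.e. 500 microseconds
itd_ref = 520e-6                     # hypothetical high-order reference ITD
itd_error_us = abs(itd - itd_ref) * 1e6
below_lateral_jnd = itd_error_us < 100.0   # compare with the lateral JND of about 100 microseconds
```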

**Figure 9b** shows that both the MagLS and the Bilateral approaches achieve a significant improvement in the ILD accuracy compared to the Basic Ambisonics reproduction. While with the Basic Ambisonics reproduction the ILD errors are above the JND (~1 dB [61, 62]) even with *N* = 4, with the MagLS and the Bilateral Ambisonics reproduction the errors for *N* = 4 are below the JND for most angles. Relatively high errors can be seen at the lateral angles compared to the front and back directions. This can be explained by the fact that the ILD at the front and back directions is close to zero, where the errors are expected to be small due to the symmetry of the HRTF model. Nevertheless, both the MagLS and the Bilateral Ambisonics reproduction lead to substantially lower ILD errors compared to the Basic Ambisonics reproduction.

As discussed in Section 5, a limitation of the Bilateral Ambisonics method compared to Basic Ambisonics lies in the incorporation of head-tracking in post-processing, and a method to overcome this limitation was suggested there. To evaluate the performance of this method, a simulation study was conducted. The simulation aims to evaluate the NMSE introduced by the head rotation and its dependence on the Bilateral Ambisonics signal order and the head rotation angle.

In the simulation, a head was positioned in free-field, facing the $\hat{x}$ direction with the ears positioned on the *xy* plane. A sound-field was generated, consisting of a single plane-wave with unit amplitude arriving from directions taken from the same Lebedev sampling scheme described earlier. The Bilateral Ambisonics signal, $a^L_{nm}(k)$, is calculated with respect to the left ear position $\vec{r}_a$ up to order *N*. Note that the superscript *L* denoting the left ear is omitted for brevity from now on. The signal is then transformed to $a(k, \Omega)$ with the discrete inverse spherical Fourier transform (DISFT) [46]. Next, the head is rotated by *γ* degrees clockwise in the horizontal plane, as shown in **Figure 6b**, resulting in a new rotated left ear position $\vec{r}_b$. The translated plane-wave amplitude density function, $a^t(k, \Omega)$, is computed using Eq. (15). Next, Eq. (16) is used to calculate $a^r_{nm}(k)$ from $a^t_{nm}(k)$, the discrete spherical Fourier transform (DSFT) [46] of $a^t(k, \Omega)$. The signal $a^r_{nm}(k)$ represents the head-rotated left ear Bilateral Ambisonics signal; the right ear signal can be calculated in a similar manner. Finally, the left ear binaural signal with head-tracked Bilateral Ambisonics, $p(f)$, is calculated using Eq. (14) with $a^r_{nm}(k)$ and a KEMAR HRTF.

The reference binaural signal, $p_{\rm ref}(f)$, is calculated using Eq. (14) with an accurately generated Bilateral Ambisonics signal $a^{\rm ref}_{nm}(k)$ of order *N* at the head-rotated position. The NMSE is calculated using Eq. (17), and averaged over the 434 directions of the sampling scheme.
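The geometry of the rotation step can be sketched for a single plane-wave. The code below is an illustration under explicit assumptions, not the implementation used in the study: the axes are taken as *x* forward, *y* to the left and *z* up, the head radius of 0.0875 m is hypothetical, and the sign of the exponent in `translation_factor` depends on the conventions of Eq. (15), which is not reproduced in this section.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value


def rotate_clockwise(pos, gamma_deg):
    """Rotate a position clockwise (seen from above) about the z axis by gamma degrees."""
    g = math.radians(gamma_deg)
    x, y, z = pos
    return (x * math.cos(g) + y * math.sin(g),
            -x * math.sin(g) + y * math.cos(g),
            z)


def unit_vector(theta_deg, phi_deg):
    """Unit vector for polar angle theta and azimuth phi, both in degrees."""
    t, p = math.radians(theta_deg), math.radians(phi_deg)
    return (math.sin(t) * math.cos(p), math.sin(t) * math.sin(p), math.cos(t))


def translation_factor(freq_hz, direction, r_a, r_b):
    """Phase factor applied to a plane-wave amplitude when the origin moves from r_a to r_b.

    The positive sign of the exponent is an assumed convention.
    """
    k = 2.0 * math.pi * freq_hz / SPEED_OF_SOUND
    d = [b - a for a, b in zip(r_a, r_b)]
    return cmath.exp(1j * k * sum(u * v for u, v in zip(direction, d)))


head_radius = 0.0875                    # hypothetical head radius in metres
r_a = (0.0, head_radius, 0.0)           # left ear before the rotation
r_b = rotate_clockwise(r_a, 90.0)       # left ear after turning the head 90 degrees to the right

front = unit_vector(90.0, 0.0)          # plane-wave arriving from the front
factor = translation_factor(1000.0, front, r_a, r_b)
```

The factor has unit magnitude, so the translation only changes the phase of each plane-wave component; the phase grows with frequency and with the displacement of the ear.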

**Figure 10a** shows the NMSE between $p_{\rm ref}(f)$ and $p(f)$ for a head rotation of *γ* = 30° and different reproduction orders, *N* = 1, 4, 10. The figure demonstrates the improvement in the NMSE as the order increases, and also how the error increases with frequency. For *N* = 1, 4, 10, an error of less than −10 dB is achieved up to about 1 kHz, 5 kHz and 11 kHz, respectively. This result indicates that, for example, with order *N* = 4 and a rotation angle of 30°, the suggested rotation method will experience a noticeable loss in accuracy above 5 kHz, compared to the reference.

To evaluate the performance of the suggested method for different head rotation angles, the order was kept at *N* = 4 and various values of the head rotation angle, *γ*, were used. **Figure 10b** illustrates how the performance deteriorates as the rotation angle increases. For *γ* = 30°, 60°, 90°, an error of less than −10 dB is achieved up to about 5 kHz, 3 kHz and 2 kHz, respectively.
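Frequency limits of this kind can be read from an NMSE curve with a small helper such as the one sketched below. The function name `max_freq_below` and the sample curve are hypothetical; the numbers are synthetic and are not data from **Figure 10**.

```python
def max_freq_below(freqs_hz, nmse_db, threshold_db=-10.0):
    """Highest frequency up to which the NMSE stays below the threshold.

    Returns None if the error already reaches the threshold at the first bin.
    Frequencies are assumed to be sorted in ascending order.
    """
    limit = None
    for f, e in zip(freqs_hz, nmse_db):
        if e >= threshold_db:
            break
        limit = f
    return limit


# Synthetic NMSE curve that rises with frequency (illustrative values only).
freqs_hz = [500, 1000, 2000, 5000, 8000, 11000]
nmse_curve_db = [-40.0, -32.0, -21.0, -12.0, -6.0, -2.0]

limit_hz = max_freq_below(freqs_hz, nmse_curve_db)   # 5000 for this synthetic curve
```

The helper stops at the first crossing of the threshold, so an isolated dip below −10 dB at a higher frequency does not extend the reported limit.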

We now compare the binaural reproduction performance of head-tracked Bilateral Ambisonics, head-tracked MagLS and head-tracked Basic Ambisonics.


#### **Figure 10.**

*NMSE of binaural signals computed using Bilateral Ambisonics reproduction with head rotation as in Eq. (16), for various orders (a) and rotation angles (b).*

#### **Figure 11.**

*NMSEs of binaural signals computed using rotated Basic, MagLS and Bilateral Ambisonics signals with order N = 4 relative to a high-order Basic Ambisonics reproduction with N = 40, with various rotation angles, γ. The NMSE is averaged over 434 plane-wave directions.*

In the simulation (which is identical to the previously described simulation), the NMSE is measured for head-tracked binaural signals computed using Basic, MagLS and Bilateral Ambisonics reproductions with order *N* = 4, and compared with a high-order reference computed using Basic Ambisonics reproduction with order *N* = 40. The head-tracked Bilateral Ambisonics signals are calculated with the suggested method, using Eqs. (15) and (16), and both the head-tracked Basic Ambisonics and MagLS signals are calculated in the SH domain using Eq. (16). Note that for head-tracking with Basic Ambisonics and MagLS, the error is independent of the rotation angle, *γ*.

**Figure 11** presents the results for different head-rotation angles, *γ*. As expected, the rotation procedure compromises the accuracy of the binaural signal with Bilateral Ambisonics at high frequencies. For *γ* = 10°, the Bilateral Ambisonics reproduction retains its advantage in accuracy compared to the Basic Ambisonics reproduction up to around 20 kHz. However, for a head rotation of *γ* = 30°, the Bilateral Ambisonics reproduction retains its advantage only up to about 7 kHz. For a head rotation of *γ* = 60°, the Bilateral and Basic Ambisonics reproduction schemes are equally accurate. For a head rotation of *γ* = 90°, the Bilateral Ambisonics reproduction results in an error of less than −10 dB up to about 2 kHz, compared to 2.5 kHz for Basic Ambisonics. Similar behavior was also observed for other reproduction orders. These results indicate that, in this case, the suggested rotation method is mainly beneficial for head rotations up to 60°. Note that 60° means that the listener can turn their head 60° both to the left and to the right.

The inaccuracies depicted in **Figures 10** and **11**, which depend on the reproduction order *N* and the head-rotation angle *γ*, can be explained by errors due to the translation operation described in Eq. (15) [46, 57].

Further evaluation of head-tracking compensation is the subject of ongoing research. Such a study could include evaluation of ITD/ILD reconstruction, lateral error, polar error in the median plane, coloration error [26], and subjective listening tests.
