**2.3 Additional analysis methods**

## *2.3.1 Decomposing the analysis across sphere regions*

Several studies have shown variations in localisation accuracy as a function of region on the sphere due to, amongst other things, cue interpretation [3] or reporting method [32]. In these cases, decomposition schemes were used to better characterise those variations and understand their origins. As mentioned in Section 2.1.5, Van Wanrooij and Van Opstal [25] for example decomposed the analysis of elevation gain across azimuthal regions. Later, Majdak et al. [14] proposed an analysis split into hemi-fields to detect higher accuracy variations for targets in the rear region. Middlebrooks [12] applied a similar spatial decomposition to detect high variability for responses in the upper-rear quadrant, temporarily excluding them from the analysis to better assess variations in remaining regions. The principal drawback of decomposition is that it reduces the statistical power of the analysis, and can result in unbalanced data sets if responses are not evenly spread across the regions under consideration.
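A minimal sketch of such a decomposition is given below. It is illustrative only and is not the procedure used in the cited studies. It assumes azimuth and elevation in degrees, with azimuth 0° straight ahead and in the range (−180°, 180°], and it uses a hypothetical four-region labelling (upper/lower by front/rear). The per-region trial count `n` is reported alongside the mean great-circle error, so that unbalanced regions are visible immediately.

```python
import math
from collections import defaultdict


def great_circle_error(az_t, el_t, az_r, el_r):
    """Angle in degrees between the target and response directions."""
    az_t, el_t, az_r, el_r = map(math.radians, (az_t, el_t, az_r, el_r))
    cos_angle = (math.sin(el_t) * math.sin(el_r)
                 + math.cos(el_t) * math.cos(el_r) * math.cos(az_t - az_r))
    # Clamp to [-1, 1] to guard against rounding before calling acos.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))


def region_of(az, el):
    """Assumed labelling: front/rear from azimuth, upper/lower from elevation."""
    depth = "front" if abs(az) <= 90 else "rear"
    height = "upper" if el >= 0 else "lower"
    return f"{height}-{depth}"


def errors_by_region(trials):
    """Group trials by target region and summarise each group.

    trials: iterable of (target_az, target_el, response_az, response_el).
    """
    grouped = defaultdict(list)
    for az_t, el_t, az_r, el_r in trials:
        grouped[region_of(az_t, el_t)].append(
            great_circle_error(az_t, el_t, az_r, el_r))
    return {region: {"n": len(errs), "mean_error": sum(errs) / len(errs)}
            for region, errs in grouped.items()}


# Made-up trials, for illustration only.
trials = [
    (0, 0, 10, 0),
    (30, 20, 30, 40),
    (150, 30, 30, 30),      # rear target reported in the front
    (-120, -10, -120, 10),
]
summary = errors_by_region(trials)
for region, stats in sorted(summary.items()):
    print(region, stats["n"], round(stats["mean_error"], 1))
```

With only four trials, two regions hold a single response each and one region holds none, which is the imbalance problem noted above in miniature.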

## *2.3.2 Performance evolution modelling and analysis*

For the evaluation of HRTF learning, it is essential to assess the progression of participant performance over multiple sessions. On the assumption that any adaptation to an HRTF is a process with diminishing returns with repeated training sessions, localisation performances may be modelled as an exponential decay $y = y_0 \exp(-t/\tau) + c$ [15, 31]. Here $y_0$ is the initial performance, $t$ is the time (training day, session, *etc.*), $\tau$ is the improvement time constant, and $c$ is the long-term performance. This model of performance over time allows for comparisons between studies, such as determining if different protocols lead to faster learning rates or if better long-term performance can be achieved. If the training duration proves insufficient to reach a performance plateau/asymptote, like that seen in Stitt et al. [10], the improvement data may be better modelled using the linear form $ax + b$ [9, 31]. In addition to performance modelling, the correlation between training duration and performance metrics has been used to determine if factors other than training duration, like participant attention, should be considered to explain performance evolution [33].
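The sketch below shows one way to fit both models to per-session error values. It is not the fitting procedure of the cited studies: in place of a general nonlinear least-squares routine, it scans a grid of candidate time constants and, for each one, solves for the two remaining parameters by ordinary least squares, which works because the exponential model is linear in them once the time constant is fixed. The function names, the grid, and the synthetic data are all assumptions made for illustration. In this parameterisation the model value at the first session (time zero) is `y0 + c`.

```python
import math


def linear_fit(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def sse(y, y_hat):
    """Sum of squared residuals."""
    return sum((a - b) ** 2 for a, b in zip(y, y_hat))


def fit_exponential(t, y, taus):
    """Fit y = y0 * exp(-t / tau) + c by scanning candidate tau values.

    For a fixed tau the model is linear in (y0, c), so each candidate
    is solved exactly by least squares; the tau with the lowest
    residual error is kept.
    """
    best = None
    for tau in taus:
        basis = [math.exp(-ti / tau) for ti in t]
        y0, c = linear_fit(basis, y)
        err = sse(y, [y0 * b + c for b in basis])
        if best is None or err < best["sse"]:
            best = {"y0": y0, "tau": tau, "c": c, "sse": err}
    return best


# Synthetic, noise-free errors (degrees) over ten sessions; illustration only.
sessions = list(range(10))
errors = [30.0 * math.exp(-s / 3.0) + 15.0 for s in sessions]

taus = [k / 10 for k in range(5, 105)]          # candidate tau: 0.5 to 10.4
exp_fit = fit_exponential(sessions, errors, taus)

a, b = linear_fit(sessions, errors)             # linear alternative a*x + b
linear_sse = sse(errors, [a * s + b for s in sessions])
```

Comparing `exp_fit["sse"]` with `linear_sse` gives a first indication of which form describes the data better. With real, noisy data a finer grid or a dedicated nonlinear solver would likely be preferable, and the resolution of the estimated time constant is limited by the grid spacing.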

Analysis of performance evolution can be performed per condition (grouping participants) [8, 10] or per participant [23]. Per-participant evaluation makes it harder to draw general conclusions, but potentially provides deeper insight into performance, as not all participants exhibit the same ability to adapt to a new HRTF [24]. This adaptation capacity appears to be a function of initial HRTF affinity or "perceptual quality" [10]. For inter-study comparisons, some form of performance scaling or normalisation may first be required to compensate for such affinities, highlighting performance improvement rather than absolute value [10].
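As one possible illustration of such a normalisation, the sketch below expresses each participant's error as a fractional reduction relative to their own first session. This particular scaling is an assumption chosen for simplicity, not the scheme used in [10], and the participant labels and values are made up.

```python
def relative_improvement(errors):
    """Fractional error reduction relative to the first session.

    0.0 means no change from the first session; 0.5 means the error
    has halved. Assumes lower values indicate better performance.
    """
    baseline = errors[0]
    if baseline == 0:
        raise ValueError("baseline error is zero; relative change is undefined")
    return [(baseline - e) / baseline for e in errors]


# Hypothetical per-session mean errors (degrees) for two participants.
participants = {
    "P1": [40.0, 30.0, 20.0],   # poor initial affinity with the HRTF
    "P2": [20.0, 15.0, 10.0],   # good initial affinity with the HRTF
}
normalised = {pid: relative_improvement(errs)
              for pid, errs in participants.items()}
```

Although the two hypothetical participants differ by a factor of two in absolute error, their normalised curves coincide, which is the sense in which scaling highlights improvement rather than absolute value.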

### **3. Methodology for assessing localisation performance**

From the literature review in the previous section, a methodology is derived for assessing binaural localisation accuracy. Though it was designed with a focus on HRTF training programs, it should be applicable to any HRTF-related study interested in localisation performance assessment. Section 3.1 introduces the conventions and metrics used in the methodology, itself detailed in Section 3.2. The metrics
