Short-Latency Evoked Potentials of the Human Auditory System

*Gijsbert van Zanten, Huib Versnel, Nathan van der Stoep, Wiepke Koopmans and Alex Hoetink*

## **Abstract**

Auditory Brainstem Responses (ABR) are short-latency electric potentials from the auditory nervous system that can be evoked by presenting transient acoustic stimuli to the ear. Sources of the ABR are the auditory nerve and brainstem auditory nuclei. Clinical application of ABRs includes identification of the site of lesion in retrocochlear hearing loss, establishing functional integrity of the auditory nerve, and objective audiometry. Recording of ABR requires a measurement setup with a high-quality amplifier with adequate filtering and low skin-electrode impedance to reduce non-physiological interference. Furthermore, signal averaging and artifact rejection are essential tools for obtaining a good signal-to-noise ratio. Comparing latencies for different peaks at different stimulus intensities allows the determination of hearing threshold, location of the site of lesion, and establishment of neural integrity. Audiological assessment of infants who are referred after failing hearing screening relies on accurate estimation of hearing thresholds. Frequency-specific ABR using tone-burst stimuli is a clinically feasible method for this. Appropriate correction factors should be applied to estimate the hearing threshold from the ABR threshold. Whenever possible, obtained thresholds should be confirmed with behavioral testing. The Binaural Interaction Component of the ABR provides important information regarding binaural processing in the brainstem.

**Keywords:** auditory evoked potential, auditory brainstem response, ABR, click evoked ABR, frequency-specific ABR, objective audiometry

## **1. Introduction**

Auditory Evoked Potentials (AEP) are electric potentials from the auditory nervous system that can be evoked by presenting abrupt acoustic stimuli to the ear. Registration of the electric potential as a function of time after stimulus presentation shows a reproducible pattern of waves that occur at specific time points after stimulus onset. The time between stimulus onset and occurrence of an extreme value of a wave is called latency. As can be appreciated from **Figure 1**, responses span a time window of several orders of magnitude, ranging from several milliseconds to a second. This wide range can be divided into three time-windows reflecting different latency ranges. Registrations within these time-windows are generally called Auditory Brainstem Response (ABR) for short time-windows up to 8 ms, Middle Latency Auditory Evoked Potentials (MLAEP) from 8 ms up to approximately 40 ms, and Long Latency Auditory Evoked Potentials (LLAEP) for time-windows of 40 ms and longer. In this chapter we will focus on the short-latency ABR.

#### **Figure 1.**

*Impression of registration of an auditory evoked potential. The abscissa shows latency in ms after stimulus onset on a logarithmic scale. The ordinate shows the amplitude of the electric potential in μV.*

**Figure 2** shows the results of a PubMed search with the terms "auditory" and "potential" and "brain stem" and "human" (the latter both in text and as a MeSH term). It can be appreciated that the first paper mentioning "auditory potential" was published in 1948, but it was not until the early 1970s that the subject generated a substantial number of publications per year. In the early 1970s, Jewett and Williston [1] introduced labeling of vertex-derived positive extremes of the ABR waves with Roman numerals. They also established that these waves are far-field potentials from subcortical structures, providing indirect evidence that wave I is volume-conducted from the eighth cranial nerve. Furthermore, they concluded that "waves I through VI have sufficient reliability to be worthy of establishing clinical and experimental norms". This makes them, and particularly wave V, suitable for objective audiometry based on wave occurrence and latency. Picton et al. [2] extended ABR nomenclature by introducing the prime for the vertex-negative extreme following a positive extreme. Thus V′ identifies the vertex-negative extreme following vertex-positive extreme V. In this chapter, we will refer to the vertex-positive extremes as peaks. The first intracranial recordings in humans were, to our knowledge, reported in [3, 4]. In the first study, potentials were recorded from the intracranial part of the auditory nerve in patients undergoing operations for cranial nerve disorders. The results indicated that the auditory nerve gives rise to the first two peaks in the scalp-recorded ABR and not only to the first peak. The latter study concluded, on the basis of in-depth recordings during brain surgery, that waves II and III are primarily generated within the pons, with possible contributions from the auditory nerve. Waves IV and VI originate from the pons and the medial geniculate body, respectively. In Section 2 we will discuss the sources of the ABR more extensively.

#### **Figure 2.**

*Number of publications with search terms "auditory" and "potential\*" and "brainstem" (solid line) and "auditory" and "potential\*" and "brainstem" and "audiometry" (dashed line). The term "Human" was used as a search term both in full text and as a MeSH term.*

Clinical application of ABRs includes identification of the site of lesion in retrocochlear hearing loss, establishing functional integrity of the auditory nerve, and objective audiometry. With the advent of Magnetic Resonance Imaging (MRI) for the detection of acoustic neuroma, the clinical use of ABR for this purpose has declined. ABR remains an important tool, however, for establishing neural functional integrity in cases of suspected auditory neuropathy and for objective audiometry in newborns. Section 3 will give an overview of all aspects of clinical ABR measurements.

Many countries have established Universal Newborn Hearing Screening Programs for the identification of children with permanent congenital hearing loss. Outcomes of these programs include a lower age of identification, lower age of provision of amplification, and better speech production and perception [5]. Infants who do not pass newborn hearing screening are referred for diagnostic audiological assessment to determine the degree and type of hearing loss, and hearing loss configuration. Hearing thresholds in newborns are typically estimated by using ABR for objective audiometry because behavioral techniques such as Visual Reinforcement Audiometry (VRA) or Conditioned Play Audiometry (CPA) are not feasible at a very young age. Another application of ABR is the detection of ototoxicity in young children who are treated with cisplatin for cancer or (concomitantly) with aminoglycoside or glycopeptide antibiotics for infections. Section 4 will discuss the application of frequency-specific stimuli for objective audiometry in these patient groups. Finally, in Section 5 we will discuss an example of the application of binaural ABR measurements as an objective measure of directional hearing ability.

## **2. Neural sources underlying the ABR**

The structures that contribute with their stimulus-evoked electrical activity to the ABR are the auditory nerve, cochlear nucleus, superior olivary complex, and the lateral lemniscus. These structures will be briefly described with respect to their physiological responses and function.

Comprehensive overviews are provided, for instance, in [6]. Since the ABR is often used, both in the clinic and in animal experiments, to assess hearing loss caused by damage in the cochlea, the cochlea is included in the description as well.

### **2.1 Description of pathway**

Sound reaches the cochlea via the outer ear canal, tympanic membrane, and middle-ear ossicles. The sensory organ in the cochlea, known as the organ of Corti, is located on the basilar membrane, which stretches from the base near the footplate of the stapes to the apex. Due to gradients of its mechanical properties from base to apex the basilar membrane functions as a frequency filter bank and it is tonotopically organized: it maximally vibrates to high frequencies of the sound at the base and to low frequencies towards the apex, and each place along the basilar membrane corresponds to a frequency it is most sensitive to, a characteristic frequency (CF). Vibrations start at the base and travel towards the apex, a phenomenon known as the traveling wave. Consequently, cochlear responses occur faster after stimulus onset to high frequencies than to low frequencies.

In the organ of Corti, two types of sensory hair cells are distinguished: inner and outer hair cells (IHCs and OHCs, respectively), which are arranged in four rows (one row of IHCs and three rows of OHCs) and which differ distinctly in function. The IHCs act as mechano-electrical transducers passing the acoustical information on to the nerve, and the OHCs act as amplifiers, increasing detection sensitivity by 40–50 dB and increasing frequency selectivity. In both types of hair cells, acoustical vibrations are converted to electrical potentials. In IHCs these receptor potentials trigger action potentials in the nerve. For that purpose, each IHC is innervated by 10–20 afferent auditory nerve fibers, which are myelinated and which systematically vary in spontaneous rate (SR) and in the threshold at their CF [7], the latter allowing for a wide dynamic range to be encoded. In the OHCs, the receptor potentials trigger the cells to contract and expand, and this motility is thought to amplify the basilar membrane vibrations, in particular at low sound levels. Irrespective of the mechanisms, OHC loss leads to a threshold shift of 40–50 dB and deterioration of frequency tuning. Each OHC is innervated by a single unmyelinated afferent fiber, and it shares this fiber with several other OHCs. These fibers have very high thresholds (>90 dB SPL). The great majority of the afferent auditory nerve fibers (~95%) receive input from the IHCs. An auditory nerve of a young normal-hearing subject contains about 35,000 fibers.

Action potentials that are generated at the IHC synapse are propagated along the auditory nerve to the cochlear nucleus (CN). The nerve branches to three divisions of the nucleus: anterior ventral cochlear nucleus (AVCN), posterior ventral cochlear nucleus (PVCN), and dorsal cochlear nucleus (DCN). The AVCN contains for the most part bushy cells, whose responsiveness is similar to that of the auditory nerve fibers. Their onset response latencies are ~0.6 ms longer than those of the nerve fibers [8]. Notably, the timing of their action potentials is more precise than that of the auditory nerve, i.e., when stimuli are presented repetitively, the action potentials have a very similar latency. The PVCN contains, among other cell types, multipolar cells which show so-called chopper responses with longer latencies than the bushy cells. The frequency tuning in AVCN and PVCN is similar to that in the auditory nerve. The DCN has a complex circuitry of various cell types including inhibitory interneurons. Consequently, many DCN neurons show frequency tuning that is characterized by excitatory responses to limited frequency-sound level combinations, and inhibitory responses to a wide range of frequencies and levels.

#### *Short-Latency Evoked Potentials of the Human Auditory System DOI: http://dx.doi.org/10.5772/intechopen.102039*

The bushy cells in the AVCN project to the superior olivary complex (SOC), which is the first station along the auditory pathway to combine input from both ears [9]. Specifically, the spherical bushy cells send their precise phase-locked action potentials to both ipsi- and contralateral medial superior olive (MSO) and to the ipsilateral lateral superior olive (LSO); globular bushy cells project to the contralateral medial nucleus of the trapezoid body (MNTB) from where inhibitory input is delivered to the LSO. Receiving well-timed input from both ears, neurons in the MSO are tuned to interaural time differences (ITD), and receiving ipsilateral excitatory and contralateral inhibitory input LSO neurons are sensitive to interaural level differences (ILD).

The next station in the auditory brainstem is the lateral lemniscus (LL), which can broadly be divided into a ventral nucleus (VNLL) processing monaural information and a dorsal nucleus (DNLL) processing binaural information. The VNLL receives input from the contralateral CN, and the DNLL receives input from ipsilateral MSO and bilateral LSO.

Monaural and binaural pathways from each of the above-described brainstem nuclei converge in the inferior colliculus (IC). This convergence allows the IC to process several auditory features including basic spectrotemporal features [10] and 2-dimensional spatial information [11].

### **2.2 Contribution of various nuclei to ABR**

The ABR waveform is commonly described as consisting of five peaks. Peaks III and V typically dominate peaks II and IV, respectively, and are the ones best observed in daily practice in a clinic or laboratory. Peak I appears more prominently in animals than in humans, where it fades faster with decreasing stimulus level than peaks III and V (see also Section 3.6). Electrical activity from the auditory nerve and brainstem nuclei contributes to the ABR. A first-order approach to understanding which neural population corresponds to which peak is to consider the sequence of nuclei in the pathway. The inter-peak interval of approximately 1.0 ms agrees with the axonal conduction time and synaptic delay between the generation of action potentials at two successive neurons. Indeed, as summarized by [12] for the human ABR, partly based on intraoperative recordings, peak I reflects the activity of the auditory nerve, peak III that of the CN, peak IV the SOC, and peak V the LL. Peak II is generated by the central part of the auditory nerve, likely where it branches to the three CN divisions. In smaller mammals used in auditory research, like gerbils, mice, and guinea pigs, four rather than five peaks are distinguished, with peak IV being analogous to peak V of the human ABR [12]. A fifth peak would then reflect responses in IC, as indicated by its correspondence to IC evoked potentials in mice [13]. Based on a series of careful lesion and modeling studies of click-evoked ABRs in cats, [14] linked peak I to the auditory nerve, II to the globular bushy cells in AVCN, III to spherical bushy cells and cells driven by globular cells, IV to MSO principal cells, and V to cells driven by MSO principal cells.

As a second-order approach, one should consider that the early stations, besides contributing to early peaks, can also contribute to later peaks. We consider the ABR evoked by the most commonly used stimulus, a broadband click. As a consequence of the traveling wave mechanics, the click response latency in the auditory nerve is shortest for high-CF neurons and increases with decreasing CF [15], which leads one to conclude that high-CF fibers contribute to wave I [14]. The low-CF fibers, with longer latencies and multi-peaked responses (with inter-spike intervals of 1/CF, e.g., 2 ms for a CF of 500 Hz), may therefore contribute to later waves. In particular for high click levels, the high-CF fibers show second firings about 1 ms after the first action potential, an interval that is related to the neural refractoriness [16] and, notably, similar to the ABR inter-peak interval. The same notion applies to the CN bushy cells, i.e., those with lower CFs have longer click latencies and may contribute to later peaks.

### **2.3 Summing contributions from the various sources**

The following factors determine the extent to which a neural population contributes to the ABR: the number of responding neurons, the discharge probabilities of the individual neurons, the discharge latencies, the synchronization of discharges between neurons, the synchronization of the individual neuron, and the unit response (UR). How action potentials of a neural population shape an ABR wave is illustrated in **Figure 3** by the compound action potential (CAP), which reflects the auditory nerve response, thus analogous to wave I of the ABR. The CAP is mathematically described as the convolution of the compound discharge latency distribution (CDLD) and the UR, a concept introduced by [18]. An example of a CAP with corresponding CDLD and UR is shown in **Figure 3**, along with the convolution equation.
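The convolution relation can be made concrete with a small numerical sketch. This is only an illustration and not the authors' implementation: it assumes the form CAP(t) = N ∫ CDLD(τ) UR(t − τ) dτ with a CDLD normalized to unit area, and the Gaussian CDLD, the damped-sine UR, and all parameter values are hypothetical choices made for demonstration.

```python
import math

DT = 0.01  # time step in ms (assumed)


def gaussian_cdld(n_samples, latency_ms, sd_ms):
    """Hypothetical CDLD: Gaussian latency distribution with unit area."""
    vals = [math.exp(-0.5 * ((i * DT - latency_ms) / sd_ms) ** 2)
            for i in range(n_samples)]
    area = sum(vals) * DT
    return [v / area for v in vals]


def unit_response(n_samples, freq_khz=1.0, decay_ms=0.5):
    """Hypothetical UR: negative-first damped sine, arbitrary units."""
    return [-math.sin(2 * math.pi * freq_khz * i * DT) * math.exp(-i * DT / decay_ms)
            for i in range(n_samples)]


def convolve(a, b):
    """Plain discrete convolution of two sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def cap(n_fibers, cdld, ur):
    """Discretized CAP(t) = N * integral of CDLD(tau) * UR(t - tau) dtau."""
    return [n_fibers * DT * v for v in convolve(cdld, ur)]


cdld = gaussian_cdld(500, latency_ms=1.5, sd_ms=0.15)
ur = unit_response(300)
cap_1000 = cap(1000, cdld, ur)
cap_2000 = cap(2000, cdld, ur)

# N1 is taken here as the most negative point of the simulated CAP.
n1_index = min(range(len(cap_1000)), key=cap_1000.__getitem__)
n1_latency_ms = n1_index * DT
print(f"N1 latency: {n1_latency_ms:.2f} ms")
print(f"N1 amplitude ratio (2000 vs 1000 fibers): {cap_2000[n1_index] / cap_1000[n1_index]:.1f}")
```

In this sketch the waveform shape is set entirely by the CDLD and the UR, while the number of fibers only scales the amplitude, which is the proportionality the text relies on when relating CAP amplitude to the number of responding neurons.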

#### **Figure 3.**

*Example of CAP and corresponding CDLD, which is constructed based on the CAP and depicted UR using the convolution equation. The UR is modeled after experimental guinea pig data [17], and the PSTH is an example of a recorded single-fiber response to 256 presentations of a monophasic condensation click of 100 μs.*

The CDLD is the sum of the discharge probabilities of all responding auditory nerve fibers, which are typically recorded by poststimulus time histograms (PSTHs) acquired by presenting the stimulus a few hundred times. The discharge probability is the ratio of the number of discharges to the number of stimuli. Synchronization is high when the latency is very similar across stimulus presentations, resulting in a peaky PSTH, and low when the discharges are spread out. The click-evoked PSTH in **Figure 3** has a latency of about 2.0 ms with some discharges at 1.8 ms and some at 2.3 ms; the second peak reflects second discharges of the neuron. The CDLD will be relatively narrow when the PSTHs of the responding neurons have the same latency, and broad when the latencies vary among neurons. The latter applies to the auditory nerve, since fibers with a low SR typically have longer latencies than fibers with a high SR [15].

The UR is the potential at the recording electrode that results from a single action potential. Obviously, it determines both the size and shape of the AEP waveform, and it depends mostly on the distance of the electrode from the neural population. Generally, the UR depends on the specific electrode configuration and on the tissue between electrode and neurons, which for electrodes on the skin includes the skull. It is the factor that is most difficult to assess; for the CAP, it has been assessed by recording the potential at the CAP-recording site around the occurrence of an action potential [17, 19, 20]. Each neuron may have its own UR depending on the neuron's location and morphometry. For the auditory nerve it can be assumed that the UR does not vary significantly with CF and SR [17], an assumption that generally works well when using the UR to predict CAPs [21–23]. The neural populations in the brainstem, however, will have URs that vary greatly between nuclei [14].
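To make the PSTH and the discharge probability described above concrete, the following minimal sketch bins spike latencies and divides the counts by the number of stimulus presentations. The spike times are invented for illustration (loosely mimicking a main peak near 2.0 ms with some earlier, later, and second discharges); they are not data from the chapter. The function name `psth` and the bin width are likewise assumptions.

```python
def psth(spike_times_us, n_presentations, bin_us=100, window_us=5000):
    """Return per-bin discharge probability (spikes per stimulus presentation).

    Times are integers in microseconds to avoid floating-point binning errors.
    """
    n_bins = window_us // bin_us
    counts = [0] * n_bins
    for t in spike_times_us:
        idx = t // bin_us
        if 0 <= idx < n_bins:
            counts[idx] += 1
    return [c / n_presentations for c in counts]


# Hypothetical single-fiber recording: 256 click presentations.
N_PRESENTATIONS = 256
spikes_us = [1800] * 20 + [2000] * 150 + [2300] * 30 + [3000] * 25

hist = psth(spikes_us, N_PRESENTATIONS)
peak_bin = max(range(len(hist)), key=hist.__getitem__)
peak_latency_ms = peak_bin * 100 / 1000
total_probability = sum(hist)  # all discharges divided by number of stimuli

print(f"peak latency: {peak_latency_ms} ms")
print(f"discharge probability in peak bin: {hist[peak_bin]:.3f}")
print(f"total discharges per stimulus: {total_probability:.3f}")
```

A sharply peaked histogram (most of the probability in one or two bins) corresponds to high synchronization; summing such histograms over all responding fibers gives the CDLD.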

As an approximation, the CAP amplitude is proportional to the number of responding neurons (N in equation in **Figure 3**). **Figure 4** shows amplitudes of CAPs evoked by an electrical current pulse as a function of the number of auditory nerve fibers in guinea pigs.

#### **Figure 4.**

*Amplitudes of CAPs to electrical pulse stimulation (eCAPs) as a function of packing density of spiral ganglion cells. Data were acquired in 97 guinea pigs that were normal-hearing or ototoxically deafened with varying duration of deafness (2–14 weeks). Electrical pulses used were biphasic pulses with a phase duration of 50 μs and inter-phase gap of 30 μs and alternating polarity. Current levels are maximal, i.e., at or near saturation. The packing density reflects the number of surviving neurons. For methodological details see Ramekers et al. [24].*

Most of these guinea pigs had been deafened and, consequently, the number of neurons, quantified by the packing density of the cell bodies in Rosenthal's canal at different durations of deafness, varied widely [25]. With an electrical stimulus, synchronization is expected to be high and the great majority of surviving neurons are expected to respond, creating an ideal condition to test the convolution approximation. Indeed, the CAP amplitude increases significantly with the neural packing density; however, the amplitude varies enormously among guinea pigs, and packing density explains only 36% of the variance. This outcome confirms that the number of responding neurons is an important factor, but at the same time it underscores the unreliability of amplitude as a measure of auditory evoked potentials, including the ABR.

How do responses with different latencies add up? To address that question, the CAP again provides a good illustration, as shown in **Figure 5**.

The example shows two CAP contributions, with an amplitude ratio second/first of 0.25, and a CDLD latency difference of 0.6 ms (left column) and 0.4 ms (right column). This difference of only 0.2 ms has enormous consequences for the resulting waveforms. The left waveform shows two distinct waves (N1, P1, N2, P2), but the right waveform shows a merged P1-P2 while the N2 has vanished. It illustrates a frequently occurring phenomenon in ABRs: waves appear as merged components, so that the classical five waves are not all visible.
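The effect of the latency difference can be explored with a short sketch. It does not reproduce **Figure 5**: the biphasic derivative-of-Gaussian waveform and its width `SIGMA` are assumptions chosen for illustration, and only the amplitude ratio (0.25) and the two latency differences (0.6 and 0.4 ms) are taken from the text.

```python
import math

DT = 0.01     # time step in ms (assumed)
SIGMA = 0.15  # width of the hypothetical waveform in ms (assumed)


def biphasic(t_ms, zero_crossing_ms):
    """Hypothetical contribution: negative lobe, then positive lobe."""
    x = (t_ms - zero_crossing_ms) / SIGMA
    return x * math.exp(-0.5 * x * x)


def summed_waveform(delay_ms, ratio=0.25, t0_ms=1.0, n_samples=400):
    """First contribution plus a smaller, delayed copy of it."""
    return [biphasic(i * DT, t0_ms) + ratio * biphasic(i * DT, t0_ms + delay_ms)
            for i in range(n_samples)]


def p1_and_following_trough(wave):
    """Return the P1 amplitude (global maximum) and the lowest value after it."""
    p1_index = max(range(len(wave)), key=wave.__getitem__)
    return wave[p1_index], min(wave[p1_index:])


for delay in (0.6, 0.4):
    p1, trough = p1_and_following_trough(summed_waveform(delay))
    label = "distinct N2" if trough < -0.1 * p1 else "no N2 below baseline"
    print(f"latency difference {delay} ms: trough/P1 = {trough / p1:+.2f} ({label})")
```

With these assumed parameters, the 0.6 ms case yields a clear negative trough after P1, whereas in the 0.4 ms case the waveform does not return below baseline after P1, so the second contribution shows up only as a shoulder on the first. The exact delay at which the components merge depends on the assumed waveform width.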

The URs of the various brainstem nuclei are crucial for how the potentials add up. As the URs depend on recording sites, the effect of changing electrode sites is demonstrated in **Figure 6** showing click-evoked ABRs in a normal-hearing guinea pig, first with skin needles as electrodes, second with screws implanted in the skull as electrodes. For the different click levels, the waveforms show clear differences.

### **2.4 Effect of hearing loss**

ABR waveforms vary with degree and type of hearing loss. We discuss two common pathologies with respect to their consequences for the click-evoked ABR: OHC loss in basal cochlear regions, and synaptopathy.

#### **Figure 5.**

*Illustration of summing of two waveforms with varying latencies. In the left column, the latency difference between the first and second contribution is 0.6 ms, and in the right column, the latency difference is 0.4 ms. The size of the contributions is unchanged. The resulting waveforms (bottom row) differ greatly in that the left one shows a clear second peak, whereas the right one shows only one peak. The waveforms here show CAPs, but the principle applies to ABRs as well.*

#### **Figure 6.**

*Click-evoked ABRs recorded from normal-hearing guinea pigs. Clicks consisted of monophasic pulses of 20 μs with alternating polarity, presented at a rate of 10.1/s. The levels indicate dB attenuation relative to ~110 dB pe SPL. Subcutaneous needle electrode configuration: Active electrode behind the ipsilateral pinna, reference electrode on the skull, rostral to the brain, and ground electrode in the hind limb. Transcranial screw electrode configuration: active electrode 1 cm posterior to bregma, and the reference electrode 2 cm anterior to bregma; as ground electrode a subcutaneous needle electrode in the hind limb was used. For methodological details see [24].*

OHC loss in basal cochlear regions, caused for instance by ototoxic medication, noise trauma, aging, or any combination of these, leads to high-frequency hearing loss and to degradation of frequency tuning, both of which have consequences for click-evoked responses of the auditory nerve. First, the latency increase with decreasing click level will be larger than normal, since at the lower levels the neurons from apical regions, which have late responses because of the traveling wave delay, will dominate the contributions to the ABR. Second, the difference in auditory nerve responses between rarefaction and condensation clicks (see Section 3.4 on stimulus polarity), which is negligible in normal ears, will increase, in particular with respect to latency. Basal neurons in regions of OHC loss show decreased sensitivity for high frequencies and increased sensitivity for low frequencies [26], which can be characterized as double frequency tuning, leading to click responses with short latencies typical for high-CF click responses and latency differences between rarefaction and condensation clicks reminiscent of low-CF responses [27]. While this polarity asymmetry occurs at high click levels, at low levels the dominating low-CF responses will cause a latency difference in responses between the rarefaction and condensation polarity. Third, shallow frequency tuning may lead to increased synchronization [27], which can be explained by considering the click response as an impulse response of which the frequency tuning is the Fourier transform.
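The third point can be illustrated with a worked example. As a simplifying assumption that is not taken from the cited studies, suppose the tuning of a fiber around its center frequency $f_c$ is approximated by a Gaussian magnitude response with bandwidth parameter $\sigma_f$. Its impulse response then has a Gaussian envelope whose duration $\sigma_t$ is inversely proportional to the bandwidth:

$$
|H(f)| = \exp\!\left(-\frac{(f-f_c)^2}{2\sigma_f^2}\right)\ (f>0)
\quad\Longleftrightarrow\quad
h(t) \propto \exp\!\left(-2\pi^2\sigma_f^2 t^2\right)\cos(2\pi f_c t),
\qquad
\sigma_t = \frac{1}{2\pi\sigma_f}.
$$

Under this assumption, $\sigma_f = 200$ Hz gives $\sigma_t \approx 0.80$ ms, whereas a broader tuning of $\sigma_f = 400$ Hz gives $\sigma_t \approx 0.40$ ms. Broader (shallower) tuning thus confines the click response to a shorter time window, which is consistent with discharges being more tightly synchronized.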

In animals, it has been demonstrated that aging leads to loss of neurons because of damage to the IHC synapses while the IHC itself remains functional [7]. Noise exposure, even when it does not lead to IHC loss, augments this cochlear synaptopathy. The amplitude of wave I of the ABR has been found to be strongly correlated with the survival of IHC synapses in mice [28], reminiscent of the correlation between eCAP amplitude and neural survival in **Figure 4**. In humans, neural degeneration also occurs with increasing age, and speech perception has been shown to be affected by the neural loss as quantified in a post-mortem histological analysis [29]. The low-SR neurons, which have high thresholds, are especially vulnerable to synaptopathy, and therefore the ratio of wave I amplitudes at high and low stimulus levels is regarded as a measure of synaptopathy. Carcagno and Plack [30] underscored the use of this ABR measure, as they found a decrease in the wave I ratio with age. Alternatively, the ratio of wave I and wave V amplitudes is sometimes used.
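The two amplitude ratios mentioned above are simple to compute once peak amplitudes have been measured. The sketch below is illustrative only: the amplitude values are invented, the dictionary keys are hypothetical names, and the direction of the level ratio (high level divided by low level) is an assumption, since the text does not specify it.

```python
def amplitude_ratio(numerator_uv, denominator_uv):
    """Ratio of two peak amplitudes given in microvolts."""
    if denominator_uv <= 0:
        raise ValueError("denominator amplitude must be positive")
    return numerator_uv / denominator_uv


def wave_i_level_ratio(amps):
    """Wave I amplitude at the high level relative to the low level (assumed direction)."""
    return amplitude_ratio(amps["wave_I_high"], amps["wave_I_low"])


def wave_i_to_v_ratio(amps):
    """Wave I amplitude relative to wave V amplitude at the high level."""
    return amplitude_ratio(amps["wave_I_high"], amps["wave_V_high"])


# Invented amplitudes in microvolts, for illustration only (not measured data).
ear_a = {"wave_I_high": 0.40, "wave_I_low": 0.10, "wave_V_high": 0.50}
ear_b = {"wave_I_high": 0.20, "wave_I_low": 0.08, "wave_V_high": 0.48}

for name, amps in (("ear A", ear_a), ("ear B", ear_b)):
    print(f"{name}: wave I high/low = {wave_i_level_ratio(amps):.2f}, "
          f"wave I/V = {wave_i_to_v_ratio(amps):.2f}")
```

In this invented example, ear B has a reduced wave I at the high level but a nearly unchanged wave V, so both ratios are lower than in ear A; this is the kind of pattern the ratio measures are intended to capture.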
