**Neural Mechanisms for Binocular Depth, Rivalry and Multistability**

Athena Buckthought and Janine D. Mendola

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/48446

## **1. Introduction**

82 Visual Cortex – Current Status and Perspectives

The purpose of this chapter is to present a review of recent functional neuroimaging (fMRI) studies of binocular vision, including binocular depth and rivalry, as well as a review of studies of perceptual multistability. As such, we will first emphasize the binocular aspects of binocular rivalry, while later emphasizing the rivalrous aspects. The interrelationship of binocular depth and rivalry, as well as multistability, will be described with reference to fMRI studies and single-unit recording studies in animals. These studies have provided provocative new evidence that the neural substrates for depth and rivalry, as well as other forms of multistability are remarkably similar. We will also describe our own research findings from two recent experiments, in which we performed (1) a direct comparison between binocular rivalry and depth, and (2) a direct comparison between binocular rivalry and monocular rivalry, a related form of bistability [1,2]. Our studies are unique in using both matched stimulation and comparable tasks, overcoming a limitation in the interpretation of many previous studies. As a result, these experiments are particularly relevant in delineating some of the global similarities and differences in the cortical networks activated in each of these different domains.

## **2. Binocular depth**

Binocular depth perception arises as a consequence of the slightly displaced points of view of the two eyes. The horizontal displacement of image features in the two eyes (i.e. binocular disparities) makes it possible to reconstruct the depth relationships in the visual world. Binocular matching of local features in the retinal images may be used to obtain estimates of the absolute disparity (and distance) of objects or surfaces, as well as the relative disparity (or relative distances) between different objects. An example of an image with binocular depth is shown in Figure 1a. If the left and right images are cross-fused, the image appears to be tilted with the top coming forward. This occurs because the visual system interprets the greater shift in matched features at the top of the image as a displacement in front of the fixation plane. In general, crossed disparities, in which image features in the right image are shifted to the right, and image features in the left image are shifted to the left, are interpreted as in front of the fixation plane. Uncrossed disparities (with shifts in the opposite directions) are interpreted as behind the fixation plane. Absolute disparities are the total shift in front of or behind the plane of fixation, while relative disparities can be computed as the difference in these absolute disparities for different objects. Thus the initial steps in recovering binocular depth relationships involve determining the horizontal shift in image features between the two eyes.

© 2012 Buckthought and Mendola, licensee InTech. This is an open access chapter distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
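The sign conventions and the relative-disparity computation described above can be sketched in a few lines of Python; the disparity values used here are hypothetical, chosen only to illustrate the bookkeeping:

```python
# Illustration of the disparity conventions described in the text.
# Sign convention (an assumption for this sketch): positive = crossed
# disparity, negative = uncrossed disparity, relative to fixation.

def depth_sign(absolute_disparity):
    """Interpret the sign of an absolute disparity relative to fixation."""
    if absolute_disparity > 0:
        return "in front of the fixation plane (crossed)"
    if absolute_disparity < 0:
        return "behind the fixation plane (uncrossed)"
    return "in the fixation plane"

def relative_disparity(disp_a, disp_b):
    """Relative disparity of object a with respect to object b is the
    difference of their absolute disparities."""
    return disp_a - disp_b

print(depth_sign(0.2))                          # crossed, so in front
print(depth_sign(-0.1))                         # uncrossed, so behind
print(round(relative_disparity(0.2, -0.1), 3))  # 0.3 (deg)
```

Binocular matching, i.e. finding corresponding features in the two retinal images, is what supplies the absolute disparities in the first place.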

**Figure 1.** (a) Binocular depth and (b) rivalry. (c) Plaids in which both depth and rivalry are perceived.

The most prominent cortical areas which have been activated by binocular depth in previous studies have been superimposed on the right hemisphere of the human brain in Figure 2a. The brain areas highlighted in the Figure are occipital visual areas (V1, V2, V3, V4, V3A, V7 and lateral occipital cortex, LO), superior parietal cortex (SP), inferior parietal cortex/intraparietal sulcus (IP), temporoparietal junction (TPJ), ventral temporal cortex (VT), middle frontal gyrus (MF), inferior frontal gyrus (IF), premotor cortex (PM), supplementary motor area (SMA), frontal eye fields (FEF) and insula/frontal operculum (FO). The numeric labels refer to the number of fMRI studies which reported prominent activation at each cortical site (colour coded on a scale ranging from red to blue, from highest to lowest). Table 1 lists the studies which were used to compile these numbers. Note that if it was not absolutely clear that a particular area was reported, this is indicated with an asterisk in Table 1. It should be noted that some studies did not perform a whole brain analysis, or may not have had an interest in reporting activation in certain areas, so this may bias the numbers that appear in Figure 2 and Table 1. Some studies limited their analysis to occipital sites of activation only. Nevertheless, it is clear that the neural mechanisms for binocular depth perception involve a number of processing levels from early visual areas in the occipital lobe to higher-level occipito-parietal and frontal areas [3]. An early study of depth restricted the analysis to visual cortical areas, and found that disparity selectivity was present in areas V1, V2, V3, V3A and MT+ [4]. However, later studies which performed analysis over a larger number of visual areas found that the activation levels are highest in dorsal occipito-parietal areas, such as V3A, V7, V4d-topo and caudal intraparietal sulcus relative to others [3,5-8]. Moreover, considering all these studies together, the most consistent sites of activation for depth across many previous fMRI studies were V3A, V7, V4d-topo, or other lateral occipital areas, such as MT+, lateral occipital complex, and kinetic occipital area [2,3,5,6,8,10,13], as well as intraparietal sulcus [2,3,6-12] and superior parietal lobe [2,6,7,8,10-12]. In addition, ventral temporal cortical areas, including the fusiform gyrus, have also been noted to be depth selective [2,16,17].

**Figure 2.** Brain areas highlighted in fMRI studies for (a) depth, (b) rivalry and (c) multistability.

As reviewed in [3], single unit recording studies in the macaque monkey have also reported that a large percentage of neurons are sensitive to binocular disparity, disparity-defined 3-D shape or 3-D surface orientation in many cortical areas (V1: 45%, V2: 65%, V3/V3A: 80%, VP: 52%, V4: 71%, MT: 95%, MST: 92%, caudal intraparietal sulcus: 78%, lateral intraparietal sulcus: 75%, TE: 78%, and frontal eye fields: 65%). Hence both fMRI and single unit recording studies indicate that binocular depth processing is not restricted to one cortical region but is processed in many areas, presumably subserving a wide range of different functions.

These studies indicate that a number of higher-level retinotopic visual areas are particularly prominent in depth processing, and yet the borders between these areas are sometimes difficult to distinguish across different studies. For example, V4d-topo has been defined as the human topographic homolog (topolog), an area situated (1) superior to V4v, (2) anterior to V3A, and (3) posterior to MT+, and should be distinguished from more ventral lateral occipital cortex [40]. V7 is an area adjacent and anterior to V3A that contains a hemifield map [40]. Hence V7 and V4d-topo both lie between V3A and area MT+, and the border between these areas is not always clear using conventional retinotopic mapping techniques, or retrospective review. Another region, referred to as the kinetic occipital area (KO), is particularly responsive to disparity edges, and appears to lie within V4d-topo [18]. Lateral occipital areas LO-1 and LO-2 also lie within KO [39].
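As a concrete illustration of how the numeric labels in Figure 2 were compiled, the counting step over Table 1 can be sketched as follows (only a small, illustrative subset of the table's depth column is reproduced here):

```python
# Tally the number of fMRI studies reporting depth-related activation per
# area (subset of Table 1; asterisked, uncertain entries would be excluded).
table1_depth = {
    "V3A": [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14],       # [2-12,14]
    "V7":  [2, 3, 5, 6, 7, 8, 9, 14],                      # [2,3,5-9,14]
    "MT+": [2, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15],  # [2,4-15]
}
study_counts = {area: len(refs) for area, refs in table1_depth.items()}
print(study_counts)  # {'V3A': 12, 'V7': 8, 'MT+': 13}
```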

| Brain area | Depth | Rivalry | Multistability |
|---|---|---|---|
| Ventrolateral prefrontal cortex | [2] | [1,2] | [1,30] |
| Middle frontal gyrus or dorsolateral prefrontal cortex | [2,10] | [1,2,19-22] | [1,19,31,32,34-36] |
| Inferior frontal gyrus | [2] | [2,19-22] | [1,19,33] |
| Inferior frontal junction | [2] | [1,2,19] | [1,19] |
| Frontal eye fields | [2] | [1,19-21] | [1,19,29,30] |
| Anterior cingulate | | [1,19,20] | [19,30-34] |
| Supplementary motor area | [2] | [1,2,20,22] | [1,33,35,36] |
| Primary motor cortex | | [20,22] | [30,34] |
| Premotor cortex | [2,9] | [1] | [1,35-37] |
| Insula/frontal operculum | [2] | [1,2,19-22] | [1,19,30,36] |
| Inferior parietal lobe (intraparietal sulcus) | [2,3,5-11] | [1,2,5,19-21,23] | [1,19,29,30,32-35,36\*,37-38] |
| Superior parietal lobe | [2,6-8,10-12] | [1,19-22] | [1,19,30,32,34-35,36\*,37-38] |
| Temporoparietal junction | [2] | [1,2,19-22] | [19,29,30,32,33,35] |
| Ventral temporal cortex (fusiform) | [2,16,17] | [2,20-25] | [1,29,32,36\*] |
| MT+ | [2,4-15] | [1,2,19,21] | [19,30,32,33,35\*,36\*,37,38] |
| Lateral occipital complex | [5,6,8,9\*,16,17] | [5,22,23] | [35\*] |
| V7 | [2,3,5-9,14] | [2,21\*,22\*,23] | [38] |
| Kinetic occipital area (KO) | [2,3,6,8,13,18] | [2] | [30] |
| V4 | [2,3,5-7,13-15,18] | [2,5,26,27] | [32,38] |
| V3A | [2-12,14] | [1,2,21\*,22\*,23,27,28] | [1,32,38] |
| V3 | [2-16] | [1,2,20,21,26-28] | [1,31,32,37] |
| V2 | [2-7,10-12,14,15] | [1,2,21,26-28] | [1,31-33,37] |
| V1 | [2-7,10,14,15] | [1,2,26-28] | [1,32,33] |

**Table 1.** fMRI studies which highlighted particular brain areas for depth, rivalry and multistability.

## **2.1. Dorsal and ventral processing streams**

The important depth processing areas can be subdivided into distinct dorsal and ventral processing streams, referred to as "what" and "where" pathways, related to the identification of objects, or actions relative to objects, respectively [7,10,14,23]. The dorsal stream is believed to project from the occipital visual areas to parietal areas, while the ventral stream projects from occipital to temporal areas [41]. More specifically, the ventral stream begins with V1, goes through visual area V2, then through visual area V4, and to the inferior/ventral temporal cortex, which includes lateral occipital cortex, fusiform gyrus and other ventral temporal areas, including the areas mentioned above for depth (Figure 2). The dorsal stream begins with V1, goes through area V2, then V3A, V6, V7 and MT+, and then to the posterior parietal cortex, including the parietal areas mentioned above (superior parietal lobe and intraparietal sulcus) for depth (Figure 2). The anatomical locations of some of these areas are also shown in Figure 3. The black contours superimposed on the top middle panel show locations for areas V1, V2, V3, V4d-topo, V3A, V7, lateral occipital cortex, and MT+ based upon retinotopic mapping and anatomical landmarks for one subject. (This Figure will also be discussed further below, in the discussion of rivalry and multistability).

The dissociation between dorsal and ventral areas has been most clearly delineated in single unit recording studies in the macaque [42]. In these studies, ventral areas have been found to have some of the properties necessary for object recognition, such as a detailed 3-D shape description of surface boundaries and surface content. In fact, specific responses are evoked only by binocular stimuli in which depth is perceived, but do not vary if depth is specified by different cues [42]. Conversely, dorsal areas (such as the intraparietal sulcus) have some of the properties necessary for making actions, such as selectivity for orientation in depth of surfaces and elongated objects. Moreover, their responses are invariant to changes in depth cues [42].

A dissociation between dorsal and ventral areas in human fMRI studies relates to greater selectivity for object recognition using shape defined by disparity in ventral areas [15,17,43-46]. One particular study took advantage of the fact that objects are more easily recognized if they lie in front of a background plane than if they lie behind a plane [17]. The stimuli consisted of stereo-defined line drawings of objects that either protruded in front of or behind a background plane. The activation in ventral and lateral occipital cortex, or lateral occipital complex (LOC), was greater for the objects which were located in front of the background plane, and the activity in these ventral stream areas was also strongly correlated with behavioral object recognition performance. Several other studies also found that activity in the lateral occipital complex could be related to the representation of shape from disparity by (1) making comparisons between object shapes with or without disparity [43], or (2) by comparisons between object shape conditions in which the 2-D monocular contour did not vary but the perceived 3-D shape differed [44]. The lateral occipital complex has also been found to be selective for convex and concave shapes defined by disparity, and is preferentially selective for convex shapes. This fits with behavioral measures, since the visual system shows greater sensitivity for the perception of convex shapes [47]. A final study found that the lateral occipital complex combines disparity with perspective information to represent perceived three-dimensional shape [15].


Several other studies have found dissociations between the properties of dorsal and ventral areas with regards to the representation of disparity magnitude. One study found that when disparity was parametrically varied, the BOLD signal increased with disparity only in dorsal areas of the occipito-parietal cortex (i.e. V2, V3, V3A, as well as inferior and superior parietal lobe) [10]. Another study used a comparison of activation for correlated versus anticorrelated random dot stereograms (only the former case supports a depth percept) in order to assess depth selectivity across a number of areas [8]. Disparity selectivity was found in dorsal (visual and parietal) areas, including V3A, V7, MT+, intraparietal sulcus and superior parietal lobe, as well as ventral area LO (ventral lateral occipital cortex), but not in early (V1, V2) or intermediate ventral (V3v, V4) visual cortical areas. Furthermore, only dorsal areas were found to encode metric disparity (disparity magnitude), whereas ventral area LO appeared to represent depth position in a categorical manner (i.e., disparity sign). The findings suggest that activity in both dorsal and ventral visual streams reflects binocular depth perception but the neural computations may differ [8]. Consistent with these results, a third study measured the responses across a number of occipital and parietal areas to different magnitudes of binocular disparity [7]. Across all areas, there was an increase in BOLD signal with increasing disparity. However, the greatest modulation of response was found in dorsal visual and parietal areas, including V3A, MT+, V7, intraparietal sulcus and superior parietal lobe. These differences contrast with the response to the zero disparity plane stimulus, which is greatest in the early visual areas, smaller in the ventral and dorsal visual areas, and absent in parietal areas. These results illustrate that the dorsal stream can reliably represent and discriminate a large range of disparities [7]. 
Moreover, these findings indicate distinct computations performed in (possibly) different cortical areas, including fusional matching, metric depth, and categorical depth.
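The correlated versus anticorrelated manipulation described above can be sketched as a simplified, hypothetical stimulus generator (for brevity, the strip of dots uncovered by the shift is left unchanged rather than refilled with fresh dots):

```python
import random

random.seed(0)

def random_dot_pair(size=16, patch=6, shift=1, correlated=True):
    """Build left/right random-dot images (0/1 lists of rows) in which a
    central patch is shifted horizontally in the right eye's image,
    producing a binocular disparity. With correlated=False the shifted
    dots are contrast-reversed (anticorrelated), which abolishes the
    depth percept even though a disparity signal is still present."""
    left = [[random.randint(0, 1) for _ in range(size)] for _ in range(size)]
    right = [row[:] for row in left]
    start = (size - patch) // 2
    for i in range(start, start + patch):
        for j in range(start, start + patch):
            dot = left[i][j]
            right[i][j + shift] = dot if correlated else 1 - dot
    return left, right

corr_left, corr_right = random_dot_pair(correlated=True)
anti_left, anti_right = random_dot_pair(correlated=False)
```

Comparing BOLD responses to the correlated and anticorrelated versions of such stimuli isolates activity tied to the depth percept rather than to the raw disparity signal.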


**Figure 3.** (a) Activation for monocular rivalry or (b) binocular rivalry above the blank baseline at three contrasts (9%, 18%, 36%).


#### **2.2. Posterior parietal cortex**


The human parietal cortex is believed to extract three-dimensional shape representations that can support the ability to manipulate objects both physically and mentally (as reviewed in [6]). Lesions to the posterior parietal lobe can cause profound deficits in spatial awareness, including neglect of the contralateral half of visual space, inability to draw simple three-dimensional objects such as a cube, and inability to estimate distance and size [48]. The superior and inferior parietal areas of activation identified in human fMRI studies of binocular depth include several intraparietal sulcus (IPS) regions involved in 3D shape perception from disparity: dorsal IPS anterior (DIPSA) and dorsal IPS medial (DIPSM), ventral IPS (VIPS)/V7, and parieto-occipital POIPS [6,9,42]. These parietal regions extract 3D shape representations that can support motor functions, such as grasping hand movements or saccadic eye movements toward objects [42]. Regions DIPSM and DIPSA have been found to be sensitive to depth structure (i.e., spatial variations in depth along surfaces arising from disparity), but not position in depth, while a more posterior region, the ventral IPS (VIPS), had a mixed sensitivity [6]. Regions DIPSM and DIPSA likely correspond to LIP and AIP in the monkey and process depth information necessary in order to make eye or hand movements, respectively [6,42]. These parietal areas (DIPSA and DIPSM) are also more strongly activated by curved surfaces than tilted surfaces. Hence these parietal areas (DIPSA, DIPSM and VIPS) show a full representation of a range of different 3D shapes from disparity, including frontoparallel, tilted and curved shapes [9]. Furthermore, these parietal areas appear to be involved in cue-invariant processing of 3D shape, including processing of monocular cues to depth (e.g., texture gradients, perspective, motion, shading) [6,42].

V7/VIPS is also an area sensitive to depth structure and depth position, as well as other cues which contribute to the representation of depth relationships, such as motion, 3D structure from motion and 2D shape [6,9]. In previous fMRI studies, this area has also been described as showing activation strongly correlated with the magnitude of depth defined by disparity and with the amount of depth perceived by subjects [9].


#### **2.3. Questions for future study**

The functional neuroimaging studies to date have broadly defined some of the functions of different areas in binocular vision, and delineated dorsal and ventral processing streams. There appears to be a progression in the dorsal pathway from more basic binocular processing in early visual areas towards the metrical encoding of binocular depth in parietal areas, presumably to support eye or hand movements toward objects. Likewise, the ventral pathway appears to involve a progressive refinement towards depth encoding that supports object recognition. In either case, however, the intermediate processing stages are not yet well understood. An important goal for future study will be to examine these stages in greater detail and to draw stronger inferences about the functions of the different areas. For example, a few studies have tried to compare the representation of relative and absolute disparity across a number of areas, with conflicting results, although there does appear to be a tendency for relative disparity to be encoded in ventral areas while both absolute and relative disparity are encoded in dorsal areas [3,10,14]. The encoding of relative disparity is likely to be very important in object recognition, while both absolute and relative disparity may turn out to be important in perception for action. Future studies could explain more clearly how these different cues may be used in different contexts and with different tasks. Also, relatively few studies have examined the role of stereoscopic cues in complex object recognition. The absence of studies in this area may be related to the belief held by many investigators that binocular disparity is not critical for the recognition of faces or other complex objects (for example, see [49,50]). However, ventral areas do show selectivity for binocular disparity, and hence it would be important to investigate further the role of these areas in binocular vision.

## **3. Binocular rivalry**

If the images in the two eyes are not the same or similar, but rather incompatibly different, another distinctive perceptual state results. In binocular rivalry, incompatible images, such as left and right oblique oriented gratings, are presented to the two eyes. Observers typically perceive only one image at a time, and perception alternates between the left and right image every few seconds. An example of binocular rivalry is shown in Figure 1b. If the left and right images are cross-fused, alternations may be clearly perceived between the left oblique and right oblique oriented images. Because the retinal image stays constant while the visual percept changes, this provides a popular method for studying conscious visual experience. Binocular rivalry has been modeled with interocular inhibition between monocular neurons representing the orthogonal left and right image components, together with neuronal adaptation [51-54]. A monocular neuron responds to a particular orientation (e.g. left oblique) and suppresses the responses of neurons tuned to the orthogonal (i.e. right oblique) orientation. The neuron continues to respond until adaptation, or fatigue, allows the other neurons to respond in turn to the opposite orientation. In the laboratory, particularly robust binocular rivalry is created by presenting various types of incompatible images to the two eyes, including simple gratings, contours, or more complex images such as a face and a house or other objects [51] (see also Figure 4(a-b) and 4(e-f)).
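The inhibition-plus-adaptation scheme just described can be sketched as a minimal two-population rate model in the spirit of [51-54]. The code below is an illustrative sketch rather than a model taken from the chapter; the function name, time constants and coupling strengths are arbitrary choices made for demonstration.

```python
import numpy as np

def simulate_rivalry(T=60.0, dt=0.001, beta=2.5, g=2.0,
                     tau=0.02, tau_a=1.0, I=1.2):
    """Two monocular populations with mutual inhibition (beta) and slow
    self-adaptation (g), integrated with the Euler method."""
    n = int(T / dt)
    r = np.zeros((n, 2))        # firing rates: left-eye and right-eye populations
    a = np.zeros(2)             # slow adaptation variables
    r[0] = [0.6, 0.4]           # slight initial asymmetry breaks the tie
    for t in range(1, n):
        # excitatory drive minus cross-inhibition minus adaptation, rectified
        drive = np.maximum(I - beta * r[t - 1, ::-1] - g * a, 0.0)
        r[t] = r[t - 1] + dt / tau * (-r[t - 1] + drive)
        a += dt / tau_a * (-a + r[t])
    return r

r = simulate_rivalry()
dominant = (r[:, 0] > r[:, 1]).astype(int)    # 1 while the left-eye population "wins"
switches = int(np.count_nonzero(np.diff(dominant)))
```

With these (arbitrary) parameters, one population suppresses the other until its own adaptation builds up, at which point dominance switches, so the two populations alternate every second or so; weakening the adaptation gain `g` tends to lengthen dominance durations, mimicking the role of neuronal fatigue described above.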


The interrelationship between binocular depth and rivalry has been a subject of longstanding debate and interest [51,55,56]. Generally, binocular rivalry ensues when the image features in the two eyes are too dissimilar to be reconciled, and depth cannot be recovered because the binocular disparities are too great. In our normal visual experience there may be dissimilar features in the two eyes, but a strong sensation of binocular rivalry occurs only rarely. When unmatched rivalrous components are present in an image, they interfere with normal binocular depth perception [55], presumably because suppression from the unmatched components prevents the binocular matching necessary for depth. Consistent with this, it has been proposed that binocular rivalry is the default outcome that arises when binocular fusion and depth fail [51,55]. However, recent studies have provided evidence that binocular rivalry and depth can be observed simultaneously at the same spatial location, which calls into question previous models and interpretations [2,57,58]. There are a number of possible new interpretations that could reconcile these results. In particular, both binocular rivalry and depth involve mechanisms of binocular matching: finding correspondences between image features in the case of depth, or detecting larger, irreconcilable differences in the case of binocular rivalry. The mechanisms for binocular rivalry may inhibit false matches at different orientations, effectively suppressing noise in neural responses and sharpening the tuning of orientation-selective mechanisms, which can also be related to the phenomenon of dichoptic masking [59,60]. Hence the strong inhibitory interactions familiar from binocular rivalry may be fundamental to the resolution of ambiguity in binocular vision. Consequently, fMRI studies have found that the processing mechanisms for depth and rivalry are closely related in many brain areas, and operate in parallel throughout the visual system [2,3,5,20,61].
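The notion of binocular matching and false matches can be made concrete with a toy, window-based correspondence search. This is a standard normalized cross-correlation sketch, not an algorithm from the chapter; the window size, disparity range, and 1-D signals are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def best_match(left, right, x, win=7, max_d=10):
    """Return the horizontal offset (disparity) that best aligns a window of
    the left image with the right image, by normalized cross-correlation."""
    patch = left[x:x + win]
    scores = []
    for d in range(-max_d, max_d + 1):
        cand = right[x + d:x + d + win]
        num = np.dot(patch - patch.mean(), cand - cand.mean())
        den = win * patch.std() * cand.std()
        scores.append(num / den if den > 0 else 0.0)
    scores = np.array(scores)
    return scores.argmax() - max_d, scores.max()

# Fusible case: the right-eye signal is the left-eye signal shifted by a
# true disparity of 3 samples, so the correct match stands out clearly.
left = rng.standard_normal(200)
right = np.roll(left, 3)
d, score = best_match(left, right, 50)        # recovers d == 3, score near 1

# Rivalrous case: the right eye carries unrelated structure, so every
# candidate offset is a "false match" with a lower score, and no disparity
# can be reliably recovered.
d2, score2 = best_match(left, rng.standard_normal(200), 50)
```

On this reading, rivalry-like inhibition corresponds to rejecting locations where no candidate offset exceeds a matching criterion, which is one way to suppress false matches and leave only reliable correspondences for depth.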

Models of binocular rivalry presume that rivalry arises from interocular competition between monocular neurons, which would be expected to occur at early levels of the visual system [51-54]. This has been verified in a number of functional neuroimaging studies of binocular rivalry, which have reported eye-specific dominance and suppression in early visual areas of the occipital cortex (V1, V2, V3) [26,27,62], or in the lateral geniculate nucleus (LGN) [63,64]. As expected from the theoretical models, a number of fMRI studies have also documented alternating response suppression, as well as neural activation related to attentional monitoring and selection at later stages of the visual pathway. A number of occipito-parietal areas (e.g. V3A, V7 and the intraparietal sulcus) are again represented, as was the case for depth. The most important visual areas reported for rivalry include V1, V2, V3, V3A, V4d-topo, V7 [5], lateral occipital areas (MT+/lateral occipital complex) [1,2,5,19,21-23], and ventral temporal areas [2,20-25]. In addition to these visual cortical areas, a number of frontal and parietal sites of activation were reported, which have been associated with top-down control of attention or stimulus-driven shifts of spatial attention [65-68]. The parietal areas include the superior parietal lobe [1,19-22], intraparietal sulcus [1,2,5,19-21,23,61], and temporoparietal junction [1,2,19-22]. The frontal areas associated with attentional control or shifts of attention include the middle frontal gyrus (or dorsolateral prefrontal cortex) [1,2,19-22], ventrolateral prefrontal cortex and inferior frontal gyrus [1,2,19-22], as well as the insula/frontal operculum [1,2,19-22]. Additional areas were reported which could also be associated with attentional shifts, or with the preparation and execution of motor reports, such as the supplementary motor area [1,2,20,22], frontal eye fields (FEF) and anterior cingulate [1,2,19-21]. The activation of some areas, such as FEF, could possibly be related to eye movements, but some studies used controls to verify that the activation is not related to eye movements and is more likely related to covert shifts of attention [20,29].

**Figure 4.** Examples of multistability. (a-b) Binocular rivalry (gratings), (c) monocular rivalry (gratings), (d) Necker cube, (e-f) binocular rivalry (face/house), and (g) monocular rivalry (face/house).

Functional neuroimaging studies of early visual areas (V1, V2 or V3) have found that fMRI signal fluctuations during the perception of rivalry are generally lower than the signal fluctuations evoked by actual stimulus changes, in which the stimulus is physically replaced by the alternative [26-28]. However, much stronger correlations occur between subjective perception in binocular rivalry and activity in higher-level visual areas, such as functionally specialized extrastriate cortex [24]. For example, in binocular rivalry in which alternations are perceived between a face and house, signal fluctuations can be discerned in ventral temporal areas selective for faces and houses (i.e. fusiform face area and parahippocampal place area, respectively). The amplitudes of percept-related fMRI signal fluctuations during binocular rivalry in these visual areas are similar to those during actual stimulus alternations, suggesting that the conflict has been resolved at this stage, with no representation of the suppressed stimulus. Hence it appears that there is a progression from early visual areas towards higher-level areas in the magnitude of suppression, with the latter a closer match to the perceptual experience during binocular rivalry [53,69].
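The logic behind such percept-related fMRI analyses can be sketched as follows: the subject's reported percept time course is convolved with a canonical haemodynamic response and compared against a voxel time series. Everything below is a simulated illustration; the TR, dominance durations, noise level, and the double-gamma HRF parameters are assumptions for demonstration, not values from the studies cited.

```python
import numpy as np
from math import gamma

TR = 1.0                              # assumed repetition time (s)
t = np.arange(0, 30, TR)

def hrf(t, a1=6.0, a2=16.0, ratio=1.0 / 6.0):
    """Double-gamma haemodynamic response function (SPM-like shape)."""
    g = lambda x, a: x ** (a - 1) * np.exp(-x) / gamma(a)
    h = g(t, a1) - ratio * g(t, a2)
    return h / h.sum()

# Simulated percept report: 1 while one image (say, the face) is dominant.
rng = np.random.default_rng(1)
n = 300
percept = np.zeros(n)
state, i = 1, 0
while i < n:                          # alternating dominance, mean ~8 TRs
    dur = max(2, int(rng.exponential(8)))
    percept[i:i + dur] = state
    state, i = 1 - state, i + dur

# Predicted BOLD fluctuation = percept time course convolved with the HRF.
predicted = np.convolve(percept - percept.mean(), hrf(t))[:n]
# Simulated "face-selective" voxel: percept-locked signal plus noise.
bold = predicted + 0.3 * rng.standard_normal(n)
r = np.corrcoef(predicted, bold)[0, 1]   # percept-to-BOLD correlation
```

In real analyses the same regressor is fit voxel-wise in a general linear model, and the comparison between rivalry and physical stimulus alternation amounts to comparing the amplitudes of the fitted percept-locked responses in each area.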

#### **3.1. Functional interactions between cortical areas in rivalry**

One fMRI study of binocular rivalry used analysis methods to detect whether temporal correlations were present in the activity of areas V2/V3 and other cortical areas during the perception of rivalry with no task [21]. Indeed, the results confirmed that many of the areas listed above for rivalry were related through a covariation of activity, indicating that these widespread extrastriate ventral, superior and inferior parietal, and prefrontal cortical areas comprise a network reflecting the changes in perception during rivalry. There was significant temporal modulation of activity in these areas that closely followed the response patterns of human subjects indicating when perceptual alternations occurred. The results indicated that cooperative interactions between extrastriate visual and non-visual areas are important for conscious visual awareness, and that the prefrontal cortex may contribute to conscious vision. Another study of binocular rivalry inferred, using an event-related design, that activity in the intraparietal sulcus preceded the onset of rivalrous alternations, providing evidence for a possible causal role for this area in initiating rivalry [61]. Intriguingly, the intraparietal sulcus was the only area identified in the event-related analysis in this study, and no frontal areas were implicated as playing a causal role in perceptual alternations.

#### **3.2. Comparison of binocular depth and rivalry**

The cortical areas activated by rivalry in previous fMRI studies are shown in Figure 2b and Table 1, for comparison with depth. Some of the differences between rivalry and depth can simply be accounted for by noting that there were more studies of depth than of rivalry, particularly studies interested in reporting activation levels only in occipital areas. Nevertheless, the figure makes it clear that parietal activation (i.e. intraparietal sulcus and superior parietal lobe) was prominently reported in both depth and rivalry studies. One exception is the temporoparietal junction, which was often reported as an activation site for rivalry but was reported for depth in only one study, which actually employed a depth task [2]. This is consistent with our view that this area is usually active with stimulus-driven shifts of attention [65,67,70]. Overall, frontal activation was more prominent in rivalry than in depth studies (e.g. dorsolateral and ventrolateral prefrontal cortex, middle frontal gyrus, inferior frontal gyrus, supplementary motor area, insula/frontal operculum), while occipital activation was relatively more prominent in depth studies (e.g. V3, V3A, V4d-topo, V7, lateral occipital complex and MT+). Some of these differences could be attributed to the more dynamic aspect of rivalry compared with depth, and to the typical performance of a rivalry task, since these frontal and parietal areas have been associated with attention and working memory, as well as with the performance of motor reports [65-68,70]. In particular, the anterior cingulate was reported only for rivalry and not for depth.

Neural Mechanisms for Binocular Depth, Rivalry and Multistability 95

temporal and frontal areas highlighted in Figure 2. Nevertheless, regions of superior and inferior parietal cortices (including temporoparietal junction and intraparietal sulcus) were activated more for the depth than the rivalry task, whereas a bias towards rivalry was seen in a lateral occipital region, calcarine, retrosplenial and ventral temporal areas. Thus, these results are important in showing that while parietal areas were clearly strongly activated by either depth or rivalry, consistent with previous studies (as discussed with reference to Figure 2 above), the activation levels were actually higher for depth when the two stimulus conditions had been equalized. This fits with an important role of these parietal areas in depth encoding in order to make hand or eye movements, which has been documented extensively (e.g. [6,9,42]). Conversely, lateral occipital area and ventral temporal areas were more specific for rivalry, consistent with a relatively greater number of studies which showed that these areas may be particularly relevant for the perception of rivalry [5,20-25].

Finally, in another manipulation, we included as a control, an orientation change task, which had similar stimulus features to the depth and rivalry tasks. In this case, the subject had to indicate with a key press which way the image was rotated. The orientation change condition required binocular fusion of matched features but evoked neither depth nor rivalry, serving to isolate those stages of binocular combination. This task was also matched to the depth and rivalry tasks in terms of the number of stimulus changes (which occurred every 3 s) and key presses. When the orientation change task was subtracted from either the depth or rivalry task, a lateral occipital area was highlighted, as well as V3A, V7, or ventral intraparietal sulcus (VIPS), and the kinetic occipital area (KO), including LO-1 and LO-2 [18]. This result indicated that these are areas active for either depth or rivalry, and may subserve a representation at the surface-level that would facilitate the grouping of features, and allow for more than one feature (i.e. depth or rivalry) to be coded at a spatial location

In conclusion, the combined results of fMRI and psychophysical studies indicate that depth and rivalry are processed in a similar network of cortical areas and are perceived simultaneously by coexisting in different spatial frequency or orientation channels (see [2,58] for further discussion of the latter point). An important aspect of the results reviewed was that the same frontal and parietal areas were prominently activated for both depth and rivalry. So by matching depth and rivalry for stimulus characteristics and task we found that globally similar sites would be activated, even though depth does not involve overt, endogeneous competition between alternate percepts. We confirmed that all of the prominent sites of activation for rivalry were also present for depth, including frontal (FEF, PM, SMA, MF, IF, IFJ, DLPF, VLPF, FO) and parietal areas (SP, IP and TPJ). These frontoparietal areas have traditionally been implicated in visual tasks requiring spatial shifts of attention and working memory [70]. Moreover, functional imaging experiments have shown that the superior parietal cortex is also engaged by successive shifts of spatial

**3.4. Conclusions: Comparison of depth and rivalry** 

[2].

attention [71].

If we step back and evaluate these patterns, then some caveats emerge. There appear to be some differences in the relative balance of frontal and occipital activation, comparing the results across depth and rivalry studies. However, it is challenging to compare the results across these studies because of substantial stimulus and task differences. Most studies of depth used (1) dynamic random dot stereograms (RDS) [4,8,10,11,13,14,17], (2) random dot dynamic checkerboards [3,10], (3) random dot sinusoidal corrugations at pedestal disparities [7], (4) random dot textured surfaces or shapes [9,16], (5) random line stimuli showing 3D depth structures [6], and (6) gratings or line drawings [17]. Rivalry studies have also used a variety of different stimuli, for example, (1) gratings [1,2,19,22,26-28], (2) faces/houses (1,24,26], (3) faces with differing emotional expressions [25], (4) tools, faces and textures [23], (5) slant rivalry [5,61], and (6) gratings/faces [20, 21]. The tasks used in depth and rivalry studies were also not comparable in terms of either the attentional demands or frequency of motor responses.

In order to address this issue, we performed a study designed explicitly to perform a direct whole-brain comparison of depth and rivalry with fMRI, with comparable stimulus patterns and tasks [2]. We used binocular plaid patterns in which depth is perceived from the nearvertical components and rivalry from the oblique components (Figure 1c). In Figure 1, the depth and rivalry components are added together to produce the plaids in which both depth and rivalry may be perceived. Subjects report that the percept of a rivalrous pattern is spatially superimposed on the tilted surface. The depth in the plaid stimulus changed every 3 s, between two possible percepts (top or bottom tilted forward). This was done in order to make a dynamic depth change that subjects could report, just as they reported dynamic changes in the rivalry task. For the depth task, subjects reported whether the top or bottom of the plaid stimulus pattern appeared to be tilted forward. The time interval of 3 s was chosen to match the mean time period between alternations for rivalry for the group of subjects. The depth change did not interfere with the rivalry percept, and subjects were able to perform a rivalry task with the identical plaid stimulus. This made it possible to compare conditions in which subjects perform either a depth or rivalry report task, while viewing identical plaid patterns, precisely matched for retinal stimulation. A comparison for the depth and rivalry task conditions would reveal the neural substrates for depth and/or rivalry.

#### **3.3. Depth and rivalry task comparison**

The most important comparison was that between the depth and rivalry tasks for identical plaid patterns. Our results showed that the whole-brain network of activated cortical areas was remarkably similar for the rivalry task compared to the depth task when subjects viewed identical plaid patterns. These areas included the occipital, parietal, ventral temporal and frontal areas highlighted in Figure 2. Nevertheless, regions of superior and inferior parietal cortices (including temporoparietal junction and intraparietal sulcus) were activated more for the depth than the rivalry task, whereas a bias towards rivalry was seen in a lateral occipital region, calcarine, retrosplenial and ventral temporal areas. Thus, these results are important in showing that while parietal areas were clearly and strongly activated by either depth or rivalry, consistent with previous studies (as discussed with reference to Figure 2 above), the activation levels were actually higher for depth when the two stimulus conditions had been equalized. This fits with an important role for these parietal areas in encoding depth in order to make hand or eye movements, which has been documented extensively (e.g. [6,9,42]). Conversely, the lateral occipital area and ventral temporal areas were more specific for rivalry, consistent with the relatively greater number of studies which have shown that these areas may be particularly relevant for the perception of rivalry [5,20-25].

Finally, in another manipulation, we included as a control an orientation change task, which had similar stimulus features to the depth and rivalry tasks. In this case, the subject had to indicate with a key press which way the image was rotated. The orientation change condition required binocular fusion of matched features but evoked neither depth nor rivalry, serving to isolate the stage of binocular combination itself. This task was also matched to the depth and rivalry tasks in terms of the number of stimulus changes (which occurred every 3 s) and key presses. When the orientation change task was subtracted from either the depth or rivalry task, a lateral occipital area was highlighted, as well as V3A, V7 (ventral intraparietal sulcus, VIPS), and the kinetic occipital area (KO), including LO-1 and LO-2 [18]. This result indicated that these areas are active for either depth or rivalry, and may subserve a surface-level representation that would facilitate the grouping of features and allow more than one feature (i.e. depth or rivalry) to be coded at a spatial location [2].
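The subtraction logic above can be sketched in a few lines: removing the control condition's response discards activity tied to fusion, stimulus changes and key presses, leaving voxels specific to depth or rivalry. The per-voxel response values below are invented for demonstration; they are not measurements from the study.

```python
# Illustrative per-voxel responses (percent signal change) for five voxels;
# the numbers are made up for demonstration, not taken from the study.
orientation = [0.50, 0.52, 0.48, 0.51, 0.49]   # control: fusion, stimulus changes, key presses
depth       = [0.80, 0.77, 0.48, 0.53, 0.77]
rivalry     = [0.78, 0.82, 0.49, 0.51, 0.74]

# Subtracting the control removes activity shared with fusion and motor
# demands, isolating responses specific to depth or rivalry processing.
depth_vs_control   = [d - o for d, o in zip(depth, orientation)]
rivalry_vs_control = [r - o for r, o in zip(rivalry, orientation)]

# Voxels surviving both subtractions are candidates for a shared
# surface-level representation (e.g. LO, V3A/V7, KO in the text).
shared = [dc > 0.1 and rc > 0.1
          for dc, rc in zip(depth_vs_control, rivalry_vs_control)]
print(shared)  # → [True, True, False, False, True]
```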

#### **3.4. Conclusions: Comparison of depth and rivalry**

In conclusion, the combined results of fMRI and psychophysical studies indicate that depth and rivalry are processed in a similar network of cortical areas and are perceived simultaneously by coexisting in different spatial frequency or orientation channels (see [2,58] for further discussion of the latter point). An important aspect of the results reviewed was that the same frontal and parietal areas were prominently activated for both depth and rivalry. Thus, by matching depth and rivalry for stimulus characteristics and task, we found that globally similar sites were activated, even though depth does not involve overt, endogenous competition between alternate percepts. We confirmed that all of the prominent sites of activation for rivalry were also present for depth, including frontal (FEF, PM, SMA, MF, IF, IFJ, DLPF, VLPF, FO) and parietal areas (SP, IP and TPJ). These frontoparietal areas have traditionally been implicated in visual tasks requiring spatial shifts of attention and working memory [70]. Moreover, functional imaging experiments have shown that the superior parietal cortex is also engaged by successive shifts of spatial attention [71].

## **4. Multistability**

Binocular rivalry is a specific example of a more general perceptual experience, multistability. Multistable images comprise important examples of conscious visual perceptual changes without any change in the stimulus being viewed. Multistability can be induced by using an ambiguous figure with more than one perceptual interpretation such as the Necker cube [72] or Rubin's vase/face [73] (Figure 4). For example, the image in Rubin's vase/face can be interpreted as either a vase or face, and formal observation shows that the perceptual organization changes between the face and vase over time. In a similar way, the Necker cube can be perceived with one face coming forward, or the other face forward and the percept fluctuates over time between these two possible organizations. As in the case of binocular rivalry, the retinal image stays constant, while the conscious percept changes. This lends itself to an investigation of visual conscious perception without a confounding stimulus change, as we have already seen for binocular rivalry. However, in comparison with binocular rivalry, observers do have somewhat greater voluntary control over their perception in these examples of multistability, and are better able to bias their interpretation towards one percept or the other [74]. Other examples of multistability include the rotating structure-from-motion sphere, which can be perceived to rotate in two different directions [36,38], and the apparent motion quartet, in which the perceived motion alternates between two different directions [19,33,75]. Another example of apparent motion is the spinning wheel, in which the perceived direction of rotation alternates between two directions [30]. Monocular (pattern) rivalry is yet another example of multistability in which a composite image is shown to both eyes, such as the sum of orthogonal gratings (Figure 4c) or a face/house composite (Figure 4g) [76]. 
These examples of monocular rivalry can be compared with examples of binocular rivalry in which either gratings or face/house pairs are shown to the left and right eyes (Figure 4a-b and e-f). Binocular rivalry can be perceived if (a-b) or (e-f) are cross-fused. Binocular rivalry can also be perceived for (c) and (g) if these are viewed using red-green stereoglasses. In monocular rivalry, the observer experiences perceptual alternations in which the two stimulus components (e.g. left and right oriented gratings) alternate in clarity or salience. The experience is similar to perceptual alternations in binocular rivalry, although the alternations are more difficult to perceive, because neither component is completely suppressed [69,77]. Thus in all these examples of multistability, the alternations between the different possible percepts are more subtle, compared with the near total suppression of one eye's image which occurs with binocular rivalry.


Recent functional neuroimaging studies of multistability have used a range of different image types, such as the (1) Necker cube [31,32,34,35,37], (2) Mach Pyramid, 3-D Triangle, Card, and Wave [31], (3) Rubin's face/vase [29], (4) monocular rivalry with gratings [1], (5) rotating structure-from-motion sphere [36,38], and (6) apparent motion, including the spinning wheel [19,30] and the motion quartet [19,33]. There has been a remarkable congruence of findings across functional neuroimaging studies of multistability, despite the large variability in the images used to evoke changes in perceptual organization. For all image types, a distributed network of cortical areas is activated during the perception of multistability, which highlights occipito-parietal areas, as well as many interrelated areas of the occipital, parietal and frontal cortex, as was the case for depth and rivalry. The sites of activation which were most frequently reported included a number of frontal and parietal areas which have been associated with top-down control of attention, or stimulus-driven shifts of spatial attention, and working memory [65-68,70]. These parietal areas included superior parietal lobe [1,19,30,32,34-38], intraparietal sulcus [1,19,29,30,32-38] and temporoparietal junction [19,29,30,32,33,35-37]. The frontal sites of activation which could also be related to attention included dorsolateral prefrontal cortex or middle frontal gyrus [1,19,31,32,34-36], and ventrolateral prefrontal cortex or inferior frontal gyrus [1,19,30,33]. Again, as was seen earlier for rivalry, a number of frontal areas were present which could be associated with attention or the preparation and execution of motor reports, such as frontal eye fields, anterior cingulate [1,19,29,30-34] and supplementary motor area [1,33,35,36]. These frontal and parietal areas were the most frequently reported sites of activation in these studies. 
However, several studies confirmed that activation also occurs in occipital areas, including ventral occipital (fusiform gyrus) [1,29,32,36], medial temporal areas (hMT+) [19,30,32,33,35-38] and areas V1, V2, V3, V3A or V4-d topo [1,5,31,32,37,38].


The overall global pattern of activation sites for depth, rivalry and multistability can be compared in Figure 2 and Table 1. Parietal activation (i.e. superior parietal lobe and intraparietal sulcus) was prominently and equally reported in all three cases. However, an exception to this was the temporoparietal junction, which was reported for rivalry and multistability, but not for depth (with the exception of [2]), consistent with a role for this area in stimulus-driven shifts of attention [65,67,70]. Furthermore, there was overall more frontal activation (e.g. dorsolateral and ventrolateral prefrontal cortex, supplementary motor area, and insula/frontal operculum) for either rivalry or multistability, compared with depth. These frontal areas could be associated with top-down control of attention or stimulus-driven shifts of attention, as well as the planning and execution of motor responses. Conversely, there was a greater emphasis on occipital areas (e.g. V7, V4d-topo, V3, V3A, lateral occipital complex, MT+) for depth compared with rivalry or multistability. In other words, the balance between frontal and occipital activation was in favour of frontal areas for multistability and in favour of occipital areas for depth, with rivalry falling in between. Again, part of these differences can be attributed to the fact that more depth studies had an interest in reporting activation in occipital areas, but even taking this into consideration, the overall pattern shows that occipital activation was relatively more prominent in depth studies. In general, few previous depth studies performed a whole-brain analysis [2,3,5,7-12,16], and of these few, only three reported frontal activation [2,9,10]. It was not clear whether this was simply an absence of reporting, or whether there was genuinely no frontal activation because no task was being performed in most of these studies. 
The areas reported in these few whole-brain studies were the usual set of prominent occipito-parietal areas we might expect, such as V2, V3, V3A, V4d-topo, V7, intraparietal sulcus and parietal lobe [2,3,5,7-12,16]. A useful direction for future study would be further matched comparisons between depth and rivalry, in which dynamic changes in depth (and a task requiring report of depth percepts) could be used to make a direct comparison to rivalry studies.

#### **4.1. Comparison of monocular and binocular rivalry**

As we have encountered before, there appears to be a global trend towards a slightly different distribution of frontal, parietal and occipital activation across binocular rivalry and other multistability studies. Yet, one of the difficulties in comparing results across rivalry and multistability studies is that the studies were not carried out with equivalent stimulus conditions, tasks, methodology, or functional imaging analyses. To address this, we carried out an fMRI study explicitly designed to perform a direct comparison between binocular rivalry and an example of multistability (monocular rivalry), using matched retinal stimulation and comparable tasks [2]. We used orthogonal gratings for binocular rivalry (left or right oblique grating in each eye) or monocular rivalry (sum of orthogonal gratings in each eye), as shown in Figure 4. Coloured stimuli were used in order to enhance the percept of monocular rivalry. As described earlier, the perceptual alternations in monocular rivalry are more subtle than those in binocular rivalry, reflecting less perceptual suppression [69,77].

Neural Mechanisms for Binocular Depth, Rivalry and Multistability 99

rivalry, and also highlighted these same regions of activation. The more widespread activation pattern for binocular than monocular rivalry may be consistent with the presence of neural competition at higher-level areas, as well as greater effects of attention. As anticipated, when binocular and monocular rivalry were directly compared, an interaction with stimulus contrast was found in early visual areas V1, V2, and V3. Binocular rivalry evoked greater activation than monocular rivalry for the low contrast images. However, at higher stimulus contrasts, where perceptual suppression was more complete, the response

One of the important results of the study was that both binocular and monocular rivalry showed a U-shaped function of activation as a function of contrast. Current models and concepts regarding binocular rivalry can explain this pattern (e.g. [51-54,56,69]). Rivalry models include inhibitory neurons in addition to excitatory neurons to account for interocular inhibition and suppression. In addition, the contribution of inhibition and suppression would generally be expected to lower the BOLD signal. At high contrasts, we expect the activation to increase due to an increasing neuronal response gain, which also leads to faster alternation rates, explaining the increase from 18% to 36% contrast. The increase in activation at the lowest contrast can possibly be explained as reflecting disinhibition, assuming that the excitatory and inhibitory neurons have different thresholds. At low contrasts, inhibitory neurons would not be strongly activated, resulting in slower alternation rates. Thus the higher BOLD signal at 9% contrast might be due to a release from

An important result of the study was to show that in addition to the activation of visual areas presumed to be involved directly in competition between neural representations, there was also activity for either binocular or monocular rivalry in frontoparietal areas that are often implicated in attention, and previously identified for binocular rivalry [65-67,70]. The previous literature seems to indicate that the balance of frontal activation may have been slightly higher for multistability than rivalry (as shown in Figure 2). But in our study in which we matched binocular and monocular rivalry for stimulus features and used comparable tasks, the frontal and parietal activation was actually somewhat higher for binocular rivalry, and included temporoparietal junction (TPJ), which was not an area significantly activated for monocular rivalry. The TPJ is modulated by stimulus-driven attentional shifts to unexpected objects or events [65,67,70]. It is possible that the TPJ was less active for monocular rivalry since the perceptual changes did not signal a change in object identity, as in binocular rivalry. All the other forms of multistability studied in fMRI paradigms produced some TPJ activation, including ambiguous figures [29,32,35,37], apparent motion [19,30,33] or structure from motion [36]. Hence a change in object identity and stimulus-driven shifts to unexpected events may be very relevant to the perceptual

to binocular rivalry fell below that to monocular rivalry.

inhibition that accompanies slow alternation rates.

experience of binocular rivalry and other forms of multistability.

**4.3. Role of parietal areas** 

**4.2. U-shaped function of activation** 

A direct comparison of monocular and binocular rivalry using gratings is attractive as the same images with matched retinal stimulation can be used for both forms of bistability in order to isolate the effect of suppression, and to determine if they share common neural mechanisms. We anticipated that the effects of perceptual suppression would be evident in a lower BOLD signal for binocular compared with monocular rivalry in early visual areas, such as V1, V2 or V3. We also used so-called 'rivalry replay' conditions, in which the entire stimulus was physically changed between the two possible percepts, using the identical temporal sequences reported earlier during rivalry with button presses. This is intended to mimic rivalry in terms of stimulus changes and motor demands, and allows subtractions to be made between rivalry and replay in order to isolate the neural substrates which may be more directly related to the perception of rivalrous alternations.

Some results are shown in the form of brain activation maps, averaged across six subjects (Figure 3). The activation for monocular rivalry or binocular rivalry with grating stimuli above the baseline condition is shown at three contrasts (9%, 18%, 36%). A view from the back of the human brain is shown (right hemisphere only). The colour scale indicates statistically significant results ranging from t=2.35 to 8.00 (orange-yellow) (FDR, p<0.05). Compared to a blank screen, both binocular and monocular rivalry show a U-shaped function of activation as a function of stimulus contrast, i.e. higher activity for most areas at 9% and 36%. The sites of cortical activation for monocular rivalry included occipital pole (V1, V2, V3), ventral temporal cortex (including fusiform gyrus), superior parietal cortex, ventrolateral prefrontal cortex, dorsolateral prefrontal cortex, supplementary motor area, frontal eye fields, and insula/frontal operculum. Interestingly, the areas for binocular rivalry were more widespread, and also included lateral occipital regions, as well as inferior parietal cortex, including intraparietal sulcus and temporoparietal junction (TPJ). In particular, MT+, lateral occipital complex and V3A were more active for binocular than monocular rivalry for all contrasts. The comparison of binocular rivalry with the replay condition was particularly important in isolating the neural substrates for the perception of rivalry, and also highlighted these same regions of activation. The more widespread activation pattern for binocular than monocular rivalry may be consistent with the presence of neural competition at higher-level areas, as well as greater effects of attention. As anticipated, when binocular and monocular rivalry were directly compared, an interaction with stimulus contrast was found in early visual areas V1, V2, and V3. Binocular rivalry evoked greater activation than monocular rivalry for the low contrast images. 
However, at higher stimulus contrasts, where perceptual suppression was more complete, the response to binocular rivalry fell below that to monocular rivalry.

#### **4.2. U-shaped function of activation**

98 Visual Cortex – Current Status and Perspectives

suppression [69,77].

**4.1. Comparison of monocular and binocular rivalry** 

more directly related to the perception of rivalrous alternations.

As we have encountered before, there appears to be a global trend towards a slightly different distribution of frontal, parietal and occipital activation across binocular rivalry and other multistability studies. Yet, one of the difficulties in comparing results across rivalry and multistability studies is that the studies were not carried out with equivalent stimulus conditions, tasks or methodology, and functional imaging analyses. To address this, we carried out an fMRI study explicitly designed to perform a direct comparison between binocular rivalry and an example of multistability (monocular rivalry), using matched retinal stimulation and comparable tasks [2]. We used orthogonal gratings for binocular rivalry (left or right oblique grating in each eye) or monocular rivalry (sum of orthogonal gratings in each eye), as shown in Figure 4. Coloured stimuli were used in order to enhance the percept of monocular rivalry. As described earlier, the perceptual alternations in monocular rivalry are more subtle than those in binocular rivalry, reflecting less perceptual

A direct comparison of monocular and binocular rivalry using gratings is attractive as the same images with matched retinal stimulation can be used for both forms of bistability in order to isolate the effect of suppression, and to determine if they share common neural mechanisms. We anticipated that the effects of perceptual suppression would be evident in a lower BOLD signal for binocular compared with monocular rivalry in early visual areas, such as V1, V2 or V3. We also used so-called 'rivalry replay' conditions, in which the entire stimulus was physically changed between the two possible percepts, using the identical temporal sequences reported earlier during rivalry with button presses. This is intended to mimic rivalry in terms of stimulus changes and motor demands, and allows subtractions to be made between rivalry and replay in order to isolate the neural substrates which may be

Some results are shown in the form of brain activation maps, averaged across six subjects (Figure 3). The activation for monocular or binocular rivalry with grating stimuli above the baseline condition is shown at three contrasts (9%, 18%, 36%). A view from the back of the human brain is shown (right hemisphere only). The colour scale indicates statistically significant results ranging from t=2.35 to 8.00 (orange-yellow) (FDR, p<0.05). Compared to a blank screen, both binocular and monocular rivalry show a U-shaped function of activation as a function of stimulus contrast, i.e. higher activity in most areas at 9% and 36% than at 18%. The sites of cortical activation for monocular rivalry included the occipital pole (V1, V2, V3), ventral temporal cortex (including fusiform gyrus), superior parietal cortex, ventrolateral prefrontal cortex, dorsolateral prefrontal cortex, supplementary motor area, frontal eye fields, and insula/frontal operculum. Interestingly, the areas for binocular rivalry were more widespread, and also included lateral occipital regions, as well as inferior parietal cortex, including intraparietal sulcus and temporoparietal junction (TPJ). In particular, MT+, the lateral occipital complex and V3A were more active for binocular than monocular rivalry at all contrasts. The comparison of binocular rivalry with the replay condition was particularly important in isolating the neural substrates for the perception of rivalrous alternations.

One of the important results of the study was that both binocular and monocular rivalry showed a U-shaped function of activation as a function of contrast. Current models and concepts regarding binocular rivalry can explain this pattern (e.g. [51-54,56,69]). Rivalry models include inhibitory neurons in addition to excitatory neurons to account for interocular inhibition and suppression. In addition, the contribution of inhibition and suppression would generally be expected to lower the BOLD signal.
At high contrasts, we expect the activation to increase due to an increasing neuronal response gain, which also leads to faster alternation rates, explaining the increase from 18% to 36% contrast. The increase in activation at the lowest contrast can possibly be explained as reflecting disinhibition, assuming that the excitatory and inhibitory neurons have different thresholds. At low contrasts, inhibitory neurons would not be strongly activated, resulting in slower alternation rates. Thus the higher BOLD signal at 9% contrast might be due to a release from inhibition that accompanies slow alternation rates.
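The disinhibition account can be sketched with a toy contrast-response model: an excitatory population following a Naka-Rushton function, and an inhibitory population with a higher contrast threshold whose output is subtracted from the net signal. All parameters below are illustrative assumptions chosen only to reproduce the qualitative U-shape (higher net signal at 9% and 36% than at 18%), not values fitted to the fMRI data.

```python
import math

# Illustrative parameters (assumptions, not fitted values from the study).
C50 = 10.0       # semi-saturation contrast (%) of the excitatory drive
I_THRESH = 15.0  # contrast (%) at which inhibition reaches half strength
I_SLOPE = 3.0    # steepness of inhibitory recruitment
W_INH = 0.4      # weight of inhibition in the net signal

def excitatory_drive(c):
    """Naka-Rushton contrast response of the excitatory population."""
    return c / (c + C50)

def inhibitory_drive(c):
    """Sigmoidal recruitment of inhibition, with a higher contrast threshold."""
    return 1.0 / (1.0 + math.exp(-(c - I_THRESH) / I_SLOPE))

def bold_proxy(c):
    """Net signal: excitation minus weighted inhibition. At low contrast the
    inhibitory population is barely recruited (disinhibition), so the net
    signal can exceed that at intermediate contrast."""
    return excitatory_drive(c) - W_INH * inhibitory_drive(c)

for c in (9, 18, 36):
    print(f"{c:>2}% contrast: net signal = {bold_proxy(c):.3f}")
```

With these assumed parameters the net signal dips at 18% contrast, mirroring the U-shaped BOLD function: inhibition is hardly engaged at 9%, strongly engaged at 18%, and saturated at 36%, where rising excitatory gain dominates again.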

#### **4.3. Role of parietal areas**

An important result of the study was to show that in addition to the activation of visual areas presumed to be involved directly in competition between neural representations, there was also activity for either binocular or monocular rivalry in frontoparietal areas that are often implicated in attention, and previously identified for binocular rivalry [65-67,70]. The previous literature seems to indicate that the balance of frontal activation may have been slightly higher for multistability than rivalry (as shown in Figure 2). But in our study in which we matched binocular and monocular rivalry for stimulus features and used comparable tasks, the frontal and parietal activation was actually somewhat higher for binocular rivalry, and included temporoparietal junction (TPJ), which was not an area significantly activated for monocular rivalry. The TPJ is modulated by stimulus-driven attentional shifts to unexpected objects or events [65,67,70]. It is possible that the TPJ was less active for monocular rivalry since the perceptual changes did not signal a change in object identity, as in binocular rivalry. All the other forms of multistability studied in fMRI paradigms produced some TPJ activation, including ambiguous figures [29,32,35,37], apparent motion [19,30,33] or structure from motion [36]. Hence a change in object identity and stimulus-driven shifts to unexpected events may be very relevant to the perceptual experience of binocular rivalry and other forms of multistability.

#### **4.4. Visuospatial attention in control of multistability**

Two aspects of the deployment of attention in vision have been studied extensively using physiological methods: the effects of attention on modulating neural responses in early visual cortical areas, and the top-down control of attention from executive control regions of the brain [66,68]. The effect of visuospatial attention to a stimulus at a peripheral location (while maintaining central fixation) is to increase the cortical response associated with that stimulus in striate and extrastriate visual areas within the contralateral hemisphere compared to when that stimulus is not attended (e.g. [71]). This contralateral attention effect has been shown to operate on the precise retinotopic cortical representation of the attended stimulus. Visual attention can also operate by modulating the cortical responses to a given stimulus feature [71]. In contrast, the top-down control of spatial attention has been associated with activity in the dorsolateral prefrontal and posterior parietal cortex, including intraparietal sulcus and superior parietal lobe [68], and transient activity within these regions is thought to initiate a shift of attention between locations, features, or objects. Thus, the effect of attention is to modulate neural activity in visual areas, while the control of attention has been associated with transient activity in frontal and parietal cortex that occurs at the onset of attentional switches [65,70], in addition to sustained activity in these areas that maintains a given attentive state. Studies of the voluntary control of ambiguous figure reversals have also revealed transient frontoparietal activation, suggesting that there may be a common mechanism subserving the voluntary deployment of attention and voluntary control over perceptual bistability [32,34].


One pertinent study investigated whether the voluntary control of perceptual configuration in a multistable stimulus (Necker cube) is mediated by voluntary shifts of selective attention, using event-related functional imaging [32]. Two slightly different versions of the Necker cube display were used during attention and perception conditions. In the attention condition, participants were cued to shift attention between the squares in left and right hemifields. In the perception condition, corresponding corners of the squares were connected by horizontal lines producing a perceptually multistable Necker cube. Observers reported which of the two faces appeared forward in depth, and were provided with cues to induce voluntary perceptual reversals. Both the perception and attention conditions yielded increased activity in contralateral occipital visual areas (V1v, V2v, VP, V3, V3A, V4v, MT+, V1d, V2d). Furthermore, voluntary shifts of attention and voluntary shifts in perceptual configuration were associated with common activity in the posterior parietal cortex (superior parietal lobe and intraparietal sulcus), part of the frontoparietal attentional top-down control network [66]. These results support the hypothesis that voluntary shifts in perceptual bistability in the Necker cube are mediated by spatial attention [32].

#### **4.5. Transitions between percepts in binocular rivalry or multistability**

A recent study took a different approach to these issues, noting that a number of previous binocular rivalry studies have found a large network of frontal and parietal cortical areas (as in Figure 2) to be active around the time of perceptual transitions between interpretations [19]. As described earlier, some previous rivalry studies have used subtractions between rivalry and 'rivalry replay' conditions to isolate rivalry mechanisms, and these frontal and parietal activations were still present following these subtractions [2,20,22]. It is possible that this activation is related to the difficulty of judging the transitions during real rivalry alternations. The investigators noted in particular that some transitions occur virtually instantaneously, with one percept abruptly suppressing the alternative, whereas others comprise dynamic mixtures of both percepts for a period of time before one percept dominates completely. They therefore studied this frontoparietal activation with specific interest in its relation to the temporal structure of transitions. Using both bistable apparent motion and binocular rivalry, they found that transition-related frontoparietal activity is larger for transitions that last longer, suggesting that the frontoparietal activation persists throughout the duration of the transition. They also found that frontoparietal activity during binocular rivalry transitions exceeded activity during abrupt transitions simulated using rivalry replay, as found previously in a number of studies [2,20,22]. However, they showed that this only occurs when perceptual transitions are replayed as instantaneous events.
When replay depicts the transitions with the actual durations reported during rivalry, then transitions mimicked with replay and genuine rivalry produced equal activation levels in frontoparietal areas. The results are consistent with the view that at least a component of frontoparietal activation during bistable perception reflects a *response* to rivalrous (or replay) perceptual transitions rather than their *cause*. Hence the results shed light on the functional role of frontoparietal activity and the mechanisms underlying perceptual reorganizations during bistable perception. This activation could reflect the change in sensory experience and task demand that occurs during transitions, which fits well with the known role of these areas in attention and decision making [65-67,70,78,79].
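The dependence of transition-related activity on transition duration has a straightforward haemodynamic interpretation: a transition modelled as a boxcar whose width equals the reported transition duration produces a larger convolved response than an instantaneous event. A minimal sketch, using an illustrative single-gamma haemodynamic response function (HRF) and hypothetical durations:

```python
import math

def hrf(t):
    """Single-gamma haemodynamic response peaking near 5 s (illustrative)."""
    return 0.0 if t < 0 else t ** 5 * math.exp(-t) / 120.0

def peak_response(duration, dt=0.1, t_max=30.0):
    """Peak of a transition regressor: a boxcar of the given duration (s)
    convolved with the HRF. duration=dt approximates an instantaneous event."""
    n = int(t_max / dt)
    boxcar = [1.0 if i * dt < duration else 0.0 for i in range(n)]
    response = [
        sum(boxcar[j] * hrf((i - j) * dt) * dt for j in range(i + 1))
        for i in range(n)
    ]
    return max(response)

# Longer mixed-percept transitions yield larger transition-locked responses,
# so replay that depicts transitions as instantaneous events underestimates
# the response to genuine, prolonged rivalry transitions.
for d in (0.1, 1.0, 4.0):
    print(f"transition duration {d:>3} s -> peak response {peak_response(d):.3f}")
```

This is why matching the replayed transition durations to those reported during rivalry equalizes frontoparietal activation: the regressor amplitude, not a rivalry-specific mechanism, accounts for much of the difference.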


## **5. Methodological issues in fMRI studies and role of frontal areas**

Some of the differences in the results across depth, rivalry and multistability studies can be explained due to the use of differing methodology and functional imaging analysis methods. The majority of rivalry studies have used event-related designs which correlated activations in different brain areas to the start of each alternation [5,19,20,22,28,61], while a smaller number of rivalry studies used block designs in which stimulus blocks with rivalry were contrasted with blocks of rivalry replay [1,2,26]. One other rivalry study analyzed temporal correlations between cortical areas during passive viewing of rivalry [21]. In general, multistability studies have used methods which are quite similar to those used in rivalry studies. For example, a large number of multistability studies used event-related designs correlating brain activation to reversals [29,30,32,33,36], while others used block designs comparing multistability to baseline conditions [31,34,35,37], or multivariate pattern analysis to predict perceptual states [38].
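The mechanical difference between the two design families can be sketched by building their regressors: both convolve a stimulus time course with a haemodynamic response function (HRF), but an event-related regressor starts from impulses at each reported alternation, whereas a block regressor starts from a boxcar over the whole rivalry block. The gamma HRF and the timings below are illustrative assumptions, not values from any of the cited studies.

```python
import math

TR = 1.0       # volume sampling interval in seconds (assumed)
N_SCANS = 70   # length of the simulated run, in volumes

def hrf(t):
    """Single-gamma haemodynamic response peaking near 5 s (illustrative)."""
    return 0.0 if t < 0 else t ** 5 * math.exp(-t) / 120.0

def convolve_with_hrf(stimulus):
    """Discrete convolution of a stimulus time course with the HRF."""
    return [
        sum(stimulus[j] * hrf((i - j) * TR) for j in range(i + 1))
        for i in range(len(stimulus))
    ]

# Event-related design: an impulse at each reported perceptual alternation
# (hypothetical button-press times, in volumes).
alternation_onsets = {5, 13, 22, 36, 48}
event_stim = [1.0 if t in alternation_onsets else 0.0 for t in range(N_SCANS)]

# Block design: a boxcar spanning the whole rivalry block (volumes 5-49).
block_stim = [1.0 if 5 <= t < 50 else 0.0 for t in range(N_SCANS)]

event_reg = convolve_with_hrf(event_stim)
block_reg = convolve_with_hrf(block_stim)

# The block regressor rises to a sustained plateau, whereas the event-related
# regressor returns towards baseline between alternations, so the two designs
# are sensitive to different components of the response.
print(f"peak of event-related regressor: {max(event_reg):.2f}")
print(f"peak of block regressor:         {max(block_reg):.2f}")
```

An event-related analysis fits the transient regressor to each voxel's time course and so emphasizes alternation-locked activity, while a block analysis emphasizes sustained activity over the whole rivalry period; this difference alone can shift which areas reach significance.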

In contrast to rivalry or multistability, the majority of depth studies have used block designs. In these studies, stimulus blocks showing images with depth were contrasted with blocks with no depth [2,3,6,7,9-12,17], or correlated disparity versus anticorrelated disparity images [13]. Other depth studies used methods more similar to those used for binocular rivalry, such as multivoxel pattern analysis [8], event-related adaptation [15], event-related designs in which brain activation was correlated to changes in perceived depth [5,16], or adaptation in a block design to assess population responsiveness to different types of depth stimuli [14]. These differences in methodology also mean that subjects performed a task in rivalry or multistability studies using event-related designs [5,19,20,22,28-30,32,33,36] or block designs [1,2,26,31,34,35], but did not perform tasks in depth studies [3,6,9-13,17], although there are a few exceptions to this generalization for depth [2,4,5,7,17]. Also, a few rivalry or multistability studies did not use a task [21,37,61], and some rivalry studies used fixation tasks unrelated to the perception of rivalry [23,28].


These differences in methodology are obviously related to current concepts of rivalry and multistability as essentially dynamic perceptual phenomena while depth is static, but it should also be acknowledged that these differences could systematically affect the outcome of these studies. In particular, the fact that subjects usually performed a task in rivalry or multistability studies but not in depth studies could explain why frontoparietal activation was more likely to be reported for rivalry or multistability. However, this is not the whole story, as several studies have found that frontoparietal activation is present for passive viewing of rivalry (including areas SP, IP, PM, FEF, SMA, MF, IF and FO in Figure 2), even when there is no task [2,21]. That said, we noted in our own study that although activation in these widespread areas was still present, the absolute levels were lower with no task [2]. One multistability study which used passive viewing found that the typical parietal activation was present (superior and inferior parietal areas including TPJ), as well as one frontal site of activation (premotor cortex), but no significant activation in any other frontal areas; notably, there was none in middle or inferior frontal gyrus [37]. Hence, frontoparietal activation is still present when there is no task, but it is reduced.

Some other studies of multistability have used tasks involving spatial shifts of attention instead of the more typical motor responses. One particular study of multistability which used spatial shifts of attention between the two possible percepts but no motor reports found activation in parietal areas (SP, IP), and a smaller subset of frontal areas, including only SMA, PM, and MF [35]. As described above, a second study of multistability which used spatial shifts of attention between two possible percepts found voluntary shifts of attention associated with activation of essentially the same set of areas, namely parietal areas (SP, IP) and a small subset of frontal areas, including SMA but with no significant activation in MF, IF or prefrontal cortex [32]. Hence, the use of tasks involving spatial shifts of attention tends to restrict frontal activation, although the usual sites of parietal activation (i.e. SP, IP) are still present.

The use of an event-related design also has an impact on results. Studies of multistability which used block designs reported overall less frontal activation, although parietal activation was consistently reported in these studies [31,34,35,37]. This could be because frontal activity fluctuations occur only at the onset of alternations, so their contribution largely averages out over the longer block periods. Most binocular rivalry studies used event-related designs, so it is difficult to assess what effect this has on the results.

In comparing the results across the depth, rivalry and multistability studies, a few other trends are apparent. A number of occipital areas, notably V2, V3, V3A, V4d-topo, V7 and MT+, were more frequently reported in depth studies than in either rivalry or multistability. These areas may not have been frequently reported in studies of multistability because the analysis methods (usually event-related designs) would not find large signal differences in low-level visual representations, since the visual appearance of the stimulus barely changes during alternations. For example, multistability studies involving apparent motion usually did not report any activation in these visual areas (with the exception of MT+), likely because of the similarity in stimulus configuration between the two possible percepts [19,30,33]. Likewise, there may not have been large signal changes in these areas occurring at the onset of binocular rivalry alternations, because the two alternative percepts would not selectively activate any of these areas. Some of the rivalry and multistability studies that did report activation in these areas had one stimulus aspect in common: a depth interpretation was present in the stimulus alternatives (for example, a rotating structure-from-motion sphere, or slant/perspective rivalry [5,38]). In contrast, ventral temporal areas, including fusiform gyrus, were more likely to be active in studies which used faces as one of the two possible percepts, such as face/grating stimuli [20,21,24,29]. Another stimulus difference that could explain these trends is that several of the multistability examples had a dynamic aspect (e.g. apparent motion) in addition to the multistable percept itself. In general, the frontal activation was greater and included a larger number of areas for the dynamic examples of multistability [19,30,33,36,37,38], compared with static examples [29,31,34,35].

## **6. Future research**


A number of questions remain unanswered by the existing functional imaging studies on binocular depth, rivalry and multistability. Current models of binocular vision need to be revised in order to explain the interrelationship between depth and rivalry, and why they are processed in parallel through a number of cortical areas [51-54,56]. It is possible that the strong inhibitory interactions familiar from binocular rivalry serve the purpose of resolving ambiguity in binocular vision. The mechanisms for binocular rivalry may be important in inhibiting false matches at different orientations, suppressing noise in neural responses and sharpening the tuning of orientational mechanisms [59,60]. A more general binocular vision model would incorporate these important inhibitory mechanisms, together with the binocular matching that is necessary for depth perception. In addition, it is important to incorporate the finding that it is possible to perceive both depth and rivalry simultaneously at a single spatial location. A surface-level representation might facilitate the grouping of binocular depth and rivalry features, and allow more than one feature to be coded at a single spatial location [2].

A common set of frontoparietal cortical areas is activated during depth, rivalry and multistability, implying an underlying cortical network with a complex interplay of neural processing between areas that is not yet understood. Such frontoparietal activations could reflect top-down processes that initiate a reorganization of activity in visual cortex during perceptual reversals. Alternatively, they could merely reflect the feed-forward communication of salient neural events, arising from activity fluctuations in visual cortex, to higher-level areas. These two possibilities differ in the causal chain assumed to underlie changes in visual awareness, but it remains difficult to infer causality from correlative neurophysiological measures. Ideally, this would be addressed by probing the causal role of frontal and parietal areas with experimental lesion and microstimulation techniques. For example, a recent study that used transcranial magnetic stimulation (TMS) to create virtual lesions showed that particular frontal areas (e.g. dorsolateral prefrontal cortex) were causally relevant for voluntary control over perceptual switches in a multistable structure-from-motion stimulus [80]. Observations that activations in frontal and parietal areas precede activity associated with the sensory processing of perceptual switches also suggest that feedback signals from frontoparietal areas modulate visual processing [33,81].

However, other results reviewed earlier suggest that the ultimate resolution will be more nuanced and complicated than the dichotomy referred to above (e.g., [19,32,61]). One particularly appealing framework previously proposed suggests that the frontal and parietal areas form part of a sensorimotor continuum and are designed to periodically check or update the current perceptual organization in the visual system [82,83]. Hence this central control network would mediate between alternative perceptions for conscious awareness. This process may in fact occur all the time in natural vision, but would usually proceed unnoticed, resulting in a stable perception of the visual world. In any case, it will be important to carry out further studies in order to clarify the functional role of frontoparietal activity and determine the manner in which it relates to the mechanisms underlying perception in general, and reorganizations during bistable perception.

## **7. Conclusions**

A review of recent functional neuroimaging studies indicates that binocular depth, rivalry and multistability are three perceptual processing domains which share neural substrates, including largely overlapping occipital, parietal and frontal cortical areas. All three of these perceptual processing modalities can be conceptualized as a series of visual perceptual processing stages in occipital areas, together with higher-level cognitive functions in parietal and frontal areas, involving decision making, motor planning and execution, attention, awareness and memory. Current research will further examine the manner in which these cortical areas interact, and the causal sequence of events which underlies each of these three perceptual processing modalities, recalling some of the most important themes of neuroscience in these overlapping and interrelated functions.

## **Author details**

Athena Buckthought\* and Janine D. Mendola
*Department of Ophthalmology, McGill University, Montreal, Canada*

<sup>\*</sup> Corresponding Author

## **Acknowledgement**

This work was supported by NSERC and NIH R01 EY015219 grants, and a LOF grant from the Canadian Foundation for Innovation.

## **8. References**

[1] Buckthought A, Jessula S, Mendola JD (2011) Bistable Percepts in the Brain: FMRI Contrasts Monocular Pattern Rivalry and Binocular Rivalry. PLoS One 6: e20367.

[2] Buckthought A, Mendola JD (2011) A Matched Comparison of Binocular Rivalry and Depth Perception with FMRI. J Vis. 11: 1-15.

[3] Tsao DY, Vanduffel W, Sasaki Y, Fize D, Knutsen TA, Mandeville JB, Wald LL, Dale AM, Rosen BR, Van Essen DC, Livingstone MS, Orban GA, Tootell RBH (2003) Stereopsis Activates V3A and Caudal Intraparietal Areas in Macaques and Humans. Neuron 39: 555-568.

[4] Backus BT, Fleet DJ, Parker AJ, Heeger DJ (2001) Human Cortical Activity Correlates with Stereoscopic Depth Perception. J Neurophysiol. 86: 2054-2068.

[5] Brouwer GJ, van Ee R, Schwarzbach J (2005) Activation in Visual Cortex Correlates with the Awareness of Stereoscopic Depth. J Neurosci. 25: 10403-10413.

[6] Durand JB, Peeters R, Norman JF, Todd JT, Orban GA (2009) Parietal Regions Processing Visual 3D Shape Extracted from Disparity. Neuroimage 46: 1114-1126.

[7] Minini L, Parker AJ, Bridge H (2010) Neural Modulation by Binocular Disparity Greatest in Human Dorsal Visual Stream. J Neurophysiol. 104: 169-178.

[8] Preston TJ, Li S, Kourtzi Z, Welchman AE (2008) Multivoxel Pattern Selectivity for Perceptually Relevant Binocular Disparities in the Human Brain. J Neurosci. 28: 11315-11327.

[9] Georgieva S, Peeters R, Kolster H, Todd JT, Orban GA (2009) The Processing of Three-Dimensional Shape From Disparity in the Human Brain. J Neurosci. 29: 727-742.

[10] Rutschmann RM, Greenlee MW (2004) BOLD Response in Dorsal Areas Varies With Relative Disparity Level. Neuroreport 15: 615-619.

[11] Iwami T, Nishida Y, Hayashi O, Kimura M, Sakai M, Kani K, Ito R, Shiino A, Suzuki M (2002) Common Neural Processing Regions for Dynamic and Static Stereopsis in Human Parieto-Occipital Cortices. Neurosci Lett. 327: 29-32.

[12] Nishida Y, Hayashi O, Iwami T, Kimura M, Kani K, Ito R, Shiino A, Suzuki M (2001) Stereopsis-Processing Regions in the Human Parieto-Occipital Cortex. Neuroreport 12: 2259-2263.

[13] Bridge H, Parker AJ (2007) Topographical Representation of Binocular Depth in the Human Visual Cortex Using FMRI. J Vis. 7: 1-14.

[14] Neri P, Bridge H, Heeger DJ (2004) Stereoscopic Processing of Absolute and Relative Disparity in Human Visual Cortex. J Neurophysiol. 92: 1880-1891.

[15] Welchman AE, Deubelius A, Conrad V, Bulthoff HH, Kourtzi Z (2005) 3D Shape Perception From Combined Depth Cues in Human Visual Cortex. Nat Neurosci. 8: 820-827.

[16] Chandrasekaran C, Canon V, Dahmen JC, Kourtzi Z, Welchman AE (2007) Neural Correlates of Disparity-Defined Shape Discrimination in the Human Brain. J Neurophysiol. 97: 1553-1565.

[17] Gilaie-Dotan S, Ullman S, Kushnir T, Malach R (2002) Shape-Selective Stereo Processing in Human Object-Related Visual Areas. Hum Brain Mapp. 15: 67-79.

[18] Tyler CW, Likova LT, Kontsevich LL, Wade AR (2006) The Specificity of Cortical Region KO to Depth Structure. Neuroimage 30: 228-238.

[19] Knapen T, Brascamp J, Pearson J, van Ee R, Blake R (2011) The Role of Frontal and Parietal Brain Areas in Bistable Perception. J Neurosci. 31: 10293-10301.

[20] Lumer ED, Friston KJ, Rees G (1998) Neural Correlates of Perceptual Rivalry in the Human Brain. Science 280: 1930-1934.

[21] Lumer ED, Rees G (1999) Covariation of Activity in Visual and Prefrontal Cortex Associated with Subjective Visual Perception. Proc Natl Acad Sci U S A 96: 1669-1673.

[22] Wilcke JC, O'Shea RP, Watts R (2009) Frontoparietal Activity and Its Structural Connectivity in Binocular Rivalry. Brain Res. 1305: 96-107.

[23] Fang F, He S (2005) Cortical Responses to Invisible Objects in the Human Dorsal and Ventral Pathways. Nat Neurosci. 8: 1380-1385.

[24] Tong F, Nakayama K, Vaughan JT, Kanwisher N (1998) Binocular Rivalry and Visual Awareness in Human Extrastriate Cortex. Neuron 21: 753-759.

[25] Jiang Y, He S (2006) Cortical Responses to Invisible Faces: Dissociating Subsystems for Facial-Information Processing. Curr Biol. 16: 2023-2029.

[26] Lee S-H, Blake R (2002) V1 Activity is Reduced During Binocular Rivalry. J Vis. 2: 618–626.

[27] Polonsky A, Blake R, Braun J, Heeger DJ (2000) Neuronal Activity in Human Primary Visual Cortex Correlates with Perception During Binocular Rivalry. Nat Neurosci. 3: 1153-1159.

[28] Moradi F, Heeger DJ (2009) Inter-ocular Contrast Normalization in Human Visual Cortex. J Vis. 9: 1-22.

[29] Kleinschmidt A, Buchel C, Zeki S, Frackowiak RS (1998) Human Brain Activity During Spontaneously Reversing Perception of Ambiguous Figures. Proc Biol Sci. 265: 2427-2433.

[30] Sterzer P, Russ MO, Preibisch C, Kleinschmidt A (2002) Neural Correlates of Spontaneous Direction Reversals in Ambiguous Apparent Visual Motion. Neuroimage 15: 908-916.

[31] Raz A, Lamar M, Buhle JT, Kane MJ, Peterson BS (2007) Selective Biasing of a Specific Bistable-Figure Percept Involves FMRI Signal Changes in Frontostriatal Circuits: A Step Toward Unlocking the Neural Correlates of Top-Down Control and Self-Regulation. Am J Clin Hypn. 50: 137-156.

[32] Slotnick SD, Yantis S (2005) Common Neural Substrates for the Control and Effects of Visual Attention and Perceptual Bistability. Brain Res Cogn Brain Res. 24: 97-108.

[33] Sterzer P, Kleinschmidt A (2007) A Neural Basis for Inference in Perceptual Ambiguity. Proc Natl Acad Sci U S A 104: 323-328.

[34] Tracy JI, Flanders A, Madi S, Natale P, Goyal N, Laskas J, Delvecchio N, Pyrros A (2005) The Brain Topography Associated with Active Reversal and Suppression of an Ambiguous Figure. Eur J Cogn Psychol. 17: 267-288.

[35] Inui T, Tanaka S, Okada T, Nishizawa S, Katayama M, Konishi J (2000) Neural Substrates for Depth Perception of the Necker Cube; A Functional Magnetic Resonance Imaging Study in Human Subjects. Neurosci Lett. 282: 145–148.

[36] Raemaekers M, van der Schaaf ME, van Ee R, van Wezel RJA (2009) Widespread fMRI Activity Differences between Perceptual States in Visual Rivalry are Correlated with Differences in Observer Biases. Brain Res. 1252: 161-171.

[37] Schoth F, Waberski TD, Krings T, Gobbele R, Buchner H (2007) Cerebral Processing of Spontaneous Reversals of the Necker Cube. Neuroreport 18: 1335-1338.

[38] Brouwer GJ, van Ee R (2007) Visual Cortex Allows Prediction of Perceptual States During Ambiguous Structure-From-Motion. J Neurosci. 27: 1015-1023.

[39] Larsson J, Heeger DJ (2006) Two Retinotopic Visual Areas in Human Lateral Occipital Cortex. J Neurosci. 26: 13128-13142.

[40] Tootell RB, Hadjikhani NK, Vanduffel W, Liu AK, Mendola JD, Sereno MI, Dale AM (1998) Functional Analysis of Primary Visual Cortex (V1) in Humans. Proc Natl Acad Sci USA 95: 811–817.

[41] Ungerleider LG, Mishkin M (1982) Two Cortical Visual Systems. In: Ingle DJ, Goodale MA, Mansfield RJW, editors. Analysis of Visual Behavior. Cambridge, Mass: MIT Press. pp. 549–86.

[42] Orban GA, Claeys K, Nelissen K, Smans R, Sunaert S, Todd JT, Wardak C, Durand JB, Vanduffel W (2006) Mapping the Parietal Cortex of Human and Non-Human Primates. Neuropsychologia 44: 2647-2667.

[43] Kourtzi Z, Bülthoff HH, Erb M, Grodd W (2002) Object-Selective Responses in the Human Motion Area MT/MST. Nat Neurosci. 5: 17–18.

[44] Kourtzi Z, Erb M, Grodd W, Bulthoff HH (2003) Representation of the Perceived 3-D Object Shape in the Human Lateral Occipital Complex. Cereb Cortex 13: 911-920.


[45] Mendola JD, Dale AM, Fischl B, Liu AK, Tootell RBH (1999) The Representation of Real and Illusory Contours in Human Cortical Visual Areas Revealed by FMRI. J Neurosci. 19: 8560–8572.

[46] Moore C, Engel SA (2001) Neural Response to Perception of Volume in the Lateral Occipital Complex. Neuron 29: 277–286.

[47] Haushofer J, Baker CI, Livingstone MS, Kanwisher N (2008) Privileged Coding of Convex Shapes in Human Object-Selective Cortex. J Neurophysiol. 100: 753-62.

[48] Thier P, Karnath HO (1997) Parietal Lobe Contributions to Orientation in 3D Space, Volume 25. Heidelberg: Springer-Verlag.

[49] Liu CH, Collin CA, Chaudhuri A (2000) Does Face Recognition Rely on Encoding of 3-D Surface? Examining the Role of Shape-From-Shading and Shape-From-Stereo. Perception 29: 729-743.

[50] Liu CH, Ward J (2006) The Use of 3D Information in Face Recognition. Vision Res. 46: 768-73.

[51] Blake R (1989) A Neural Theory of Binocular Rivalry. Psychol Rev. 96: 145-167.

[52] Dayan P (1998) A Hierarchical Model of Binocular Rivalry. Neural Comput. 10: 1119-1135.

[53] Freeman AW (2005) Multistage Model for Binocular Rivalry. J Neurophysiol. 94: 4412–4420.

[54] Wilson HR (2007) Minimal Physiological Conditions for Binocular Rivalry and Rivalry Memory. Vision Res. 47: 2741-2750.

[55] Blake R, Yang YD, Wilson HR (1991) On the Coexistence of Stereopsis and Binocular Rivalry. Vision Res. 31: 1191-1203.

[56] Hayashi R, Maeda T, Shimojo S, Tachi S (2004) An Integrative Model of Binocular Vision: a Stereo Model Utilizing Interocularly Unpaired Points Produces Both Depth and Binocular Rivalry. Vision Res. 44: 2367-2380.

[57] Andrews TJ, Holmes D (2011) Stereoscopic Depth Perception During Binocular Rivalry. Front Hum Neurosci. 5: 1-5.

[58] Buckthought A, Wilson HR (2007) Interaction Between Binocular Rivalry and Depth in Plaid Patterns. Vision Res. 47: 2543-2556.

[59] Macknik SL, Martinez-Conde S (2004) Dichoptic Visual Masking Reveals That Early Binocular Neurons Exhibit Weak Interocular Suppression: Implications for Binocular Vision and Visual Awareness. J Cogn Neurosci. 16: 1049-1059.

[60] van Boxtel JJ, van Ee R, Erkelens CJ (2007) Dichoptic Masking and Binocular Rivalry Share Common Perceptual Dynamics. J Vis. 7: 1-11.

[61] Brouwer GJ, Tong F, Hagoort P, van Ee R (2009) Perceptual Incongruence Influences Bistability and Cortical Activation. PLoS One 4: e5056.

[62] Tong F, Engel S (2001) Interocular Rivalry Revealed in the Human Cortical Blind-Spot Representation. Nature 411: 195–199.

[63] Haynes JD, Deichmann R, Rees G (2005) Eye-Specific Effects of Binocular Rivalry in the Human Lateral Geniculate Nucleus. Nature 438: 496-499.

[64] Wunderlich K, Schneider KA, Kastner S (2005) Neural Correlates of Binocular Rivalry in the Human Lateral Geniculate Nucleus. Nat Neurosci. 8: 1595-1602.

[65] Corbetta M, Kincade JM, Ollinger JM, McAvoy MP, Shulman GL (2000) Voluntary Orienting is Dissociated from Target Detection in Human Posterior Parietal Cortex. Nat Neurosci. 3: 292-297.

[66] Corbetta M, Shulman GL (2002) Control of Goal-Directed and Stimulus-Driven Attention in the Brain. Nat Rev Neurosci. 3: 201-215.

[67] Kincade JM, Abrams RA, Astafiev SV, Shulman GL, Corbetta M (2005) An Event-Related Functional Magnetic Resonance Imaging Study of Voluntary and Stimulus-Driven Orienting of Attention. J Neurosci. 25: 4593-4604.

[68] Yantis S, Serences JT (2003) Cortical Mechanisms of Space-Based and Object-Based Attentional Control. Curr Opin Neurobiol. 13: 187–193.

[69] Freeman AW, Nguyen VA, Alais D (2005) The Nature and Depth of Binocular Rivalry Suppression. In: Alais D, Blake R, editors. Binocular Rivalry. Cambridge, MA: MIT Press. pp. 47-62.

[70] Corbetta M, Patel G, Shulman GL (2008) The Reorienting System of the Human Brain: From Environment to Theory of Mind. Neuron 58: 306-324.

[71] Corbetta M, Miezin FM, Dobmeyer S, Shulman GL, Petersen SE (1990) Attentional Modulation of Neural Processing of Shape, Color, and Velocity in Humans. Science 248: 1556–1559.

[72] Necker LA (1832) Observations on Some Remarkable Phenomena Seen in Switzerland, and an Optical Phenomenon Which Occurs on Viewing of a Crystal or Geometrical Solid. Philosophical Magazine 1: 329-337.

[73] Rubin E (1921) Visuell Wahrgenommene Figuren. Kobenhaven: Gyldendalske Boghandel.

[74] Meng M, Tong F (2004) Can Attention Selectively Bias Bistable Perception? Differences Between Binocular Rivalry and Ambiguous Figures. J Vis. 4: 539-551.

[75] Ramachandran VS, Anstis SM (1985) Perceptual Organization in Multistable Apparent Motion. Perception 14: 135-143.

[76] Breese BB (1899) On Inhibition. Psychological Monographs 3: 1-65.

[77] O'Shea RP, Parker A, La Rooy D, Alais D (2009) Monocular Rivalry Exhibits Three Hallmarks of Binocular Rivalry: Evidence for Common Processes. Vision Res. 49: 671-681.

[78] Cole MW, Schneider W (2007) The Cognitive Control Network: Integrated Cortical Regions with Dissociable Functions. Neuroimage 37: 343-360.

[79] Pastor MA, Day BL, Macaluso E, Friston KJ, Frackowiak RS (2004) The Functional Neuroanatomy of Temporal Discrimination. J Neurosci. 24: 2585-2591.

[80] de Graaf TA, de Jong MC, Goebel R, van Ee R, Sack AT (2011) On the Functional Relevance of Frontal Cortex for Passive and Voluntarily Controlled Bistable Vision. Cereb Cortex 21: 2322-2331.

[81] Britz J, Pitts MA, Michel CM (2011) Right Parietal Brain Activity Precedes Perceptual Alternation During Binocular Rivalry. Hum Brain Mapp. 32: 1432-1442.

[82] Leopold DA, Logothetis NK (1999) Multistable Phenomena: Changing Views in Perception. Trends Cogn Sci. 3: 254-264.

[83] Mendola JD, Conner IP, Sharma S, Bahekar A, Lemieux S (2006) FMRI Measures of Perceptual Filling-In in the Human Visual Cortex. J Cogn Neurosci. 18: 363-75.

**Chapter 5** 

## **Visual Motion: From Cortex to Percept**

Craig Aaen-Stockdale and Benjamin Thompson

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/50402

© 2012 Aaen-Stockdale and Thompson, licensee InTech. This is an open access chapter distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

## **1. Introduction**

A brown bear pads across a snowfield. It is watched (hopefully from a safe distance) by an observer. As the bear tramps through the snow, a bear-shaped patch of darkness is projected onto the back of the observer's eye. The motion of this image across the observer's otherwise brightly illuminated retina causes a series of changes in the activity of densely packed photoreceptors that are sensitive to changes in light intensity. The observer's visual system can, as the bear progresses, perform the remarkable feat of computing its speed and direction of motion – and the speed and direction of each of the bear's constituent parts from many million changes in neural firing rate. This ability has clear evolutionary advantages, and as such it has been widely selected for in the animal kingdom.

Less common is the ability to detect motion that is not based on changes in luminance. To return to our wintery example, the force and direction of the wind or the presence of a smaller animal burrowing under the snow can be determined by detecting changes in the pattern of random flicker caused as flakes of snow on the ground are disturbed. Here, the changes in luminance do not, in themselves, signal any consistent speed or direction of motion, but the movement can clearly be seen.

In this chapter, we review key aspects of visual motion perception with a particular emphasis on the cortical areas thought to be involved. We begin with the integration of motion signals across extended regions of the visual field, which is central to the ability of the visual cortex to bind multiple features together into a coherent, stable visual percept. We then move on to the question of plasticity within the early cortical areas responsible for motion perception, and review the brain regions thought to be involved in the processing of complex motion information such as the motion signals that flow over the retina as an observer moves around in their environment. In the final two sections of the chapter we consider the mechanisms involved in the perception of motion in the absence of useful luminance information, and the consequences of lesions and abnormal development affecting the cortical areas responsible for motion perception.


## **2. Cortical processing of visual information**

Visual processing has often been thought of as being subdivided into two parallel processing streams known as the parvocellular (also known as ventral) and magnocellular (also known as dorsal) pathways (Ungerleider & Mishkin, 1982; Goodale & Milner, 1992). This delineation begins at the level of retinal ganglion cells. Small (midget) and larger (parasol) ganglion cells project, respectively, to the distinct parvocellular ("P") and magnocellular ("M") layers of the lateral geniculate nucleus of the thalamus (LGN) (Derrington & Lennie, 1984; Merigan et al., 1991). In turn, these M and P LGN cells project to distinct sub-regions of layer 4c within the primary visual cortex (Hubel & Wiesel, 1972). According to the "dual pathway" model, the parvocellular pathway, primarily carrying high spatial frequency (fine detail) and colour information, then projects to ventral areas of the extrastriate cortex such as V4. Conversely, the magnocellular pathway, primarily carrying low spatial (coarse detail) and high temporal frequency information, innervates dorsal extrastriate regions such as the middle temporal visual area (MT) also known as V5 and the middle superior temporal visual area (MST). These projections are thought to produce cortical pathways specialized for form processing and spatial position/motion perception respectively (Ungerleider & Haxby, 1994). It is now clear that there is considerable crosstalk between these two pathways and that other connections exist between the retina and the brain that include the koniocellular layers of the LGN and other thalamic regions such as the superior colliculus and pulvinar (see de Haan & Cowey, 2011 for a recent review). However the concept of parallel processing has provided a useful framework for the investigation of motion perception and has inspired a large number of psychophysical studies in this area.

Visual Motion: From Cortex to Percept 113

Bergen, 1985). Alternative models have been suggested based on inhibitory interactions between adjacent regions within the receptive field (Barlow & Levick, 1965), spatiotemporal differencing (Marr & Ullman, 1981) or spatiotemporal gradients (Johnston et al., 1992), but

**2.2. Global motion analysis – V3A and the middle temporal visual area (MT/V5)** 


## **2. Cortical processing of visual information**

Visual processing has often been thought of as being subdivided into two parallel processing streams known as the parvocellular (also known as ventral) and magnocellular (also known as dorsal) pathways (Ungerleider & Mishkin, 1982; Goodale & Milner, 1992). This delineation begins at the level of retinal ganglion cells. Small (midget) and larger (parasol) ganglion cells project, respectively, to the distinct parvocellular ("P") and magnocellular ("M") layers of the lateral geniculate nucleus of the thalamus (LGN) (Derrington & Lennie, 1984; Merigan et al., 1991). In turn, these M and P LGN cells project to distinct sub-regions of layer 4c within the primary visual cortex (Hubel & Wiesel, 1972). According to the "dual pathway" model, the parvocellular pathway, primarily carrying high spatial frequency (fine detail) and colour information, then projects to ventral areas of the extrastriate cortex such as V4. Conversely, the magnocellular pathway, primarily carrying low spatial (coarse detail) and high temporal frequency information, innervates dorsal extrastriate regions such as the middle temporal visual area (MT), also known as V5, and the medial superior temporal visual area (MST). These projections are thought to produce cortical pathways specialized for form processing and spatial position/motion perception respectively (Ungerleider & Haxby, 1994). It is now clear that there is considerable crosstalk between these two pathways and that other connections exist between the retina and the brain that include the koniocellular layers of the LGN and other subcortical structures such as the superior colliculus and the pulvinar (see de Haan & Cowey, 2011 for a recent review). However, the concept of parallel processing has provided a useful framework for the investigation of motion perception and has inspired a large number of psychophysical studies in this area.

## **2.1. Local motion analysis – V1**

The first port of call for the majority of visual information in the cortex is the primary visual cortex, or V1. It is also often referred to as the "striate cortex" due to its stratified appearance under the microscope. Thanks to the Nobel prize-winning experiments of Hubel and Wiesel, we know that single striate cortex neurons in the cat (Hubel & Wiesel, 1959) and monkey (Hubel & Wiesel, 1968) respond best to oriented lines and that some of these neurons are also selective for the direction in which a luminance-defined stimulus is moved across their receptive field. The neural architecture necessary to achieve luminance-based motion detection and discrimination is, therefore, already in place at the relatively low level of V1. Various computational theories have successfully modelled the detection of this type of luminance-based motion (Adelson & Bergen, 1985; van Santen & Sperling, 1985; Watson & Ahumada, 1985). The method common to all of these models is to combine the outputs of two neurons, one whose receptive field has an "odd" (sine phase) spatial profile and an "even" (cosine phase) temporal profile and another neuron whose receptive field has an even spatial profile and an odd temporal profile. By appropriately combining the outputs of non-directional V1 neurons whose spatial and temporal responses are 90° out-of-phase (often referred to as "quadrature pairs"), motion energy can be recovered (Adelson & Bergen, 1985). Alternative models have been suggested based on inhibitory interactions between adjacent regions within the receptive field (Barlow & Levick, 1965), spatiotemporal differencing (Marr & Ullman, 1981) or spatiotemporal gradients (Johnston et al., 1992), but the motion energy model is currently the dominant model of V1 motion selectivity.
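The quadrature-pair computation described above can be sketched numerically. The following is a minimal, non-causal illustration of the motion-energy scheme in one spatial dimension; the Gaussian windows, filter frequency and grid are our own illustrative choices, not parameters from the original model, which also uses causal temporal filters.

```python
import numpy as np

# One-dimensional (space-time) sketch of the motion-energy computation.
# Gaussian windows and the preferred frequency are illustrative choices.
x = np.linspace(-1.0, 1.0, 64)               # space
t = np.linspace(-1.0, 1.0, 64)               # time
X, T = np.meshgrid(x, t)
f = 2.0                                      # preferred spatial/temporal frequency
env = np.exp(-X**2 / (2 * 0.4**2) - T**2 / (2 * 0.4**2))

# Separable, non-directional filters: odd/even spatial x odd/even temporal.
fe_ge = env * np.cos(2*np.pi*f*X) * np.cos(2*np.pi*f*T)
fe_go = env * np.cos(2*np.pi*f*X) * np.sin(2*np.pi*f*T)
fo_ge = env * np.sin(2*np.pi*f*X) * np.cos(2*np.pi*f*T)
fo_go = env * np.sin(2*np.pi*f*X) * np.sin(2*np.pi*f*T)

# Sums and differences of the separable filters yield direction-selective
# quadrature pairs oriented in space-time.
right_pair = (fo_ge - fe_go, fe_ge + fo_go)
left_pair = (fo_ge + fe_go, fe_ge - fo_go)

def motion_energy(stim, pair):
    """Squared responses of a quadrature pair, summed (phase-invariant)."""
    a, b = pair
    return np.sum(stim * a)**2 + np.sum(stim * b)**2

stim = np.cos(2*np.pi*f*(X - T))             # grating drifting rightward
opponent = motion_energy(stim, right_pair) - motion_energy(stim, left_pair)
print(opponent > 0)                          # True: net rightward energy
```

Squaring and summing each quadrature pair gives a response that depends on the direction and speed of the stimulus but not on its phase; the opponent (rightward minus leftward) energy is positive for the rightward-drifting grating and negative for its leftward counterpart.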

## **2.2. Global motion analysis – V3A and the middle temporal visual area (MT/V5)**

The year 1985 was a seminal year for the study of visual motion (Burr & Thompson, 2011) seeing, as it did, publication of several influential models of local motion processing (Adelson & Bergen, 1985; van Santen & Sperling, 1985; Watson & Ahumada, 1985). Although it had taken a great leap forward, the race to understand the processing of visual motion was, however, far from over. It is one thing to understand how local motion selectivity arises in single neurons, but quite another to understand how these local motion signals are combined across space to give the perception of moving edges, surfaces and objects. The major hurdle (and it is a significant one) is that the output of individual local motion detectors, such as those present in V1 and those modelled in the literature mentioned above, is often *ambiguous.*

Since V1 neurons (or a hypothetical local motion detector) only "see" a small portion of the world, they do not respond maximally to a single stimulus form, but to a whole family of motions. This problem, referred to as the *aperture problem*, is demonstrated in Figure 1.

By integrating the outputs of many motion detectors over larger and larger portions of the visual field, the visual system can disambiguate these signals and extract genuine motion from the inherently ambiguous local motion signals, a processing stage known as the extraction of *global motion.* Such a process is thought to occur in extrastriate cortical areas such as V3A and MT, both of which contain cells that respond to coherent global motion (Allman et al., 1985; Movshon et al., 1985; Newsome & Pare, 1988; Rodman & Albright, 1989; Salzman et al., 1992; Heeger et al., 1999; Braddick et al., 2001). Similar properties seem to be present in the human analogue of MT known as V5 or hMT+ (Beckers & Zeki, 1995; Tootell et al., 1995; Huk & Heeger, 2002; Cowey et al., 2006). For example, we have recently demonstrated (Figure 2) that inhibitory repetitive transcranial magnetic stimulation (rTMS) of human V5 can impair the combination of local motion signals into a global motion percept (Thompson et al., 2009). With such abundant evidence that this cortical region is crucial to the perception of global motion, a question still to be answered is what sort of combinatorial processes are actually occurring in V5?

Several models of local motion combination have been proposed. A widely cited early model is the *intersection-of-constraints* (Adelson & Movshon, 1982) in which local one-dimensional (1D) motions (see legend of Figure 1) are extracted from the two-dimensional (2D) visual stimulus, their respective constraint lines are computed and the motion of the pattern corresponds to the intersection of these constraint lines. The intersection-of-constraints is a geometrical solution to the aperture problem and therefore provides a useful point of comparison for psychophysical data and resulting models aiming to understand how global motion is computed within the brain. A number of psychophysical results have suggested that perception does not always follow the intersection of constraints rule (Ferrera & Wilson, 1990; Yo & Wilson, 1992) and, as such, alternative models have been developed. One key alternative is the *vector sum* model (Wilson et al., 1992; Wilson & Kim, 1994) in which global motion is computed as the sum of the local motion vectors.
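The two rules are easy to compare numerically. Below is a minimal sketch for a symmetric plaid; the component angles and speeds are illustrative values, not taken from any cited experiment.

```python
import numpy as np

def ioc(normals, speeds):
    """Intersection-of-constraints: the pattern velocity v satisfying
    n_i . v = s_i for every 1D component (normal n_i, speed s_i)."""
    v, *_ = np.linalg.lstsq(np.asarray(normals, float),
                            np.asarray(speeds, float), rcond=None)
    return v

def vector_sum(normals, speeds):
    """Vector-sum prediction: simply add the 1D component motion vectors."""
    N = np.asarray(normals, float)
    return (N * np.asarray(speeds, float)[:, None]).sum(axis=0)

# Two gratings whose normals point 30 deg either side of rightward, 1 deg/s each.
a = np.deg2rad(30.0)
normals = [[np.cos(a), np.sin(a)], [np.cos(a), -np.sin(a)]]
speeds = [1.0, 1.0]

v_ioc = ioc(normals, speeds)         # [1/cos(30 deg), 0]: rightward, ~1.15 deg/s
v_sum = vector_sum(normals, speeds)  # [2*cos(30 deg), 0]: rightward, ~1.73 deg/s
```

For this symmetric plaid the two rules agree on direction and differ only in predicted speed; for asymmetric ("type II") plaids, where both component normals fall on the same side of the true pattern direction, the predicted directions diverge as well, which is what the psychophysical tests cited above exploit.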

Visual Motion: From Cortex to Percept 115

**Figure 1.** The aperture problem. A) An edge, represented by the black line, moves in a particular direction, at a particular speed, through the receptive field of a motion-sensitive neuron (blue circle). The direction in which the edge is moving is represented by the red arrow, and the speed at which it is moving is represented by the length of the arrow. Assume that the neuron is sensitive to stimuli of this particular orientation moving at this particular speed, and so fires. The aperture problem arises because exactly the same response can be achieved by moving another edge, oriented at the same angle, through the receptive field in a completely different direction, at a much faster speed. Since the edge is uniform and featureless, the neuron is unable to determine its direction and speed of motion. The *global motion* of the edge is therefore ambiguous. B) It turns out that exactly the same stimulus can be reproduced by a family of different combinations of speed and direction constrained by a *constraint line* (dashed line) perpendicular to the end point of the shortest (i.e. slowest) motion vector. As such, only motion perpendicular to a contour can be detected by a single detector. It is therefore referred to as *one-dimensional* motion.

**Figure 2.** Evidence for the role of human V5 in the processing of global motion. The top panel depicts plaid stimuli that are constructed from two gratings moving in different directions within a circular aperture. If the spatial properties of the gratings are sufficiently dissimilar then the two gratings are seen to drift over one another in directions orthogonal to their orientation due to the aperture problem described above (top-left image). However, if the gratings are similar to one another they may be perceived as a coherent patterned surface moving globally in a different direction from either of the component gratings (top-right image). The lower panel shows average data from 11 participants who viewed ambiguous plaid patterns that could be perceived as either coherent or incoherent before and after inhibitory repetitive transcranial magnetic stimulation of either V1 or V5. The probability of perceiving the stimuli as moving coherently is shown for three different time points after stimulation. rTMS of V5 significantly reduced the probability that participants would perceive plaids as moving coherently directly after stimulation, suggesting that this area is centrally involved in global motion perception. Stimulation of V1 had the opposite effect. Data re-plotted from Thompson et al. (2009).

Both of these models are, however, two-stage operations in which the 2D stimulus is decomposed into its local 1D motions and then reconstructed according to a mathematical rule. There are a number of psychophysical results that are not consistent with this decomposition-recombination approach. The ability to discriminate the direction of motion of a plaid appears to depend critically upon the speed of the 2D features or "blobs" in the stimulus, not the speed of its components (Derrington & Badcock, 1992; Wright & Gurney, 1992). In addition, the size and number of blobs within plaid stimuli influence the perceived motion direction and the direction of the associated motion after-effects (Alais et al., 1994; Alais et al., 1996; Alais et al., 1997). Furthermore, physiological data show that motion-selective cells in V1 that have broad orientation tuning respond to the motion of 2D features in the stimulus (Tinsley et al., 2003) and presumably feed this information forward to MT/V5, where it is combined with 1D local motion signals to influence perception. The visual system has for some time been likened to a Fourier analyser that breaks down the complicated visual stimulus into its simpler 1D components and then recombines them (Campbell & Robson, 1968). The work of Tinsley et al. and Alais and colleagues is by no means fatal to the two-stage conception of global motion analysis, but rather reminds us that the first stage is not a straightforward global Fourier analysis. In addition, these data suggest that the visual system may shift between different strategies for combining local motion based on the task demands and available information (Nishida, 2011).
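The "Fourier analyser" idea is easy to demonstrate: a plaid is the literal sum of two gratings, and a frequency-domain analysis recovers the two 1D components as distinct spectral peaks. The grating frequencies below are arbitrary illustrative values.

```python
import numpy as np

# A plaid as the sum of two gratings; its 2D spectrum shows one +/- frequency
# pair per component. Frequencies are arbitrary integer cycles per image.
n = 128
yy, xx = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
g1 = np.cos(2*np.pi*(8*xx + 4*yy) / n)      # component grating 1
g2 = np.cos(2*np.pi*(8*xx - 4*yy) / n)      # component grating 2
plaid = g1 + g2

spectrum = np.abs(np.fft.fft2(plaid))
peaks = np.argwhere(spectrum > spectrum.max() / 2)
print(len(peaks))   # 4: each real-valued grating contributes a +/- frequency pair
```

The point made in the text is that the visual system's first stage is only loosely like this: 2D features ("blobs") also drive early motion-selective cells, so the decomposition is not a clean global Fourier analysis.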



### *2.2.1. Spatial suppression*

Centre-surround antagonism is a common feature of the visual system that begins at the level of the retina and assists segmentation of the visual image into objects and background. Briefly, stimulation in a neuron's central receptive field causes excitation, whilst stimulation in the receptive field surround results in inhibition. These two signals may cancel each other out if stimulation of the centre and surround is uniform. This results in neurons that are selective for discontinuities in the visual image. At the level of the cortex, excitatory and inhibitory regions are organised in such a way that neurons show selectivity for a variety of properties such as orientation, spatial frequency or motion within their receptive field centres. This central region is sometimes referred to as the "classical" receptive field. Many other cortical neurons also show a substantial inhibitory surround whose stimulus selectivity is antagonistic to the receptive field centre. It has been argued that this fundamental property of the visual system may give rise to some curious psychophysical correlates, including one well-known motion-based perceptual effect termed "spatial suppression."

As the retinal size of a stimulus increases, one would expect motion detection and discrimination to improve according to spatial summation of contrast resulting from the recruitment and integration of progressively larger numbers of motion sensors. Contrary to this idea, Tadin et al. (2003) noted that observers got *worse* at discriminating the direction of motion as the size of a high contrast stimulus was increased. Spatial summation occurred as expected only if the patch was low contrast, resulting in *better* performance as size increased. Tadin et al. proposed that their psychophysical results were a perceptual correlate of centre-surround antagonism in motion-selective neurons in cortical visual area V5, and have recently reported TMS findings to support this view (Tadin et al., 2011). The rationale is that large, high contrast stimuli activate *both* the excitatory centre and inhibitory surround of cells in V5, resulting in a less robust neural representation of motion. Further, this effect does not occur for low contrast stimuli, as the inhibitory surround requires high contrasts to become active.
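The proposed mechanism can be captured in a toy model: an excitatory centre and a larger inhibitory surround that both grow with stimulus size, with the surround's gain rising steeply with contrast. All parameter values below are our own arbitrary choices, made only to reproduce the qualitative pattern, not values fitted to any published data.

```python
import numpy as np

def response(size, contrast, ci=0.9, se=1.0, si=3.0, c50=0.3):
    """Toy centre-surround model: excitation minus contrast-gated inhibition.
    Parameters are illustrative only (ci: surround strength, se/si: centre and
    surround sizes, c50: contrast at which the surround is half-engaged)."""
    center = 1.0 - np.exp(-(size / se)**2)             # saturates quickly with size
    surround = ci * (1.0 - np.exp(-(size / si)**2))    # grows over a larger range
    gain = contrast**2 / (contrast**2 + c50**2)        # surround needs high contrast
    return contrast * (center - gain * surround)

# High contrast: bigger is worse. Low contrast: bigger is better.
print(response(8.0, 0.92) < response(1.0, 0.92))   # True (spatial suppression)
print(response(8.0, 0.05) > response(1.0, 0.05))   # True (spatial summation)
```

Because the surround's gain is negligible at low contrast, the model reproduces the size-dependent reversal that motivates the centre-surround interpretation, without committing to where in the visual hierarchy the suppression arises.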

It was subsequently reported that older observers (over 60 years of age) showed weaker spatial suppression, which paradoxically led to *better* performance than younger observers in the high contrast conditions (Betts et al., 2005). The authors proposed that this weaker spatial suppression in older observers is a perceptual correlate of age-related changes in GABA-mediated inhibitory processes in the brain (Leventhal et al., 2003). Similar claims regarding a weakening (or not) of centre-surround antagonism in cortical areas have since been made using the same psychophysical technique for observers with schizophrenia (Tadin et al., 2006), depression (Golomb et al., 2009) and migraine (Battista et al., 2010).


However, the proposed correlation of this psychophysical effect to centre-surround antagonism in cortical area V5, and the connection with age- or psychiatric-related changes in the GABA inhibitory system, has been criticised on the basis that the evidence to date is circumstantial (Aaen-Stockdale et al., 2009; Wallisch & Kumbhani, 2009).

Chen and colleagues propose that spatial suppression is indeed a consequence of centre-surround antagonism, but that this surround inhibition is not occurring in V5 (Chen, 2011). They argue that the simple gratings or 100% coherent dot stimuli used in previous studies of these effects would primarily activate earlier motion-sensitive areas such as V1, that also show centre-surround antagonism. Rather than use these types of stimuli, they instead used a variable-coherence random dot stimulus. This sort of stimulus requires the involvement of integration and segregation mechanisms found at higher levels of visual cortex, and would result in greater activation of V5. Using this global motion stimulus, Chen et al. tested a group of schizophrenic participants and found that, contrary to previous studies, the influence of the surround was *stronger* in schizophrenics (Chen et al., 2008) not *weaker,* suggesting that these patients have weaker inhibitory mechanisms in early visual cortex, not V5.
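A variable-coherence random-dot kinematogram of the kind described above is straightforward to generate. The minimal frame-update routine below (function and parameter names are our own, not from the cited studies) moves a "signal" fraction of dots in a common direction and re-plots the rest in random directions.

```python
import numpy as np

def update_dots(xy, coherence, step=0.02, field=1.0, rng=None):
    """One frame of a random-dot kinematogram: a `coherence` fraction of dots
    steps rightward; the remainder step in random directions. Dots wrap at
    the field edges."""
    rng = np.random.default_rng(rng)
    n = len(xy)
    signal = rng.random(n) < coherence             # which dots carry the signal
    angles = np.where(signal, 0.0, rng.uniform(0.0, 2*np.pi, n))
    xy = xy + step * np.column_stack([np.cos(angles), np.sin(angles)])
    return np.mod(xy, field), signal

rng = np.random.default_rng(0)
dots = rng.random((200, 2))                        # random starting positions
dots, signal = update_dots(dots, coherence=0.5, rng=1)
print(signal.mean())                               # roughly 0.5: the coherent fraction
```

Because the direction signal only exists in the pooled dot population, discriminating it requires spatial integration of many local motion samples, which is the argument for this stimulus preferentially taxing global-motion mechanisms such as those in V5.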

The data from migraineurs are also difficult to explain purely on the basis of centre-surround interactions in V5. Migraineurs show stronger than normal spatial suppression, rather than the weaker suppression found in depressives and schizophrenics (Battista et al., 2010). This is unexpected considering that one of the dominant theories of migraine proposes that it results from cortical hyperexcitability caused by reduced cortical inhibition, a theory for which there is some physiological (Aurora et al., 2005; Chadaide et al., 2007; Brighina et al., 2009) and psychophysical (Palmer et al., 2000) support. The psychophysical results of Battista et al. using the spatial suppression paradigm (Tadin et al., 2003) would therefore seem to be at odds with the reduced inhibition model of migraine. This contradiction could be resolved if migraine resulted from primary neural hyperexcitability, but it is unclear whether this is the case (Aurora & Wilkinson, 2007; Coppola et al., 2007).

With regard to the weaker spatial suppression reported in older observers (Betts et al., 2005), subsequent studies have failed to replicate this effect (Karas & McKendrick, 2011) and other studies, again using stimuli designed to selectively target V5, have concluded that any motion deficits in older observers are primarily a result of contrast sensitivity loss (Allen et al., 2010). Intrigued by the counterintuitive idea that older observers were performing better than their younger counterparts, we carried out a series of experiments in which we reproduced a "suppressive" effect in younger observers very similar to that of Tadin et al., and we also showed that this effect was absent in older observers, akin to the study of Betts et al. (Aaen-Stockdale et al., 2009). However, we also obtained contrast thresholds for all observers at all stimulus sizes and calculated the suprathreshold contrast for each stimulus. In this analysis, we found that the "suppressive" effect (and its absence in older observers) was entirely predictable from the observer's contrast threshold. This explanation of psychophysical spatial suppression based on low-level visual mechanisms has however been disputed (Glasser & Tadin, 2010).
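The normalisation used in that re-analysis is simple to state: rather than comparing groups at the same physical contrast, each stimulus is expressed relative to the observer's own detection threshold for that stimulus size. The threshold values below are invented purely for illustration, not the published data.

```python
# Hypothetical contrast thresholds per observer group and stimulus size
# (illustrative values only, not from Aaen-Stockdale et al., 2009).
stimulus_contrast = 0.92

thresholds = {
    "young, small patch": 0.02,
    "young, large patch": 0.08,
    "older, small patch": 0.05,
    "older, large patch": 0.09,
}

# Express the same physical stimulus as a multiple of each threshold.
suprathreshold = {cond: stimulus_contrast / thr for cond, thr in thresholds.items()}

# The identical 92%-contrast stimulus sits at very different distances above
# detection threshold across conditions - differences that can masquerade as
# differences in "suppression".
for cond, multiple in suprathreshold.items():
    print(f"{cond}: {multiple:.0f}x threshold")
```

On this view, apparent group differences in spatial suppression can fall out of low-level contrast sensitivity alone, which is the crux of the dispute described above.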


coherence task in their blind hemifield (Huxlin et al., 2009). This raises the intriguing possibility that V1 may not be necessary for perceptual learning of specific types of motion.

It would appear, therefore, that MT is centrally involved in perceptual learning of motion-based tasks. However, the available neurophysiological data, collected using random dot kinematogram stimuli, do not directly support the hypothesis that perceptual learning leads to long-term plastic changes within MT. For example, Zohary et al. (1994) found that neurons within MT and MST narrowed their directional tuning and increased their firing rate as monkeys improved their performance on a motion coherence task *within* a single training session. However, these changes did not persist *across* multiple training sessions. More recent neurophysiological work, also in monkeys, has implicated the lateral intraparietal area in perceptual learning of coherent motion perception (Law & Gold, 2008). This suggests that perceptual learning of specific types of motion stimuli may rely on changes in the way that the responses of cells within MT are 'read out' by higher level extrastriate areas. Whether this is the case for the human brain and for other types of motion stimuli remains to be determined.

**2.3. Complex motion analysis – the middle superior temporal area (MST) and V6** 

Still higher cortical areas respond to *complex motion* signals such as global expansion, contraction and rotation. These types of motion are particularly interesting, because they are generated by the interaction between an observer and the environment. For example, radial patterns of motion such as expansion and contraction occur on the retina when objects approach or recede from an observer, respectively. These patterns of motion could be caused by motion of the object, the head and body or both. Similarly, rotational patterns of motion can be caused either by tilting of the head or physical rotation of an object. In other words, neurons selective for these motion patterns could encode "optic flow" and allow us

Physiological work has shown that the medial superior temporal area, MST (Saito et al., 1986; Tanaka et al., 1986; Tanaka et al., 1989; Tanaka & Saito, 1989; Duffy & Wurtz, 1991b; Duffy & Wurtz, 1991a; Britten & van Wezel, 1998) is selective for such translational, radial and rotational motion patterns over wide areas of the visual field. Neuroimaging has demonstrated the presence of a human homologue of MST in what has become known as the hMT+ complex or V5, which also includes area MT (Morrone et al., 2000; Dukelow et al., 2001; Huk et al., 2002). The extraction of patterns of complex motion such as radial and rotational motion, has been modelled from the combined outputs of MT neurons (Perrone, 1992; Grossberg et al., 1999). Recently, a second visual area responsive to wide-angle flow fields has been identified (Pitzalis et al., 2010) and is thought to be a homologue of macaque area V6 (Galletti et al., 1996). Although most work on optic flow has concentrated on area MST, neurons in this second region have many similar characteristics and the two areas are strongly connected. Pitzalis et al. propose that MST and V6 work in concert, the former

analysing the motions of objects in the world and the latter extracting self-motion.

stimuli.

tasks is yet to be established.

to navigate in the world (Gibson, 1950).

The prevailing interpretation of psychophysical spatial suppression rests upon the idea that surround-inhibition is weaker at low contrasts. However, induced motion, an illusion in which the perceived motion of a central patch is biased by the motion of its surround – a phenomenon almost certainly mediated by centre-surround antagonism – is *stronger* at low contrasts rather than weaker (Hanada, 2010). On balance, whether centre-surround antagonism in V5 is directly responsible for these paradoxical psychophysical results, or whether other factors contribute to the effect, remains unresolved.

#### *2.2.2. Perceptual learning*

A number of psychophysical studies have demonstrated that specific aspects of motion perception can improve with repeated exposure, a phenomenon known as perceptual learning (Fine & Jacobs, 2002). For example, repeated practice of a task that involves discriminating the motion direction of a field of moving dots results in significant improvements in task performance (Ball & Sekuler, 1982; Ball & Sekuler, 1987). The fact that such improvements are often highly specific for particular aspects of the trained stimulus, such as motion direction and location within the visual field, led to the suggestion that learning, and the associated neural plasticity, takes place at a relatively early stage of visual motion processing such as MT. There is additional evidence supporting the idea that MT plays a causal role in perceptual learning of motion tasks. Lesions of MT in monkeys result in an inability to demonstrate perceptual learning for tasks involving the detection of coherent motion within a random dot kinematogram (Rudolph & Pasternak, 1999). This particular stimulus consists of two populations of moving dots, one moving in a coherent (signal) direction and the other moving in random (noise) directions. The task is to identify the signal direction, and task difficulty is manipulated by varying the signal-to-noise ratio within the stimulus (Newsome & Pare, 1988). In addition, it has been demonstrated psychophysically that perceptual learning of a challenging motion orientation discrimination task is impaired or absent when the ability of MT neurons to encode the motion signal is compromised (Lu et al., 2004). This was achieved by constraining the local motion of pairs of dots within the training stimulus to be equal and opposite. The theory was that this would activate suppressive motion-opponent mechanisms within MT (Qian & Andersen, 1994), which would interfere with the global processing of the stimulus and disrupt perceptual learning.
In support of this concept, it was subsequently found that while the motion opponent stimulus impaired learning, simply removing the motion opponency from the stimulus by altering the phase of local dot motions resulted in pronounced perceptual learning (Thompson & Liu, 2006). It has also been reported that patients who have hemianopia due to V1 lesions that do not extend to MT are able to learn a motion coherence task in their blind hemifield (Huxlin et al., 2009). This raises the intriguing possibility that V1 may not be necessary for perceptual learning of specific types of motion stimuli.
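The random dot kinematogram described above lends itself to a compact sketch; the function name, step size and dot counts below are our own illustrative choices, not the parameters of Newsome & Pare (1988).

```python
import numpy as np

rng = np.random.default_rng(0)

def rdk_step(dots, coherence, signal_dir_deg, step=0.1):
    """Advance a random dot kinematogram by one frame.

    A `coherence` fraction of dots takes a step in the signal direction;
    the remainder step in independent random (noise) directions.
    """
    n = dots.shape[0]
    is_signal = rng.random(n) < coherence
    angles = np.full(n, np.deg2rad(signal_dir_deg))
    angles[~is_signal] = rng.uniform(0.0, 2.0 * np.pi, size=(~is_signal).sum())
    return dots + step * np.column_stack([np.cos(angles), np.sin(angles)])

dots = rng.uniform(-1.0, 1.0, size=(200, 2))
next_frame = rdk_step(dots, coherence=0.5, signal_dir_deg=0.0)
```

Task difficulty is then controlled by the single `coherence` argument: at 1.0 every dot carries the signal direction, while near 0.0 the display is almost pure noise.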


It would appear, therefore, that MT is centrally involved in perceptual learning of motion-based tasks. However, the available neurophysiological data, collected using random dot kinematogram stimuli, do not directly support the hypothesis that perceptual learning leads to long-term plastic changes within MT. For example, Zohary et al. (1994) found that neurons within MT and MST narrowed their directional tuning and increased their firing rate as monkeys improved their performance on a motion coherence task *within* a single training session. However, these changes did not persist *across* multiple training sessions. More recent neurophysiological work, also in monkeys, has implicated the lateral intraparietal area in perceptual learning of coherent motion perception (Law & Gold, 2008). This suggests that perceptual learning of specific types of motion stimuli may rely on changes in the way that the responses of cells within MT are 'read out' by higher-level extrastriate areas. Whether this is the case for the human brain and for other types of motion tasks is yet to be established.

## **2.3. Complex motion analysis – the middle superior temporal area (MST) and V6**

Still higher cortical areas respond to *complex motion* signals such as global expansion, contraction and rotation. These types of motion are particularly interesting, because they are generated by the interaction between an observer and the environment. For example, radial patterns of motion such as expansion and contraction occur on the retina when objects approach or recede from an observer, respectively. These patterns of motion could be caused by motion of the object, the head and body or both. Similarly, rotational patterns of motion can be caused either by tilting of the head or physical rotation of an object. In other words, neurons selective for these motion patterns could encode "optic flow" and allow us to navigate in the world (Gibson, 1950).

Physiological work has shown that the medial superior temporal area, MST (Saito et al., 1986; Tanaka et al., 1986; Tanaka et al., 1989; Tanaka & Saito, 1989; Duffy & Wurtz, 1991b; Duffy & Wurtz, 1991a; Britten & van Wezel, 1998), is selective for such translational, radial and rotational motion patterns over wide areas of the visual field. Neuroimaging has demonstrated the presence of a human homologue of MST in what has become known as the hMT+ complex or V5, which also includes area MT (Morrone et al., 2000; Dukelow et al., 2001; Huk et al., 2002). The extraction of patterns of complex motion, such as radial and rotational motion, has been modelled from the combined outputs of MT neurons (Perrone, 1992; Grossberg et al., 1999). Recently, a second visual area responsive to wide-angle flow fields has been identified (Pitzalis et al., 2010) and is thought to be a homologue of macaque area V6 (Galletti et al., 1996). Although most work on optic flow has concentrated on area MST, neurons in this second region have many similar characteristics and the two areas are strongly connected. Pitzalis et al. propose that MST and V6 work in concert, the former analysing the motions of objects in the world and the latter extracting self-motion.

Whether MST neurons are responsive only to the cardinal motion directions (radial, rotational and translational), as suggested by some psychophysical work (Morrone et al., 1999; Burr et al., 2001), or whether other intermediate forms of motion such as spiral motion are encoded directly, is still a matter of some debate. In support of the direct detection of spiral motions, it has been suggested that summation of mechanisms tuned to purely cardinal motion directions is insufficient to explain the psychophysical data (Snowden & Milne, 1996; Meese & Harris, 2001; Meese & Anderson, 2002), and some physiological work seems to have identified neurons tuned to spiral motions (Graziano et al., 1994; Geesaman & Andersen, 1996). A particularly interesting study found that continuously changing flow stimuli, obtained by morphing one flow field into the next, led to a continuum of responses in MST (Paolini et al., 2000). This supports the "generalised spiral" model of MST, in which the tuning of MST neurons forms a continuum from pure contraction, through clockwise-contraction, to clockwise-rotation, to clockwise-expansion, to pure expansion, and so on.
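The generalised-spiral continuum amounts to a one-parameter family of flow fields, which can be sketched as follows; the `spiral_flow` function and its single-angle parameterisation are our own illustration, not a published model.

```python
import numpy as np

def spiral_flow(xy, spiral_angle_deg):
    """Flow-field velocity at dot positions xy (n x 2) on the spiral continuum.

    0 deg = pure expansion, 90 = pure rotation (counter-clockwise),
    180 = pure contraction; intermediate angles give spiral motion.
    """
    a = np.deg2rad(spiral_angle_deg)
    radial = xy                                          # points away from centre
    tangential = np.column_stack([-xy[:, 1], xy[:, 0]])  # 90 deg CCW of radial
    return np.cos(a) * radial + np.sin(a) * tangential

pts = np.array([[1.0, 0.0], [0.0, 2.0]])
expansion = spiral_flow(pts, 0.0)   # velocities point outward
rotation = spiral_flow(pts, 90.0)   # velocities are tangential
```

Cardinal-only models tile this continuum with just the 0, 90, 180 and 270 degree mechanisms; direct spiral detectors would sit at the intermediate angles.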


## **2.4. Biological motion perception – the superior temporal sulcus (STS)**

Human observers are acutely sensitive to the complex pattern of motion trajectories generated by other people and animals, known as biological motion (Johansson, 1973; Mather & West, 1993). Investigations of biological motion often use stimuli constructed from dots or "point lights" that represent the joints of an actor (Troje, 2002). When stationary, these displays appear as an elongated group of dots; however, when set in motion, a vivid percept of a moving person is generated. Sufficient information can be extracted from dynamic point-light stimuli to allow identification of a wide range of complex attributes such as gender (Mather & Murdoch, 1994) and mood (Dittrich et al., 1996), and observers are sensitive to these stimuli in both the central and peripheral visual field (Thompson et al., 2007). Although point-light stimuli contain both motion information and configural, form-based cues (the relative positions of the points), motion cues do appear to be centrally involved in the processing of biological motion. For example, observers can still perform above chance on a walking-direction discrimination task when the relative positions of the dots representing the joints of an actor are scrambled (Troje & Westhoff, 2006). Interestingly, simply inverting the dots representing the feet of walking humans or animals disrupts discrimination of walking direction, suggesting that the characteristic local motion cues specific to biological motion may be processed by dedicated "life detector" mechanisms (Troje & Westhoff, 2006).
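The spatial-scrambling manipulation used in such experiments can be sketched generically as follows; the toy oscillating "walker" and the `scramble` helper are illustrative stand-ins for real motion-capture trajectories.

```python
import numpy as np

rng = np.random.default_rng(1)

def scramble(trajectories, jitter=0.5):
    """Spatially scramble a point-light display.

    `trajectories` has shape (frames, joints, 2). Each joint is displaced by
    its own fixed random offset: the configural (form) cue is destroyed, but
    every joint's local motion over time is preserved exactly.
    """
    offsets = rng.uniform(-jitter, jitter, size=(1, trajectories.shape[1], 2))
    return trajectories + offsets

# Toy "walker": 13 joints oscillating with different phases over 30 frames.
t = np.linspace(0.0, 2.0 * np.pi, 30)[:, None]
joints = np.stack([np.sin(t + np.arange(13)), np.cos(t + np.arange(13))], axis=-1)
scrambled = scramble(joints)
```

Because each joint keeps its own trajectory, frame-to-frame local motion is untouched and only the configuration is destroyed, which is the dissociation the walking-direction experiments exploit.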

Initial insights into the regions of the visual cortex involved in biological motion perception were provided by the neurophysiological investigations of Oram and Perrett (1994) in the monkey. Cells were found within the superior temporal polysensory area, a region anterior to MT and MST within the superior temporal sulcus, that were sensitive to biological motion stimuli. Subsequently, a large number of neuroimaging studies have been conducted in humans with the aim of identifying the cortical areas involved in biological motion perception. It is now apparent that biological motion perception recruits a distributed neural circuit in humans which includes the posterior region of the superior temporal sulcus (Grossman et al., 2000; Grezes et al., 2001; Servos et al., 2002; Grossman et al., 2005; Pelphrey et al., 2005; Peelen et al., 2006) and may also involve "mirror neurons" in the ventral premotor cortex (Saygin et al., 2004), along with a range of additional visual areas including the posterior middle temporal gyrus and regions known as the extrastriate and fusiform body areas (Jastorff & Orban, 2009). A recent meta-analysis of neuroimaging data in humans has emphasised the importance of the pSTS in processing motion cues from biological motion stimuli, and also identified a region within the hMT+ complex that may play a role in the perception of human body movement (Grosbras et al., 2012).

## **2.5. Structure-from-motion – the lateral occipital sulcus (LOS) and the intraparietal sulcus (IPS)**

As well as being able to extract biologically relevant information from motion patterns, the visual system is also able to extract three-dimensional (3D) structure from moving two-dimensional stimuli, often referred to as the "kinetic depth effect" or "structure-from-motion" (SFM) (Wallach & O'Connell, 1953). Several models have been developed to explain how the visual system achieves this remarkable feat, some based on the tracking of local positional cues (Ullman, 1984; Grzywacz & Hildreth, 1987; Shariat & Price, 1990; Snowden et al., 1991) and others based on the use of local motion information (Clocksin, 1980; Longuet-Higgins & Prazdny, 1980; Koenderink & van Doorn, 1986; Husain et al., 1989; Hildreth et al., 1995). The weight of evidence currently suggests that motion rather than positional information is crucial to extracting SFM (Andersen & Bradley, 1998; Farivar, 2009; Farivar et al., 2009).

SFM is usually investigated using random dot stimuli in which the dots are randomly distributed across the image but are projected onto an "invisible" object, which is then rotated, with the direction and speed of the dots dictated by their position on the object. Provided that the dot density is high enough, SFM can be perceived with very short, two-frame displays in which dots simply jump from one position to another (Lappin et al., 1980). SFM is not perceived if only a very small number of dots is used. However, periodically re-positioning a similarly small number of dots across the stimulus allows the surface of the object to be reconstructed via interpolation (Husain et al., 1989; Treue et al., 1995). Functional magnetic resonance imaging (fMRI) suggests that SFM is carried out in a network of cortical regions: V5, the lateral occipital sulcus (LOS) and several sites along the intraparietal sulcus (IPS) (Orban et al., 1999; Peuskens et al., 2004).
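A minimal version of such a display (random dots on an "invisible" rotating cylinder viewed under orthographic projection) can be sketched as follows; the function name and parameter values are our own illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(2)

def cylinder_sfm(n_dots=300, n_frames=60, omega=0.1):
    """Frames of a structure-from-motion cylinder, shape (frames, dots, 2).

    Dots sit at random positions on the surface of a vertical cylinder of
    unit radius; rotation about the vertical axis makes each dot's projected
    horizontal speed depend on its (invisible) depth on the cylinder.
    """
    theta = rng.uniform(0.0, 2.0 * np.pi, n_dots)  # angular position on surface
    height = rng.uniform(-1.0, 1.0, n_dots)        # unchanged by the rotation
    t = omega * np.arange(n_frames)[:, None]
    x = np.cos(theta + t)                          # orthographic projection
    y = np.broadcast_to(height, x.shape)
    return np.stack([x, y], axis=-1)

frames = cylinder_sfm()
```

Any single frame is a structureless cloud of dots; the cylinder only becomes recoverable from the velocity gradient across frames.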

## **3. Second-order motion**


The preceding discussion has dealt mainly with the processing of luminance-based motion, often called *first-order* motion. However, at the input stage, motion can also be defined by characteristics other than luminance, such as flicker, texture and contrast. To return to our snowfield example, the force and direction of the wind, or the presence of a small animal burrowing under the snow, can be detected visually through changes in the pattern of random flicker caused as flakes of snow on the ground are disturbed. Motion that is defined by modulation of a property other than luminance is referred to as *second-order* motion (Cavanagh & Mather, 1989), and second-order motion is invisible to first-order, i.e. luminance-based, motion sensors (Chubb & Sperling, 1988).
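As a concrete illustration, one common second-order stimulus, a drifting contrast-modulated noise pattern, can be sketched as follows; the helper function and its parameters are our own schematic choices, not a calibrated stimulus.

```python
import numpy as np

rng = np.random.default_rng(3)

def contrast_modulated_frame(carrier, t, mod_freq=2.0, speed=0.05, depth=0.8):
    """One frame of drifting contrast-modulated noise (second-order motion).

    A static, mean-zero noise carrier is multiplied by a drifting sinusoidal
    contrast envelope: the *contrast*, not the luminance, moves.
    """
    x = np.linspace(0.0, 1.0, carrier.size, endpoint=False)
    envelope = 1.0 + depth * np.sin(2.0 * np.pi * mod_freq * (x - speed * t))
    return carrier * envelope

carrier = np.where(rng.random(256) < 0.5, -1.0, 1.0)  # mean-zero binary noise
frame0 = contrast_modulated_frame(carrier, 0)
frame1 = contrast_modulated_frame(carrier, 1)
```

A luminance-based sensor sees only static noise whose contrast waxes and wanes; extracting the drift requires a rectifying non-linearity before motion analysis.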


Assuming segregation of first- and second-order motion at early stages of visual motion processing, at what point in the visual motion hierarchy are the two types of motion combined? Models of the mammalian visual motion processing hierarchy (Wilson et al., 1992; Lu & Sperling, 1995; Lu & Sperling, 2001) usually integrate first- and second-order streams at, or before, the level of global motion analysis (see Figure 3) and insensitivity to

red and blue, respectively).

as V3a, MT and MST.

**3.2. Higher level second-order motion** 

## **3.1. Local second-order motion**

Currently, it is a mystery how the visual system detects second-order motion, as the primary input to the visual system represents changes in retinal illumination. The visual system has a small compressive non-linearity, probably at the level of the photoreceptors (Scott-Samuel & Georgeson, 1999) that could transform second-order information into a weak luminance signal. This weak *internal* artefact could mean that second-order motion is actually detected by first-order mechanisms and could explain the (usually) weaker performance for purely second-order motion stimuli. However, the distortion product measured by Scott-Samuel & Georgeson is only detectable in high speed, high contrast modulation stimuli. Since secondorder motion is still visible in slow moving, or low-modulation, stimuli, the early nonlinearity is unlikely to be able to fully explain many of the experimental results. Figure 3 shows a schematic wiring diagram of the motion processing hierarchy and the dotted orange arrow shows the presence of these "pseudo-second-order" motion signals.

**Figure 3.** A schematic view of the orthodox model of motion processing in the cortex. FO = first order, SO = second order, MT = middle temporal visual area, MST middle superior temporal visual area, STS = superior temporal sulcus, IPS = intraparietal sulcus, LOS = lateral occipital sulcus. See text for further details.

If second-order motion was ultimately detected by first-order mechanisms, we might expect the two types of stimuli to interact. This does not seem to be the case for local motion. Temporally interleaving first- and second-order stimuli in alternate frames of a motion stimulus fails to generate the percept of smooth motion, suggesting that the two systems do not interact at this level (Ledgeway & Smith, 1994). Adaptation to one type of motion does not impair detection of the other type (Nishida et al., 1997), and the second-order system does not seem to be able to discriminate the direction of motion at detection threshold, unlike the first-order system (Smith & Ledgeway, 1997), which can discriminate motion direction as soon as motion is detected. It is therefore likely that first- and second-order motion are initially analysed in parallel by separate processing streams (shown in figure 3 in red and blue, respectively).

Several neuroimaging studies have investigated the possibility that some cortical areas may be selective for second-order motion, as predicted by the dual-pathway hypothesis. To date, however, the results from these studies have been mixed. Second-order-specific responses have been reported in areas such as V3 (Smith et al., 1998), but other studies have found either substantial overlap of first- and second-order motion responses throughout the visual cortex (Dumoulin et al., 2003) or no anatomical segregation of areas responsive to first- and second-order motion (Nishida et al., 2003; Seiffert et al., 2003; Ashida et al., 2007). The idea of an anatomically distinct second-order pathway could be rescued if different neurons in the same anatomical area respond selectively to different types of motion, or if the same neurons respond to both first- and second-order motion but have different spatial/temporal tuning for each. The latter contention is supported by some neurophysiological investigations of MT in the primate (O'Keefe & Movshon, 1998) and areas 17 and 18 in the cat (Mareschal & Baker, 1998). The alternative idea, that different neuronal populations for first- and second-order motion exist but share common anatomical locations, has also found support in a human neuroimaging study using fMRI adaptation (Ashida et al., 2007). In this technique, repeated presentation of similar stimuli causes a reduction of the blood oxygen level-dependent (BOLD) response in cortical regions containing neurons that cannot differentiate between the stimuli, whereas little or no reduction occurs if the different stimuli activate distinct populations of neurons. Ashida et al. found direction-selective fMRI adaptation for stimuli of the same type, but no cross-adaptation between first- and second-order motion.
These fMRI results provide persuasive evidence that neural populations differentially selective for first- and second-order motion co-exist in motion sensitive regions of the human brain such as V3a, MT and MST.
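The logic of the fMRI-adaptation inference can be captured in a deliberately simplified toy model, not the analysis used by Ashida et al.: suppose (hypothetically) a voxel contains two independent sub-populations with simple repetition suppression. A same-type stimulus pair then yields a reduced summed response, while a cross-type pair does not. All numbers below are arbitrary.

```python
# Toy illustration of the fMRI-adaptation logic (all numbers hypothetical):
# a voxel contains two independent sub-populations, one responsive to
# first-order (FO) and one to second-order (SO) motion. A population's gain
# drops after it has just been driven (repetition suppression).
SUPPRESSION = 0.5  # fraction of the response remaining after adaptation

def voxel_bold(first, second):
    """Summed voxel response to a back-to-back stimulus pair."""
    gain = {"FO": 1.0, "SO": 1.0}
    r1 = gain[first]
    gain[first] *= SUPPRESSION   # only the driven population adapts
    r2 = gain[second]
    return r1 + r2

same_type = voxel_bold("FO", "FO")    # repeat: reduced total response (1.5)
cross_type = voxel_bold("FO", "SO")   # no cross-adaptation: full response (2.0)
```

The absence of cross-adaptation (the larger response for the FO-then-SO pair) is the signature of distinct populations; if a single population responded to both stimulus types, the two pairings would produce equal, suppressed responses.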

#### **3.2. Higher-level second-order motion**



Assuming segregation of first- and second-order motion at early stages of visual motion processing, at what point in the visual motion hierarchy are the two types of motion combined? Models of the mammalian visual motion processing hierarchy (Wilson et al., 1992; Lu & Sperling, 1995; Lu & Sperling, 2001) usually integrate the first- and second-order streams at, or before, the level of global motion analysis (see Figure 3), and insensitivity to such low-level stimulus characteristics, termed "cue-invariance", has been found in neurons at the level of MT (Albright, 1992; O'Keefe & Movshon, 1998) and MST (Geesaman & Andersen, 1996). The argument that extrastriate areas are cue-invariant has also been supported by a TMS study in humans (Cowey et al., 2006).
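How a second-order stream can feed the same downstream motion analysis as a first-order stream is usually sketched as a filter-rectify-filter cascade: a linear filter tuned to the carrier, followed by rectification, makes the contrast envelope explicit as an ordinary luminance-like signal. The sketch below uses illustrative parameters and an ideal Fourier-domain filter, not values from the cited models.

```python
import numpy as np

# Filter-rectify-filter sketch: a contrast-modulated grating has no energy
# at its envelope frequency, but band-pass filtering at the carrier followed
# by full-wave rectification exposes the envelope.
x = np.linspace(0, 1, 4096, endpoint=False)
f_carrier, f_env = 256, 8                      # cycles per image (assumed)
envelope = 1 + 0.9 * np.sin(2 * np.pi * f_env * x)
cm = 0.5 * envelope * np.sin(2 * np.pi * f_carrier * x)  # zero-mean stimulus

# Stage 1: linear band-pass filter tuned to the carrier (an ideal filter in
# the Fourier domain stands in for a bank of early linear filters).
spec = np.fft.rfft(cm)
band = np.zeros_like(spec)
band[f_carrier - 32 : f_carrier + 32] = spec[f_carrier - 32 : f_carrier + 32]
filtered = np.fft.irfft(band)

# Stage 2: full-wave rectification (a static non-linearity).
rectified = np.abs(filtered)

# Stage 3: energy at the envelope frequency, before vs. after rectification;
# a standard first-order motion analysis could now track the envelope.
before = np.abs(np.fft.rfft(cm))[f_env]
after = np.abs(np.fft.rfft(rectified - rectified.mean()))[f_env]
```

After the rectification stage the envelope is an ordinary signal, so first- and second-order outputs can in principle be pooled by a common global motion stage, which is where cue-invariance would arise.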

Visual Motion: From Cortex to Percept 125


There is, however, plenty of counter-evidence to the presence of cue-invariance at MT and MST in the physiological literature (Olavarria et al., 1992; Churan & Ilg, 2001) and some human psychophysical studies have suggested that the two processing streams could remain segregated at even higher levels than MT. For example, Badcock and colleagues have shown that the addition of second-order noise dots to either a first-order global motion stimulus or complex motion stimulus does not impair detection and discrimination of the first-order signal (Edwards & Badcock, 1995; Badcock & Khuu, 2001; Cassanello et al., 2011). Several studies have suggested that the contribution of the second-order pathway to mechanisms responsible for extraction of structure-from-motion is weak or non-existent (Dosher et al., 1989; Mather, 1989; Landy et al., 1991; Hess & Ziegler, 2000), while research on whether second-order motion can support biological motion has produced differing results (Mather et al., 1992; Ahlstrom et al., 1997; Bellefeuille & Faubert, 1998).

Although these findings are certainly of interest, it seems counterintuitive that the visual system would maintain two separate pathways for the analysis of different types of high-level motion stimuli that differ only in terms of their local characteristics. Some recent studies by ourselves and our collaborators support the conventional concept of a functional integration of first- and second-order motion at higher levels of the motion hierarchy. Ledgeway et al. (2002) and Aaen-Stockdale et al. (forthcoming) have argued that the relative visibility of the first- and second-order dots in the stimuli used by Badcock and colleagues may not have been equalised (Edwards & Badcock, 1995; Badcock & Khuu, 2001). Although the *static* first- and second-order dots were highly visible, their relative visibility to first- or second-order motion sensors (which have different spatio-temporal characteristics) (Derrington et al., 1993; Ledgeway & Hess, 2002) may not have been equal. Ledgeway, Hess & McGraw (2002) demonstrated that reducing the visibility of the luminance-modulated (first-order) dots resulted in interactions between first- and second-order dots in global motion stimuli consistent with a combination of the two motion cues within extrastriate visual areas. Aaen-Stockdale et al. (forthcoming) showed that these visibility-dependent interactions also occurred with complex (radial and rotational) motion stimuli, and that weakening the first-order signal by increasing the size of dot displacements between frames resulted in similar interactions. This latter study also pitted opposing first- and second-order motion signals against each other to demonstrate that impairments of first-order motion discrimination caused by the inclusion of second-order dots within the stimulus were genuinely a result of a cue-invariant motion system attempting to integrate these separate signals, and not simply the result of increased noise. Subsequent work by us has used these same techniques to show that other types of higher-order motion perception are similarly cue-invariant. For example, we (Aaen-Stockdale et al., 2008) found that when relative stimulus visibility was varied, first- and second-order elements interacted to mask biological motion of the opposite class, suggesting that biological motion perception is cue-invariant. Similarly, it had been proposed that the contribution of the second-order pathway to structure-from-motion mechanisms was weak or non-existent (Dosher et al., 1989; Mather, 1989; Landy et al., 1991; Hess & Ziegler, 2000), but reducing the relative visibility of the first-order elements in structure-from-motion displays results in almost linear summation between first- and second-order mechanisms, suggesting that this modality may also be cue-invariant (Aaen-Stockdale et al., 2010). These findings have highlighted the importance of ensuring that local first-order and second-order motion signals are of equal strength when comparing the two systems.

## **4. Abnormalities of motion processing**

#### **4.1. Akinetopsia**


The first widely accepted case of akinetopsia, or "motion blindness", caused by bilateral lesions affecting the hMT+ complex but sparing the primary visual cortex was reported by Zihl and colleagues (Zihl et al., 1983; Zeki, 1991). This patient, LM, did not have a scotoma (region of blindness), as would be expected from damage to the primary visual cortex; however, LM exhibited a severe and selective impairment of motion perception. Subsequent work indicated that LM could perceive motion direction and structure from motion under certain circumstances, as long as the stimuli did not contain noise elements such as static or randomly moving dots. However, as soon as noise was introduced into the stimulus, task performance was dramatically reduced (Baker et al., 1991; Rizzo et al., 1995). This pattern of deficits is similar to that reported by Rudolph and Pasternak (1999), who studied the effects of MT lesions in macaque monkeys. Immediately after the lesions the animals exhibited a pronounced and general impairment in motion perception. Over time, however, performance on motion tasks that did not include noise elements recovered, whereas performance on tasks requiring signal/noise segregation, such as those involving random dot kinematograms (RDKs), did not. As a whole, these data from both humans and monkeys suggest that MT and MST may have a particular specialization for the detection of motion in a noisy environment.

#### **4.2. A motion deficit in amblyopia?**

Deficits in motion perception have also been reported for patients with a developmental disorder of the visual cortex known as amblyopia (or "lazy eye"). Unilateral amblyopia occurs when the images seen by each eye are poorly correlated during early visual development due to a chronically blurred image in one eye (anisometropia), a turned eye (strabismus) or, less commonly, a congenital cataract (Holmes & Clarke, 2006). This can result in abnormal development of the visual cortex and a visual impairment in the affected eye that is not due to a problem with the eye itself, but is the result of abnormal processing of inputs from the amblyopic eye within the visual cortex (Hubel & Wiesel, 1965; Barnes et al., 2001) and possibly the lateral geniculate nucleus (Hess et al., 2009; Li et al., 2011).

Although amblyopia is typically regarded as a disorder of spatial vision, a number of studies have identified deficits in motion perception that appear to be independent of impairments of form perception (see Thompson et al., 2011 for a recent overview). Many of the motion deficits that have been reported are consistent with abnormalities at the level of the hMT+ complex. For example, patients with amblyopia exhibit elevated motion coherence thresholds when viewing random dot kinematograms, even when abnormal contrast sensitivity is taken into account (Simmers et al., 2003; Constantinescu et al., 2005; Simmers et al., 2006). These deficits are present across spatial scale (Aaen-Stockdale & Hess, 2008), include both first- and second-order motion stimuli (Aaen-Stockdale et al., 2007), may involve deficits in spatial summation (Thompson et al., 2011) and appear to be related to the segregation of signal from noise (Mansouri & Hess, 2006; Thompson et al., 2008b). In agreement with these psychophysical findings, it has recently been reported that cells within MT of monkeys made experimentally strabismic are less direction selective and less tolerant of noise (El-Shamayleh et al., 2011). It would appear, therefore, that motion sensitive areas such as MT are susceptible to abnormal sensory experience during development and that this can result in specific deficits in motion perception.
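The random dot kinematograms used in these coherence-threshold studies can be sketched in a few lines. The dot counts, step size and vector-average readout below are illustrative simplifications, not parameters from any of the cited experiments.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed for reproducibility

def rdk_displacements(n_dots, coherence, signal_dir=0.0, step=1.0):
    """One frame-to-frame update of a random dot kinematogram: a proportion
    `coherence` of the dots steps in `signal_dir` (radians); the remaining
    noise dots step in uniformly random directions."""
    n_signal = int(round(coherence * n_dots))
    dirs = rng.uniform(0.0, 2.0 * np.pi, size=n_dots)
    dirs[:n_signal] = signal_dir
    return step * np.cos(dirs), step * np.sin(dirs)

# The vector-averaged displacement grows with coherence, which is why
# coherence thresholds index an observer's ability to pool signal dots
# while discounting noise dots.
dx_hi, dy_hi = rdk_displacements(1000, coherence=0.8)
dx_lo, dy_lo = rdk_displacements(1000, coherence=0.05)
net_hi = np.hypot(dx_hi.mean(), dy_hi.mean())
net_lo = np.hypot(dx_lo.mean(), dy_lo.mean())
```

The coherence threshold is the smallest signal proportion at which an observer can still report the signal direction; an elevated threshold, as in amblyopia, therefore reflects poorer signal/noise segregation rather than poorer visibility of individual dots.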




In this context we were surprised to find that patients with amblyopia did not show pronounced impairments in the perception of coherent plaid stimuli (Thompson et al., 2008a). As described above, plaid stimuli have been used extensively to investigate the integration of local motion signals within the visual cortex, and cells have been found within MT that respond selectively to the integrated "pattern motion" direction of coherent plaids. We therefore expected to find reduced levels of coherent motion perception in patients with amblyopia, as would be predicted by deficient processing within MT. In contrast, we found that within the small region of the parameter space where amblyopes did differ from controls, the differences were characterised by an increased probability of coherent motion perception (Thompson et al., 2008a). Since a similar change in plaid perception can be induced in the normal visual system when inhibitory rTMS is delivered over V1 (Thompson et al., 2009) (figure 2), the differences between amblyopic observers and controls in plaid perception may be due to abnormal processing within V1 rather than MT.
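The "pattern motion" computation that MT cells are thought to perform on coherent plaids is often described as an intersection of constraints: each component grating constrains only the velocity component along its own normal, and the pattern velocity is the single solution satisfying both constraints. A minimal sketch, with an arbitrary example geometry:

```python
import numpy as np

def pattern_velocity(n1, s1, n2, s2):
    """Intersection-of-constraints solution for a two-component plaid.
    Each drifting grating constrains only the velocity component along its
    own normal (v . n_i = s_i); solving both constraints gives the single
    pattern velocity consistent with both components."""
    n = np.array([n1, n2], dtype=float)
    s = np.array([s1, s2], dtype=float)
    return np.linalg.solve(n, s)

# Example (arbitrary geometry): two gratings drifting at 1 deg/s along
# normals tilted +/-45 deg from vertical cohere into a single plaid moving
# straight up at sqrt(2) deg/s, faster than either component.
a = np.deg2rad(45.0)
v = pattern_velocity((np.sin(a), np.cos(a)), 1.0, (-np.sin(a), np.cos(a)), 1.0)
```

When the plaid is perceived as incoherent, the components are instead seen sliding transparently with their own normal velocities, so the coherent/incoherent report is a direct probe of whether this integration step has taken place.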

This seemingly anomalous result has recently been explored further using fMRI (Thompson et al., 2012). Coherent and incoherent plaid stimuli that were perceived in exactly the same way by control observers and amblyopes activated distinct networks of brain areas when the plaids were viewed by non-amblyopic vs. amblyopic eyes. For controls, and for patients viewing through their non-amblyopic eye, the hMT+ complex was differentially activated by coherent and incoherent plaids, consistent with previous fMRI studies (Castelo-Branco et al., 2002) and with the idea that this area is centrally involved in motion integration. However, this was not the case for amblyopic eye viewing, for which there appeared to be a selective loss of response within hMT+ to incoherent plaid motion. In patients for whom the hMT+ complex could be subdivided into MT and MST, this loss was apparent in both areas. It would appear, therefore, that areas other than hMT+ were involved in the normal perception of plaid patterns during amblyopic eye viewing. The fMRI data provided preliminary evidence for a preserved response to incoherent motion within the pulvinar complex of the patients with amblyopia. This area has previously been shown to be involved in the processing of pattern motion (Merabet et al., 1998; Villeneuve et al., 2005) and has extensive reciprocal connections with extrastriate brain areas including MT (Casanova, 2004). These results in humans with amblyopia raise the intriguing possibility that alternate brain areas may play a role in motion perception if the function of hMT+ is sub-optimal or compromised. If this were the case, the psychophysical data from patients with amblyopia, and from humans and monkeys with MT lesions (described above), imply that this compensatory processing is highly sensitive to noise such as the random dots present in RDK stimuli.

## **5. Conclusion**


The cortical mechanisms involved in the perception of visual motion represent one of the most widely studied and best-understood processes in neuroscience; however, there are still many questions left to answer. Over the last few decades a picture has emerged of a motion processing system that is rigidly hierarchical, yet possesses considerable redundancy and plasticity. In general terms, progressively higher-level areas of the brain integrate the outputs of areas below them in order to detect and discriminate increasingly complicated stimuli. However, with apologies to the Reverend William Paley, the visual brain is not an organ that has been designed, but one that has evolved over the millennia, and it demonstrates all the adaptations and redundancies that implies. A very much open question concerns the evolutionary origins of specialised forms of motion perception, such as second-order motion, structure-from-motion or biological motion, assuming that dedicated mechanisms for these types of motion actually exist, a question which is by no means settled.

## **Author details**

Craig Aaen-Stockdale *Department of Optometry and Visual Science, Buskerud University College, Norway* 

Benjamin Thompson *Department of Optometry and Vision Science, University of Auckland, New Zealand* 

## **Acknowledgement**

Craig Aaen-Stockdale is supported by a *Yggdrasil* international mobility grant from the Research Council of Norway (*Forskningsrådet*). Benjamin Thompson is supported by grants from the Health Research Council of New Zealand, the Marsden Fund, the University of Auckland and the Neurological Foundation of New Zealand.

## **6. References**

Aaen-Stockdale, C. & Hess, R.F. (2008). The amblyopic deficit for global motion is spatial scale invariant. *Vision Res, 48* (19), 1965-1971.

Aaen-Stockdale, C., Ledgeway, T., & Hess, R.F. (2007). Second-order optic flow deficits in amblyopia. *Invest Ophthalmol Vis Sci, 48* (12), 5532-5538.


Badcock, D.R. & Khuu, S.K. (2001). Independent first- and second-order motion energy analyses of optic flow. *Psychol Res, 65* (1), 50-56.

Baker, C.L., Jr., Hess, R.F., & Zihl, J. (1991). Residual motion perception in a "motion-blind" patient, assessed with limited-lifetime random dot stimuli. *J Neurosci, 11* (2), 454-461.

Ball, K. & Sekuler, R. (1982). A specific and enduring improvement in visual motion discrimination. *Science, 218* (4573), 697-698.

Ball, K. & Sekuler, R. (1987). Direction-specific improvement in motion discrimination. *Vision Res, 27* (6), 953-965.

Barlow, H.B. & Levick, W.R. (1965). The mechanism of directionally selective units in rabbit's retina. *J Physiol, 178* (3), 477-504.

Barnes, G.R., Hess, R.F., Dumoulin, S.O., Achtman, R.L., & Pike, G.B. (2001). The cortical deficit in humans with strabismic amblyopia. *J Physiol, 533* (Pt 1), 281-297.

Battista, J., Badcock, D.R., & McKendrick, A.M. (2010). Centre-surround visual motion processing in migraine. *Invest Ophthalmol Vis Sci, 51* (11), 6070-6076.

Beckers, G. & Zeki, S. (1995). The consequences of inactivating areas V1 and V5 on visual motion perception. *Brain, 118* (Pt 1), 49-60.

Bellefeuille, A. & Faubert, J. (1998). Independence of contour and biological-motion cues for motion-defined animal shapes. *Perception, 27* (2), 225-235.

Betts, L.R., Taylor, C.P., Sekuler, A.B., & Bennett, P.J. (2005). Aging reduces center-surround antagonism in visual motion processing. *Neuron, 45* (3), 361-366.

Braddick, O.J., O'Brien, J.M., Wattam-Bell, J., Atkinson, J., Hartley, T., & Turner, R. (2001). Brain areas sensitive to coherent visual motion. *Perception, 30* (1), 61-72.

Brighina, F., Palermo, A., & Fierro, B. (2009). Cortical inhibition and habituation to evoked potentials: Relevance for pathophysiology of migraine. *J Headache Pain, 10* (2), 77-84.

Britten, K.H. & van Wezel, R.J. (1998). Electrical microstimulation of cortical area MST biases heading perception in monkeys. *Nat Neurosci, 1* (1), 59-63.

Burr, D.C., Badcock, D.R., & Ross, J. (2001). Cardinal axes for radial and circular motion, revealed by summation and by masking. *Vision Res, 41* (4), 473-481.

Burr, D.C. & Thompson, P. (2011). Motion psychophysics: 1985-2010. *Vision Res, 51* (13), 1431-1456.

Campbell, F.W. & Robson, J.G. (1968). Application of Fourier analysis to the visibility of gratings. *J Physiol, 197* (3), 551-566.

Casanova, C. (2004). The visual functions of the pulvinar. In: L.M. Chalupa & J.S. Werner (Eds.), *The Visual Neurosciences* (pp. 592-608). Cambridge, USA: The MIT Press.

Cassanello, C.R., Edwards, M., Badcock, D.R., & Nishida, S. (2011). No interaction of first- and second-order signals in the extraction of global-motion and optic-flow. *Vision Res, 51* (3), 352-361.

Castelo-Branco, M., Formisano, E., Backes, W., Zanella, F., Neuenschwander, S., Singer, W., & Goebel, R. (2002). Activity patterns in human motion-sensitive areas depend on the interpretation of global motion. *Proc Natl Acad Sci U S A, 99* (21), 13914-13919.

Cavanagh, P. & Mather, G. (1989). Motion: The long and short of it. *Spat Vis, 4* (2-3), 103-129.

128 Visual Cortex – Current Status and Perspectives

*of Vision, 10* (13), 6.

Aaen-Stockdale, C., Ledgeway, T., & Hess, R.F. (2007). Second-order optic flow deficits in

Aaen-Stockdale, C.R., Farivar, R., & Hess, R.F. (2010). Co-operative interactions between first- and second-order mechanisms in the processing of structure from motion *Journal* 

Aaen-Stockdale, C.R., Ledgeway, T., McGraw, P.V., & Hess, R.F. (forthcoming). Integration of first- and second-order motion at, above and beyond MT/V5. *Vision Research,*  Aaen-Stockdale, C.R., Thompson, B., Hess, R.F., & Troje, N.F. (2008). Biological motion

Aaen-Stockdale, C.R., Thompson, B., Huang, P.C., & Hess, R.F. (2009). Low-level mechanisms may contribute to paradoxical motion percepts. *Journal of Vision, 9* (5) Adelson, E.H. & Bergen, J.R. (1985). Spatiotemporal energy models for the perception of

Adelson, E.H. & Movshon, J.A. (1982). Phenomenal coherence of moving visual patterns.

Ahlstrom, V., Blake, R., & Ahlstrom, U. (1997). Perception of biological motion. *Perception, 26* 

Alais, D., Burke, D., & Wenderoth, P. (1996). Further evidence for monocular determinants

Alais, D., Wenderoth, P., & Burke, D. (1994). The contribution of one-dimensional motion mechanisms to the perceived direction of drifting plaids and their after effects. *Vision* 

Alais, D., Wenderoth, P., & Burke, D. (1997). The size and number of plaid blobs mediate the

Albright, T.D. (1992). Form-cue invariant motion processing in primate visual cortex.

Allen, H.A., Hutchinson, C.V., Ledgeway, T., & Gayle, P. (2010). The role of contrast sensitivity in global motion processing deficits in the elderly. *J Vis, 10* (10), 15. Allman, J., Miezin, F., & McGuinness, E. (1985). Direction- and velocity-specific responses from beyond the classical receptive field in the middle temporal visual area (MT).

Andersen, R.A. & Bradley, D.C. (1998). Perception of three-dimensional structure-from-

Ashida, H., Lingnau, A., Wall, M.B., & Smith, A.T. (2007). fMRI adaptation reveals separate mechanisms for first-order and second-order motion. *J Neurophysiol, 97* (2), 1319-1325. Aurora, S.K., Barrodale, P., Chronicle, E.P., & Mulleners, W.M. (2005). Cortical inhibition is reduced in chronic and episodic migraine and demonstrates a spectrum of illness.

Aurora, S.K. & Wilkinson, F. (2007). The brain is hyperexcitable in migraine. *Cephalalgia, 27* 

amblyopia. *Invest Ophthalmol Vis Sci, 48* (12), 5532-5538.

perception is cue-invariant. *Journal of Vision, 8* (8), 6.

of perceived plaid direction. *Vision Res, 36* (9), 1247-1253.

motion. *Trends in Cognitive Sciences, 2* (6), 222-228.

misperception of type-ii plaid direction. *Vision Res, 37* (1), 143-150.

motion. *J Opt Soc Am A, 2* (2), 284-299.

*Nature, 300* (5892), 523-525.

*Res, 34* (14), 1823-1834.

*Science, 255* (5048), 1141-1143.

*Perception, 14* (2), 105-126.

*Headache, 45* (5), 546-552.

(12), 1442-1453.

(12), 1539-1548.


Chadaide, Z., Arlt, S., Antal, A., Nitsche, M.A., Lang, N., & Paulus, W. (2007). Transcranial direct current stimulation reveals inhibitory deficiency in migraine. *Cephalalgia, 27* (7), 833-839.

Visual Motion: From Cortex to Percept 131

Dukelow, S.P., DeSouza, J.F., Culham, J.C., van den Berg, A.V., Menon, R.S., & Vilis, T. (2001). Distinguishing subregions of the human MT+ complex using visual fields and

Dumoulin, S.O., Baker, C.L., Jr., Hess, R.F., & Evans, A.C. (2003). Cortical specialization for

Edwards, M. & Badcock, D.R. (1995). Global motion perception: No interaction between the

El-Shamayleh, Y., Kiorpes, L., Kohn, A., & Movshon, J.A. (2011). Visual motion processing by neurons in area MT of macaque monkeys with experimental amblyopia. *J Neurosci,* 

Farivar, R., Blanke, O., & Chaudhuri, A. (2009). Dorsal-ventral integration in the recognition

Ferrera, V.P. & Wilson, H.R. (1990). Perceived direction of moving two-dimensional

Fine, I. & Jacobs, R.A. (2002). Comparing perceptual learning tasks: A review. *J Vis, 2* (2),

Galletti, C., Fattori, P., Battaglini, P.P., Shipp, S., & Zeki, S. (1996). Functional demarcation of a border between areas V6 and V6a in the superior parietal gyrus of the macaque

Geesaman, B.J. & Andersen, R.A. (1996). The analysis of complex motion patterns by

Glasser, D.M. & Tadin, D. (2010). Low-level mechanisms do not explain paradoxical motion

Golomb, J.D., Beck, J.R., Ruf, B.M., Chen, J.I., Saricicek, A., Maloney, K.H., Hu, J., Chun, M.M., & Bhagwagar, Z. (2009). Enhanced visual motion processing in major depressive

Goodale, M.A. & Milner, A.D. (1992). Separate visual pathways for perception and action.

Graziano, M.S., Andersen, R.A., & Snowden, R.J. (1994). Tuning of MST neurons to spiral

Grezes, J., Fonlupt, P., Bertenthal, B., Delon-Martin, C., Segebarth, C., & Decety, J. (2001). Does perception of biological motion rely on specific brain regions? *Neuroimage, 13* (5),

Grosbras, M.H., Beaton, S., & Eickhoff, S.B. (2012). Brain regions involved in human movement perception: A quantitative voxel-based meta-analysis. *Hum Brain Mapp, 33* 

Grossberg, S., Mingolla, E., & Pack, C. (1999). A neural model of motion processing and

visual navigation by cortical area MST. *Cereb Cortex, 9* (8), 878-895.

processing first- and second-order motion. *Cereb Cortex, 13* (12), 1375-1385.

first- and second-order motion pathways. *Vision Res, 35* (18), 2589-2602.

Farivar, R. (2009). Dorsal-ventral integration in object recognition. *Brain Res Rev,* 

of motion-defined unfamiliar faces. *J Neurosci, 29* (16), 5336-5342.

form/cue invariant MSTd neurons. *J Neurosci, 16* (15), 4716-4732. Gibson, J.J. (1950). Perception of the visual world. Boston: Houghton Mifflin.

disorder. *Journal of Neuroscience, 29* (28), 9072-9077.

pursuit eye movements. *J Neurophysiol, 86* (4), 1991-2000.

*30* (36), 12198-12209.

190-203.

775-785.

(2), 431-454.

patterns. *Vision Res, 30* (2), 273-287.

monkey. *Eur J Neurosci, 8* (1), 30-52.

percepts. *J Vis, 10* (4), 20 21-29.

*Trends Neurosci, 15* (1), 20-25.

motions. *J Neurosci, 14* (1), 54-67.


progress. *Schizophr Bull, 37* (4), 709-715.

schizophrenia. *Biol Psychiatry, 64* (1), 74-77.

computational approach. *Perception, 9* (3), 253-269.

vision. *Trends Cogn Sci, 15* (10), 460-466.

hyperresponsive in migraine? *Cephalalgia, 27* (12), 1427-1439.

patterns, what is the first stage? *Vision Res, 32* (4), 691-698.

shape from fourier motion. *Vision Res, 29* (12), 1789-1813.

in lateral geniculate nucleus of macaque. *J Physiol, 357*, 219-240.

transcranial magnetic stimulation study. *Exp Brain Res, 171* (4), 558-562.

833-839.

(9), 2297-2306.

(6), 1346-1359.

*Sci, 46* (8), 3008-3012.

Chadaide, Z., Arlt, S., Antal, A., Nitsche, M.A., Lang, N., & Paulus, W. (2007). Transcranial direct current stimulation reveals inhibitory deficiency in migraine. *Cephalalgia, 27* (7),

Chen, Y. (2011). Abnormal visual motion processing in schizophrenia: A review of research

Chen, Y., Norton, D., & Ongur, D. (2008). Altered center-surround motion inhibition in

Chubb, C. & Sperling, G. (1988). Drift-balanced random stimuli: A general basis for studying

Churan, J. & Ilg, U.J. (2001). Processing of second-order motion stimuli in primate middle temporal area and medial superior temporal area. *J Opt Soc Am A Opt Image Sci Vis, 18* 

Clocksin, W.F. (1980). Perception of surface slant and edge labels from optical flow: A

Constantinescu, T., Schmidt, L., Watson, R., & Hess, R.F. (2005). A residual deficit for global motion processing after acuity recovery in deprivation amblyopia. *Invest Ophthalmol Vis* 

Coppola, G., Pierelli, F., & Schoenen, J. (2007). Is the cerebral cortex hyperexcitable or

Cowey, A., Campana, G., Walsh, V., & Vaina, L.M. (2006). The role of human extra-striate visual areas V5/MT and V2/V3 in the perception of the direction of global motion: A

de Haan, E.H. & Cowey, A. (2011). On the usefulness of 'what' and 'where' pathways in

Derrington, A.M. & Badcock, D.R. (1992). Two-stage analysis of the motion of 2-dimensional

Derrington, A.M., Badcock, D.R., & Henning, G.B. (1993). Discriminating the direction of second-order motion at short stimulus durations. *Vision Res, 33* (13), 1785-1794. Derrington, A.M. & Lennie, P. (1984). Spatial and temporal contrast sensitivities of neurones

Dittrich, W.H., Troscianko, T., Lea, S.E., & Morgan, D. (1996). Perception of emotion from dynamic point-light displays represented in dance. *Perception, 25* (6), 727-738. Dosher, B.A., Landy, M.S., & Sperling, G. (1989). Kinetic depth effect and optic flow--I. 3D

Duffy, C.J. & Wurtz, R.H. (1991a). Sensitivity of MST neurons to optic flow stimuli. I. A continuum of response selectivity to large-field stimuli. *J Neurophysiol, 65* (6), 1329-1345. Duffy, C.J. & Wurtz, R.H. (1991b). Sensitivity of mst neurons to optic flow stimuli. II. Mechanisms of response selectivity revealed by small-field stimuli. *J Neurophysiol, 65* 

non-fourier motion perception. *J Opt Soc Am A, 5* (11), 1986-2007.


Grossman, E.D., Battelli, L., & Pascual-Leone, A. (2005). Repetitive TMS over posterior STS disrupts perception of biological motion. *Vision Res, 45* (22), 2847-2853.

Visual Motion: From Cortex to Percept 133

Johnston, A., McOwan, P.W., & Buxton, H. (1992). A computational model of the analysis of some first-order and second-order motion patterns by simple and complex cells. *Proc R* 

Karas, R. & McKendrick, A.M. (2011). Age related changes to perceptual surround

Koenderink, J.J. & van Doorn, A.J. (1986). Depth and shape from differential perspective in

Landy, M.S., Dosher, B.A., Sperling, G., & Perkins, M.E. (1991). The kinetic depth effect and

Lappin, J.S., Doner, J.F., & Kottas, B.L. (1980). Minimal conditions for the visual detection of

Law, C.T. & Gold, J.I. (2008). Neural correlates of perceptual learning in a sensory-motor,

Ledgeway, T. & Hess, R.F. (2002). Failure of direction identification for briefly presented second-order motion stimuli: Evidence for weak direction selectivity of the mechanisms

Ledgeway, T., Hess, R.F., & McGraw, P.V. (2002). Masking effects between local first-order and second-order motions in the extraction of global-motion direction depend critically

Ledgeway, T. & Smith, A.T. (1994). Evidence for separate motion-detecting mechanisms for first- and second-order motion in human vision. *Vision Res, 34* (20), 2727-2740. Leventhal, A.G., Wang, Y., Pu, M., Zhou, Y., & Ma, Y. (2003). GABA and its agonists improved visual cortical function in senescent monkeys. *Science, 300* (5620), 812-815. Li, X., Mullen, K.T., Thompson, B., & Hess, R.F. (2011). Effective connectivity anomalies in

Longuet-Higgins, H.C. & Prazdny, K. (1980). The interpretation of a moving retinal image.

Lu, H., Qian, N., & Liu, Z. (2004). Learning motion discrimination with suppressed MT.

Lu, Z.L. & Sperling, G. (1995). The functional architecture of human visual motion

Lu, Z.L. & Sperling, G. (2001). Three-systems theory of human visual motion perception:

Mansouri, B. & Hess, R.F. (2006). The global processing deficit in amblyopia involves noise

Mareschal, I. & Baker, C.L., Jr. (1998). Temporal and spatial response to second-order stimuli

Marr, D. & Ullman, S. (1981). Directional selectivity and its use in early visual processing.

Mather, G. (1989). Early motion process and the kinetic depth effect. *Quarterly Journal of* 

Review and update. *J Opt Soc Am A Opt Image Sci Vis, 18* (9), 2331-2370.

suppression of moving stimuli. *Seeing Perceiving,* EPub ahead of print.

the presence of bending deformations. *J Opt Soc Am A, 3* (2), 242-249.

optic flow--II. First- and second-order motion. *Vision Res, 31* (5), 859-876.

structure and motion in three dimensions. *Science, 209* (4457), 717-719.

but not a sensory, cortical area. *Nat Neurosci, 11* (4), 505-513.

on stimulus visibility. *Perception, 31* (ECVP Abstract Supplement)

encoding motion. *Vision Res, 42* (14), 1739-1758.

human amblyopia. *Neuroimage, 54* (1), 505-516.

*Proc R Soc Lond B Biol Sci, 208* (1173), 385-397.

perception. *Vision Res, 35* (19), 2697-2722.

segregation. *Vision Res, 46* (24), 4104-4117.

in cat area 18. *J Neurophysiol, 80* (6), 2811-2823.

*Proc R Soc Lond B Biol Sci, 211* (1183), 151-180.

*Experimental Psychology, 41A* (1), 183-198.

*Vision Res, 44* (15), 1817-1825.

*Soc Lond B Biol Sci, 250* (1329), 297-306.


Johnston, A., McOwan, P.W., & Buxton, H. (1992). A computational model of the analysis of some first-order and second-order motion patterns by simple and complex cells. *Proc R Soc Lond B Biol Sci, 250* (1329), 297-306.

132 Visual Cortex – Current Status and Perspectives

711-720.

*A, 4* (3), 503-518.

induction. *Perception, 39* (11), 1452-1465.

striate cortex. *J Physiol, 195* (1), 215-243.

cortex. *J Physiol, 148*, 574-591.

*Neurosci, 5* (1), 72-75.

*Neurosci, 29* (13), 3981-3991.

*The Journal of Neuroscience, 29* (22), 7315-7329.

*Attention, Perception, & Psychophysics, 14* (2), 201-211.

opponency in visual cortex. *J Neurosci, 19* (16), 7162-7174.

perception of surface shape? *Vision Res, 40* (16), 2125-2133.

Holmes, J.M. & Clarke, M.P. (2006). Amblyopia. *Lancet, 367* (9519), 1343-1351.

with artificial squint. *Journal of Neurophysiology, 28* (6), 1041-&.

fibers in the macaque monkey. *J Comp Neurol, 146* (4), 421-450.

structure-from-motion perception. *Neural Computation, 1*, 324-333.

human areas MT and MST. *J Neurosci, 22* (16), 7195-7205.

Grossman, E.D., Battelli, L., & Pascual-Leone, A. (2005). Repetitive TMS over posterior STS

Grossman, E.D., Donnelly, M., Price, R., Pickens, D., Morgan, V., Neighbor, G., & Blake, R. (2000). Brain areas involved in perception of biological motion. *J Cogn Neurosci, 12* (5),

Grzywacz, N.M. & Hildreth, E.C. (1987). Incremental rigidity scheme for recovering structure from motion: Position-based versus velocity-based formulations. *J Opt Soc Am* 

Hanada, M. (2010). Differential effect of luminance contrast reduction and noise on motion

Heeger, D.J., Boynton, G.M., Demb, J.B., Seidemann, E., & Newsome, W.T. (1999). Motion

Hess, R.F., Thompson, B., Gole, G., & Mullen, K.T. (2009). Deficient responses from the lateral geniculate nucleus in humans with amblyopia. *Eur J Neurosci, 29* (5), 1064-1070. Hess, R.F. & Ziegler, L.R. (2000). What limits the contribution of second-order motion to the

Hildreth, E.C., Ando, H., Andersen, R.A., & Treue, S. (1995). Recovering three-dimensional structure from motion with surface reconstruction. *Vision Res, 35* (1), 117-137.

Hubel, D.H. & Weisel, T.N. (1968). Receptive fields and functional architecture of monkey

Hubel, D.H. & Wiesel, T.N. (1959). Receptive fields of single neurones in the cat's striate

Hubel, D.H. & Wiesel, T.N. (1965). Binocular interaction in striate cortex of kittens reared

Hubel, D.H. & Wiesel, T.N. (1972). Laminar and columnar distribution of geniculo-cortical

Huk, A.C., Dougherty, R.F., & Heeger, D.J. (2002). Retinotopy and functional subdivision of

Huk, A.C. & Heeger, D.J. (2002). Pattern-motion responses in human visual cortex. *Nat* 

Husain, M., Treue, S., & Andersen, R.A. (1989). Surface interpolation in three-dimensional

Huxlin, K.R., Martin, T., Kelly, K., Riley, M., Friedman, D.I., Burgin, W.S., & Hayhoe, M. (2009). Perceptual relearning of complex visual motion after V1 damage in humans. *J* 

Jastorff, J. & Orban, G.A. (2009). Human functional magnetic resonance imaging reveals separation and integration of shape and motion cues in biological motion processing.

Johansson, G. (1973). Visual perception of biological motion and a model for its analysis.

disrupts perception of biological motion. *Vision Res, 45* (22), 2847-2853.


Mather, G. & Murdoch, L. (1994). Gender discrimination in biological motion displays based on dynamic cues. *Proceedings of the Royal Society of London. Series B: Biological Sciences, 258* (1353), 273-279.

Visual Motion: From Cortex to Percept 135

Orban, G.A., Sunaert, S., Todd, J.T., Van Hecke, P., & Marchal, G. (1999). Human cortical

Palmer, J.E., Chronicle, E.P., Rolan, P., & Mulleners, W.M. (2000). Cortical hyperexcitability is cortical under-inhibition: Evidence from a novel functional test of migraine patients.

Paolini, M., Distler, C., Bremmer, F., Lappe, M., & Hoffmann, K.P. (2000). Responses to

Pelphrey, K.A., Morris, J.P., Michelich, C.R., Allison, T., & McCarthy, G. (2005). Functional anatomy of biological motion perception in posterior temporal cortex: An fMRI study of

Perrone, J.A. (1992). Model for the computation of self-motion in biological systems. *J Opt* 

Peuskens, H., Claeys, K.G., Todd, J.T., Norman, J.F., Van Hecke, P., & Orban, G.A. (2004). Attention to 3-d shape, 3-d motion, and texture in 3-d structure from motion displays. *J* 

Pitzalis, S., Sereno, M.I., Committeri, G., Fattori, P., Galati, G., Patria, F., & Galletti, C. (2010).

Qian, N. & Andersen, R.A. (1994). Transparent motion perception as detection of

Rizzo, M., Nawrot, M., & Zihl, J. (1995). Motion and shape perception in cerebral

Rodman, H.R. & Albright, T.D. (1989). Single-unit analysis of pattern-motion selective properties in the middle temporal visual area (MT). *Exp Brain Res, 75* (1), 53-64. Rudolph, K. & Pasternak, T. (1999). Transient and permanent deficits in motion perception after lesions of cortical areas MT and MST in the macaque monkey. *Cereb Cortex, 9* (1),

Saito, H., Yukie, M., Tanaka, K., Hikosaka, K., Fukada, Y., & Iwai, E. (1986). Integration of direction signals of image motion in the superior temporal sulcus of the macaque

Salzman, C.D., Murasugi, C.M., Britten, K.H., & Newsome, W.T. (1992). Microstimulation in visual area MT: Effects on direction discrimination performance. *J Neurosci, 12* (6), 2331-

Saygin, A.P., Wilson, S.M., Hagler, D.J., Jr., Bates, E., & Sereno, M.I. (2004). Point-light biological motion perception activates human premotor cortex. *J Neurosci, 24* (27), 6181-

Scott-Samuel, N.E. & Georgeson, M.A. (1999). Does early non-linearity account for second-

continuously changing optic flow in area MST. *J Neurophysiol, 84* (2), 730-743. Peelen, M.V., Wiggett, A.J., & Downing, P.E. (2006). Patterns of fMRI activity dissociate overlapping functional brain areas that respond to biological motion. *Neuron, 49* (6),

eye, mouth and hand movements. *Cereb Cortex, 15* (12), 1866-1876.

Human V6: The medial motion area. *Cereb Cortex, 20* (2), 411-424.

unbalanced motion signals. II. Physiology. *J Neurosci, 14* (12), 7367-7380.

regions involved in extracting depth from motion. *Neuron, 24* (4), 929-940.

*Cephalalgia, 20* (6), 525-532.

*Soc Am A, 9* (2), 177-194.

*Cogn Neurosci, 16* (4), 665-682.

akinetopsia. *Brain, 118 ( Pt 5)*, 1105-1127.

monkey. *J Neurosci, 6* (1), 145-157.

order motion? *Vision Res, 39* (17), 2853-2865.

815-822.

90-100.

2355.

6188.


Orban, G.A., Sunaert, S., Todd, J.T., Van Hecke, P., & Marchal, G. (1999). Human cortical regions involved in extracting depth from motion. *Neuron, 24* (4), 929-940.

134 Visual Cortex – Current Status and Perspectives

*Proc Biol Sci, 249* (1325), 149-155.

displays. *Perception, 22* (7), 759-766.

*258* (1353), 273-279.

*11* (4), 994-1001.

(12), 1322-1328.

11.

3254.

99-116.

*Neurophysiol, 68* (1), 164-181.

Mather, G. & Murdoch, L. (1994). Gender discrimination in biological motion displays based on dynamic cues. *Proceedings of the Royal Society of London. Series B: Biological Sciences,* 

Mather, G., Radford, K., & West, S. (1992). Low-level visual processing of biological motion.

Mather, G. & West, S. (1993). Recognition of animal locomotion from dynamic point-light

Meese, T.S. & Anderson, S.J. (2002). Spiral mechanisms are required to account for

Meese, T.S. & Harris, M.G. (2001). Independent detectors for expansion and rotation, and for

Merabet, L., Desautels, A., Minville, K., & Casanova, C. (1998). Motion integration in a

Merigan, W.H., Katz, L.M., & Maunsell, J.H. (1991). The effects of parvocellular lateral geniculate lesions on the acuity and contrast sensitivity of macaque monkeys. *J Neurosci,* 

Morrone, M.C., Burr, D.C., Di Pietro, S., & Stefanelli, M.A. (1999). Cardinal directions for

Morrone, M.C., Tosetti, M., Montanaro, D., Fiorentini, A., Cioni, G., & Burr, D.C. (2000). A cortical area that responds specifically to optic flow, revealed by fMRI. *Nat Neurosci, 3* 

Movshon, J.A., Adelson, E.H., Gizzi, M.S., & Newsome, W.T. (1985). The analysis of moving visual patterns. In: C. Chagass, R. Gattass, & C. Gross (Eds.), *Pattern Recognition* 

Newsome, W.T. & Pare, E.B. (1988). A selective impairment of motion perception following

Nishida, S. (2011). Advancement of motion psychophysics: Review 2001-2010. *J Vis, 11* (5),

Nishida, S., Ledgeway, T., & Edwards, M. (1997). Dual multiple-scale processing for motion

Nishida, S., Sasaki, Y., Murakami, I., Watanabe, T., & Tootell, R.B. (2003). Neuroimaging of direction-selective mechanisms for second-order motion. *J Neurophysiol, 90* (5), 3242-

O'Keefe, L.P. & Movshon, J.A. (1998). Processing of first- and second-order motion signals

Olavarria, J.F., DeYoe, E.A., Knierim, J.J., Fox, J.M., & van Essen, D.C. (1992). Neural responses to visual texture patterns in middle temporal area of the macaque monkey. *J* 

Oram, M.W. & Perrett, D.I. (1994). Responses of anterior superior temporal polysensory (STPA) neurons to "biological motion" stimuli. *Journal of Cognitive Neuroscience, 6* (2),

by neurons in area MT of the macaque monkey. *Vis Neurosci, 15* (2), 305-317.

lesions of the middle temporal visual area (MT). *J Neurosci, 8* (6), 2201-2211.

summation of complex motion components. *Vision Res, 42* (9), 1073-1080.

orthogonal components of deformation. *Perception, 30* (10), 1189-1202.

thalamic visual nucleus. *Nature, 396* (6708), 265-268.

visual optic flow. *Curr Biol, 9* (14), 763-766.

*Mechanisms* (pp. 117-151). Rome: Vatican Press.

in the human visual system. *Vision Res, 37* (19), 2685-2698.


Seiffert, A.E., Somers, D.C., Dale, A.M., & Tootell, R.B. (2003). Functional MRI studies of human visual motion perception: Texture, luminance, attention and after-effects. *Cereb Cortex, 13* (4), 340-349.

Visual Motion: From Cortex to Percept 137

Thompson, B., Aaen-Stockdale, C.R., Mansouri, B., & Hess, R.F. (2008a). Plaid perception is

Thompson, B., Hansen, B.C., Hess, R.F., & Troje, N.F. (2007). Peripheral vision: Good for

Thompson, B. & Liu, Z. (2006). Learning motion discrimination with suppressed and un-

Thompson, B., Richard, A., Churan, J., Hess, R.F., Aaen-Stockdale, C., & Pack, C.C. (2011). Impaired spatial and binocular summation for motion direction discrimination in

Thompson, B., Troje, N.F., Hansen, B.C., & Hess, R.F. (2008b). Amblyopic perception of

Thompson, B., Villeneuve, M.Y., Casanova, C., & Hess, R.F. (2012). Abnormal cortical processing of pattern motion in amblyopia: Evidence from fMRI. *Neuroimage, 60* (2),

Tinsley, C.J., Webb, B.S., Barraclough, N.E., Vincent, C.J., Parker, A., & Derrington, A.M. (2003). The nature of V1 neural responses to 2d moving patterns depends on receptive-

Tootell, R.B., Reppas, J.B., Kwong, K.K., Malach, R., Born, R.T., Brady, T.J., Rosen, B.R., & Belliveau, J.W. (1995). Functional analysis of human MT and related visual cortical

Treue, S., Andersen, R.A., Ando, H., & Hildreth, E.C. (1995). Structure-from-motion:

Troje, N.F. (2002). Decomposing biological motion: A framework for analysis and synthesis

Troje, N.F. & Westhoff, C. (2006). The inversion effect in biological motion perception:

Ullman, S. (1984). Maximizing rigidity: The incremental recovery of 3-d structure from rigid

Ungerleider, L.G. & Haxby, J.V. (1994). 'What' and 'where' in the human brain. *Curr Opin* 

Ungerleider, L.G. & Mishkin, M. (1982). Two cortical visual systems. In: D.J. Ingle, M.A. Goodale, & R.J.W. Mansfield (Eds.), *Analysis of Visual Behavior* (pp. 549-586): MIT Press. van Santen, J.P. & Sperling, G. (1985). Elaborated reichardt detectors. *J Opt Soc Am A, 2* (2),

Villeneuve, M.Y., Kupers, R., Gjedde, A., Ptito, M., & Casanova, C. (2005). Pattern-motion

Wallach, H. & O'Connell, D.N. (1953). The kinetic depth effect. *J Exp Psychol, 45* (4), 205-217. Wallisch, P. & Kumbhani, R.D. (2009). Can major depression improve the perception of

Watson, A.B. & Ahumada, A.J., Jr. (1985). Model of human visual-motion sensing. *J Opt Soc* 

field structure in the marmoset monkey. *J Neurophysiol, 90* (2), 930-937.

areas using magnetic resonance imaging. *J Neurosci, 15* (4), 3215-3230.

Perceptual evidence for surface interpolation. *Vision Res, 35* (1), 139-148.

only subtly impaired in strabismic amblyopia. *Vision Res, 48* (11), 1307-1314.

biological motion, bad for signal noise segregation? *J Vis, 7* (10), 12 11-17.

suppressed MT. *Vision Res, 46* (13), 2110-2121.

strabismic amblyopia. *Vision Res, 51* (6), 577-584.

biological motion. *J Vis, 8* (4), 22.21-2214.

of human gait patterns. *J Vis, 2* (5), 371-387.

and nonrigid motion. *Perception, 13* (3), 255-274.

*Neurobiol, 4* (2), 157-165.

*Am A, 2* (2), 322-341.

300-321.

Evidence for a "Life detector"? *Curr Biol, 16* (8), 821-824.

selectivity in the human pulvinar. *Neuroimage, 28* (2), 474-480.

visual motion? *Journal of Neuroscience, 29* (46), 14381-14382.

1307-1315.


Thompson, B., Aaen-Stockdale, C.R., Mansouri, B., & Hess, R.F. (2008a). Plaid perception is only subtly impaired in strabismic amblyopia. *Vision Res, 48* (11), 1307-1314.

136 Visual Cortex – Current Status and Perspectives

*Cortex, 13* (4), 340-349.

Seiffert, A.E., Somers, D.C., Dale, A.M., & Tootell, R.B. (2003). Functional MRI studies of human visual motion perception: Texture, luminance, attention and after-effects. *Cereb* 

Servos, P., Osu, R., Santi, A., & Kawato, M. (2002). The neural substrates of biological motion

Shariat, H. & Price, K.E. (1990). Motion estimation with more than 2 frames. *IEEE* 

Simmers, A.J., Ledgeway, T., Hess, R.F., & McGraw, P.V. (2003). Deficits to global motion

Simmers, A.J., Ledgeway, T., Mansouri, B., Hutchinson, C.V., & Hess, R.F. (2006). The extent

Smith, A.T., Greenlee, M.W., Singh, K.D., Kraemer, F.M., & Hennig, J. (1998). The processing of first- and second-order motion in human visual cortex assessed by functional

Smith, A.T. & Ledgeway, T. (1997). Separate detection of moving luminance and contrast

Snowden, R.J. & Milne, A.B. (1996). The effects of adapting to complex motions: Position invariance and tuning to spiral motions. *Journal of Cognitive Neuroscience, 8*, 435-452. Snowden, R.J., Treue, S., Erickson, R.G., & Andersen, R.A. (1991). The response of area MT

Tadin, D., Kim, J., Doop, M.L., Gibson, C., Lappin, J.S., Blake, R., & Park, S. (2006). Weakened center-surround interactions in visual motion processing in schizophrenia. *J* 

Tadin, D., Lappin, J.S., Gilroy, L.A., & Blake, R. (2003). Perceptual consequences of centresurround antagonism in visual motion processing. *Nature, 424* (6946), 312-315. Tadin, D., Silvanto, J., Pascual-Leone, A., & Battelli, L. (2011). Improved motion perception and impaired spatial suppression following disruption of cortical area MT/V5. *J* 

Tanaka, K., Fukada, Y., & Saito, H.A. (1989). Underlying mechanisms of the response specificity of expansion/contraction and rotation cells in the dorsal part of the medial

Tanaka, K. & Saito, H. (1989). Analysis of motion of the visual field by direction, expansion/contraction, and rotation cells clustered in the dorsal part of the medial

superior temporal area of the macaque monkey. *J Neurophysiol, 62* (3), 642-656. Tanaka, K., Hikosaka, K., Saito, H., Yukie, M., Fukada, Y., & Iwai, E. (1986). Analysis of local and wide-field movements in the superior temporal visual areas of the macaque

superior temporal area of the macaque monkey. *J Neurophysiol, 62* (3), 626-641. Thompson, B., Aaen-Stockdale, C., Koski, L., & Hess, R.F. (2009). A double dissociation between striate and extrastriate visual cortex for pattern motion perception revealed

*Transactions on Pattern Analysis and Machine Intelligence, 12* (5), 417-434.

of the dorsal extra-striate deficit in amblyopia. *Vision Res, 46*, 2571-2580.

magnetic resonance imaging (fMRI). *J Neurosci, 18* (10), 3816-3830.

and V1 neurons to transparent motion. *J Neurosci, 11* (9), 2768-2785.

perception: An fMRI study. *Cereb Cortex, 12* (7), 772-782.

processing in human amblyopia. *Vision Res., 43*, 729-738.

modulations: Fact or artifact? *Vision Res, 37* (1), 45-62.

*Neurosci, 26* (44), 11403-11412.

*Neurosci, 31* (4), 1279-1283.

monkey. *J Neurosci, 6* (1), 134-144.

using rTMS. *Human Brain Mapping, 30* (10), 3115-3126.


Wilson, H.R., Ferrera, V.P., & Yo, C. (1992). A psychophysically motivated model for twodimensional motion perception. *Vis Neurosci, 9* (1), 79-97.

**Chapter 6** 

© 2012 Cameron and Binsted, licensee InTech. This is an open access chapter distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


**Visual Processing in the Action-Oriented Brain**

Brendan D. Cameron and Gordon Binsted

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/48461

## **1. Introduction**

Visual processing does not occur in passive systems. We and our evolutionary ancestors are and were acting creatures, and our visual systems evolved to enable effective locomotion and action. Indeed, sensation of the external world is only evolutionarily meaningful to the extent that the sensing organism can respond to the input (accounting for the dearth of eyes in the plant kingdom), and if vision's purpose is action, our understanding of the visual brain is likely to be well served by studying vision in an action context. Even for our more sedentary activities, like reading, watching TV, or chatting with a friend, our eyes are in constant motion, actively gathering information. We are virtually incapable of passive vision. Our goals in this chapter are to highlight some of the ways in which seeing is coupled to action and to show that our understanding of visual processing can be enhanced by considering its relationship to action. To make our case, we will demonstrate that action has access to visual information that perception does not, and we will demonstrate that actions and action plans influence what we consciously perceive.

We will begin the chapter with an overview of cortical substrates for visuomotor processing and prevailing theories about the division of visual-processing labour in the cortex. We will then discuss findings from neuropsychological and behavioural studies and what they have taught us about the complex and sometimes counter-intuitive relationship between vision and action. Our approach might be considered ecological to the extent that it stresses vision's tight coupling with action processing. Indeed, our analysis focuses mainly on functions that are considered to be under the purview of the visuomotor 'dorsal stream' in the posterior parietal cortex and, as we discuss in the next section, this visual stream is thought to be involved in the direct transformation of vision into action. It has previously been suggested that the visual functions of the dorsal stream might be those for which an ecological approach to vision is appropriate, in contrast to the visual functions of the perceptual 'ventral stream,' for which a constructivist approach to vision may be more fitting [1]. We are sympathetic to this view and to the idea that a greater respect for vision's


## **Visual Processing in the Action-Oriented Brain**

Brendan D. Cameron and Gordon Binsted

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/48461

## **1. Introduction**


Visual processing does not occur in passive systems. We and our evolutionary ancestors are and were acting creatures, and our visual systems evolved to enable effective locomotion and action. Indeed, sensation of the external world is only evolutionarily meaningful to the extent that the sensing organism can respond to the input (accounting for the dearth of eyes in the plant kingdom), and if vision's purpose is action, our understanding of the visual brain is likely to be well served by studying vision in an action context. Even for our more sedentary activities, like reading, watching TV, or chatting with a friend, our eyes are in constant motion, actively gathering information. We are virtually incapable of passive vision. Our goals in this chapter are to highlight some of the ways in which seeing is coupled to action and to show that our understanding of visual processing can be enhanced by considering its relationship to action. To make our case, we will demonstrate that action has access to visual information that perception does not, and we will demonstrate that actions and action plans influence what we consciously perceive.

We will begin the chapter with an overview of cortical substrates for visuomotor processing and prevailing theories about the division of visual-processing labour in the cortex. We will then discuss findings from neuropsychological and behavioural studies and what they have taught us about the complex and sometimes counter-intuitive relationship between vision and action. Our approach might be considered ecological to the extent that it stresses vision's tight coupling with action processing. Indeed, our analysis focuses mainly on functions that are considered to be under the purview of the visuomotor 'dorsal stream' in the posterior parietal cortex and, as we discuss in the next section, this visual stream is thought to be involved in the direct transformation of vision into action. It has previously been suggested that the visual functions of the dorsal stream might be those for which an ecological approach to vision is appropriate, in contrast to the visual functions of the perceptual 'ventral stream,' for which a constructivist approach to vision may be more fitting [1]. We are sympathetic to this view and to the idea that a greater respect for vision's behavioural role will assist future vision science. And though we will not be endorsing Gibson's direct perception approach to vision [2] or declaring allegiance to a strictly ecological approach to vision, the spirit of our argument will echo J.J. Gibson's contention that vision relies on the moving, acting person in their environment.


## **2. Cortical substrates for visuomotor processing**

In the human primary visual pathway, visual input from the retina proceeds, via the lateral geniculate nucleus, to primary visual cortex (V1), located in the occipital lobe. Visual information then proceeds to extrastriate visual areas in the occipital lobe, posterior parietal lobe, and temporal lobe. Our discussion will focus mainly on functional areas within the posterior parietal cortex (PPC), which are thought to be involved in the preparation and control of visually guided actions. Some of the areas of the PPC that will be particularly relevant to the discussion are the parietal-occipital junction (POJ), the superior parietal lobule (SPL), the intraparietal sulcus (IPS), and the inferior parietal lobule (IPL). Studies in patient populations have shed light on the functions of these areas, and we address some of this work later in the section. First, however, we discuss three theories of cortical organization for visual processing that have significantly advanced our understanding of extrastriate processing: Mishkin, Ungerleider, and Macko's [3] two visual streams hypothesis, Goodale and Milner's [4,5] Perception-Action Model (PAM), and Glover's [6] Planning-Control Model (PCM). These theories all posit branching paths of visual output from V1, and their differences lie in the functions they assign to each visual processing stream. The PAM and the PCM are particularly relevant to our thesis, for they highlight regions of parietal cortex devoted to the translation of vision into action.

Mishkin, Ungerleider, and Macko [3] proposed that the primate visual system is divided into two cortical streams: a dorsal stream projecting from primary visual cortex to posterior parietal cortex (PPC) and a ventral stream projecting from primary visual cortex to inferior temporal cortex (IT). They suggested that the dorsal stream is responsible for processing visual information relating to object location ('where'), while the ventral stream is responsible for processing visual information relating to object identity ('what'). This functional division was supported by evidence from monkeys with lesioned PPC, who were impaired in their ability to select a target based on its relationship to a landmark object, and from monkeys with lesioned IT, who were impaired in their ability to select a target based on its shape and surface patterning.

Goodale and Milner [4,5] suggested an important modification to the two visual streams hypothesis. They argued that the functions of the two streams should be considered in terms of the purpose for which the visual information is being processed. According to the PAM, the dorsal stream processes visual information for action ('how') and the ventral stream processes visual information for perception ('what' and 'perceptual where'). One of the key pieces of evidence for Milner and Goodale's model was the perceptual and motor performance of D.F., a patient with visual form agnosia. D.F.'s ability to identify objects and their shapes is dramatically impaired. However, her ability to reach to and grasp objects is largely preserved. For example, D.F. can accurately rotate a card to fit through a slot during a posting action, but will fail to perceptually report the orientation of the slot [7]. D.F. suffered damage to her lateral occipital cortex (LOC) [8] in the ventral stream but has a largely intact PPC, and Milner and Goodale [5] inferred that the PPC was responsible for her preserved visuomotor function. Milner and Goodale contrast D.F.'s preserved motor abilities and pattern of cortical damage with those of patients with optic ataxia (a condition we describe in more detail later), who experience impaired goal-directed action and tend to have lesions to areas within the PPC [5].


Additional support for the PAM was provided by studies in non-patient participants interacting with visual illusions, which were shown to fool perceptual reports but not goal-directed actions [e.g. 9,10]. For instance, when participants responded to the central circle in the Ebbinghaus illusion, a size-contrast illusion in which the central circle appears larger when surrounded by smaller circles, perceptual reports were more susceptible to the illusion than actions were [9]. These findings were consistent with the idea that the ventral stream considers relationships among objects, driving the size-contrast illusion effects, while the dorsal stream considers only the action-relevant parameters of the target object. However, conflicting findings across studies and issues relating to task differences between perceptual and reach tasks ultimately weakened – though may not have defeated – the case from the illusion literature, and we refer the reader elsewhere for an extensive discussion of the illusion controversy [11,12].

Milner and Goodale's [5] reformulation of the two visual streams hypothesis was possible thanks to a consideration of the relationship between action and vision. They shifted the focus from the *kinds* of visual information (spatial vs. identity) to the *behavioural role* of the visual information (acting vs. perceiving). Thus, Milner and Goodale's [5] model provides an example of how thinking about action advanced our understanding of vision.

Glover's [6] PCM, like the PAM, also emerged from a focus on the role of action in visual processing. Indeed, it shares much in common with the PAM; both models assign perceptual/cognitive and visuomotor processes to different cortical streams. Where the PCM differs from the PAM is in its proposal that different visual information is used for movement planning than for movement control and in its contention that distinct cortical areas serve as substrates for the two phases of action. To assist in clarifying the differences between the PCM and the PAM, we return, briefly, to the PAM and its view of movement preparation and control.

According to the PAM, the role of the ventral stream is to permit a high-level understanding of one's visual environment and the relationships among the objects within it. Information derived from ventral stream processing allows one to select, based on current goals, an object for action. Once this is done, control is passed to the dorsal stream, which carries out the specification of the movement parameters and monitors online performance. This action preparation and online control is, according to Milner and Goodale [5], carried out within the superior parietal lobe (SPL) and the intraparietal sulcus (IPS).

The PCM's differences from the PAM are related primarily to what factors influence movement preparation and what cortical areas are responsible for different kinds of visuomotor processing. The PCM proposes a third stream, in the inferior parietal lobe (IPL), that is responsible for movement preparation. This stream, according to Glover [6], considers non-spatial target factors (e.g. object weight) and contextual elements (e.g. background motion), which inform -- and, in the case of visual illusions, fool -- the initial preparation of the movement. Whereas the PAM places both movement preparation and online control within the same 'how' stream in the SPL and IPS, Glover's PCM separates the initial phase of movement production (in the IPL) from real-time control (in the SPL), and argues that separate representations underlie each phase of the movement. One of the pieces of behavioural evidence for the PCM was a careful examination of the unfolding movement during actions towards visual illusions. Glover and Dixon [13] showed that the effect of an illusion on grip aperture was stronger at the start of the movement and diminished as the movement progressed, potentially indicating that movement planning and movement control were drawing upon different representations of the target and its environment.
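The decaying illusion effect that Glover and Dixon report can be caricatured as a weighted blend of two representations. The sketch below is our own illustration, not a model from the chapter: it assumes a simple linear hand-off from a biased planning representation to a veridical control representation, and the target size and illusion bias values are hypothetical.

```python
# Toy sketch of the PCM's planning/control blend (illustrative only; the
# linear hand-off and all numerical values are our own assumptions).

def grip_aperture(t, target_size=40.0, illusion_bias=4.0):
    """Predicted grip aperture (mm) at normalised movement time t in [0, 1].

    The planning representation is fooled by the illusion (target_size +
    illusion_bias); the control representation is veridical (target_size).
    Weighting shifts linearly from planning to control as the reach unfolds,
    so the illusion's effect on aperture shrinks over the movement.
    """
    planning_weight = 1.0 - t   # planning dominates early...
    control_weight = t          # ...online control dominates late
    return (planning_weight * (target_size + illusion_bias)
            + control_weight * target_size)

# Illusion effect (deviation from true size) early vs. late in the reach:
early_effect = grip_aperture(0.1) - 40.0   # 3.6 mm
late_effect = grip_aperture(0.9) - 40.0    # 0.4 mm
```

Any monotonic hand-off would produce the same qualitative prediction -- an illusion effect that is largest at movement onset and shrinks as online control takes over -- which is the pattern used to argue for separate planning and control representations.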


Glover [6] is not alone in arguing for a third visual stream. Rizzolatti and Matelli [14] have argued for a division of labour within the dorsal stream: a dorsal-dorsal stream in the SPL that is responsible for online control and a ventral-dorsal stream in the IPL that is involved in both action and perception. Pisella et al. [15] have also argued for more than two visual streams. We will encounter some of the evidence for a divided dorsal stream in the section on optic ataxia.

## **2.1. Evidence from patient populations**

We turn next to visual deficits in patients who have suffered damage to different areas of visual cortex. We have chosen to focus on three conditions: blindsight, because it provides a dramatic example of vision for action without conscious awareness; optic ataxia, because it involves damage to the PPC and is informative with regard to action processing in that region; and hemispatial neglect, because it provides an important contrast to optic ataxia and also provides insight into the role of attention in action.

## **2.2. Blindsight**

The term 'blindsight' is typically used to describe the phenomenon in patients with lesioned V1 who report being unaware of objects in their blind visual field yet remain able to access some visual information about objects presented within it. For instance, patients can locate stimuli in their blind field for which they do not report any conscious awareness [16]. There has been considerable debate regarding the implications of blindsight for conscious visual processing and whether incomplete lesioning of V1 can explain performance in blindsight patients (see [17,18] for reviews). We will not outline the debate here; rather, we wish to highlight the important point that several of the behaviours observed in blindsight -- looking at a target [19], pointing to a target [16], or even reaching to post a letter through a slot [20] -- involve acting without consciously seeing. These behaviours provide examples of situations where action is able to access visual information that perception cannot. The phenomenon is analogous to the one described in the visual agnosia patient D.F., who can interact with objects of which she is perceptually unaware [5]. Indeed, Milner and Goodale [5] have previously drawn attention to the implications of blindsight for the PAM, and have suggested that projections from subcortical structures to areas in the PPC may permit the preserved action in blindsight. We refer the reader to that text for a more detailed discussion of the evidence.

A recent blindsight study by Whitwell et al. [21] does provide indirect support for Milner and Goodale's [5] PAM model. Whitwell et al. [21] examined real-time and delayed grasping performance in a blindsight patient (S.J.) when she reached to targets presented in her blind field. They found that S.J. scaled her grip aperture to target size when movements were initiated while the target was 'visible' (i.e., not occluded by the experimenter), but failed to scale her grip aperture when the target was occluded 2 seconds prior to the imperative stimulus. Furthermore, S.J. was incapable of perceptually reporting the size of the target presented in her blind field. These findings are consistent with Milner and Goodale's [5] suggestion that the dorsal stream has no memory and only processes currently available visual information; however, this inference does rely on the assumption that dorsal stream processing is preserved in S.J.

Independent of any dorsal/ventral considerations, blindsight studies show that goal-directed actions can uncover visual function that perceptual reports may not. Action's ability to tap into non-conscious vision has also been observed in non-patients: In a later section we describe a behavioural study [22] that reveals movement scaling to perceptually inaccessible target size, an effect that mimics the non-conscious reach performance in blindsight. That study suggests that the putative dorsal-stream processing in blindsight patients naturally occurs in non-patient participants, whose visuomotor systems are able to see what perception cannot.

## **2.3. Optic ataxia**


Optic ataxia is a motor disorder characterized by deficits in goal-directed reaching, and neuropsychological investigations of the brain lesions associated with optic ataxia have contributed considerably to our understanding of visuomotor processing. However, the nature of the disorder is complex, and some of its implications for our understanding of the visuomotor dorsal stream are not yet clear.

One of the main pieces of evidence presented by Milner and Goodale [5] for the PAM is the proposed double-dissociation between visual agnosia and optic ataxia. The preserved motor function and impaired object perception in D.F., who has lesioned ventral stream and intact PPC, contrasts with the impaired motor function and preserved object perception in patients with optic ataxia, who have intact ventral streams and lesioned PPC. Other researchers, however, have questioned the validity of this double-dissociation [15,6].

Part of the difficulty with interpreting performance in optic ataxia stems from the observation that motor performance to targets presented in central vision is often comparable to that of controls; major performance deficits appear only when targets are presented in the visual periphery [23]. In other words, when patients are able to fixate the target, they can accurately reach to it. However, there is evidence that subtle movement deficits can be detected when the target is in central vision. A study by Pisella et al. [24], for instance, showed that an optic ataxia patient, I.G., who has bilateral lesions to the PPC, was much slower and less fluid than controls in correcting her movements online when a target in central vision was displaced during a reach (though this effect, too, may be explained by the central/peripheral distinction, for the displacement moved the target away from fixation). Such findings have been taken to indicate that the dorsal stream may be more important for on-line movement control than it is for the initial parameterization of movements [25,6].

Visual Processing in the Action-Oriented Brain 145


Milner and Goodale [26] have outlined some evidence that counters this view of an 'on-line only' dorsal stream, noting preparation deficits in optic ataxia as well as the preserved movement preparation, not just preserved on-line control, in the visual agnosia patient D.F. However, Milner and Goodale [26] do not directly address the central vs. peripheral vision discrepancy observed in optic ataxia [15,25], leaving the optic ataxia/visual agnosia double-dissociation question unresolved. At the very least, however, research on optic ataxia suggests that regions within the SPL are involved in transforming visual input to motor output. Whether the SPL's function is restricted to *on-line* visuomotor processing has yet to be determined.

More recently, Pisella et al. [27] have suggested that the evidence from optic ataxia indicates that one of the key functions of the dorsal stream is the spatial coding of targets in an eye-centered coordinate frame. Pisella et al. [27] assign this spatial coding function specifically to the parietal-occipital junction (POJ), a common lesion site in patients with optic ataxia. This account helps explain the peripheral target deficit observed in optic ataxia. Pisella et al. [27] further argue that dorsal stream function is important for both action *and* perception. This claim is supported by a recent study [28], which showed that optic ataxia patient I.G. was not only impaired in her on-line responses to a target displacement, but that she was also impaired in her perceptual report of the same target displacement.

Pisella et al. [27] also raise the important point that the dorsal stream probably has some role in perception for another reason: Areas within the dorsal stream are thought to be involved in attention orienting, which is fundamental to perceptual processing. Later in the chapter we address the important links between attention, perception, and action. In anticipation of that discussion, we provide first an overview of hemispatial neglect, a disorder of attention and spatial representation that, like optic ataxia, typically results from damage to regions within the parietal cortex.

#### **2.4. Hemispatial neglect**

Patients with hemispatial neglect suffer from a tendency to ignore half of their visual field, failing to acknowledge or interact with objects in the neglected field unless strongly encouraged to do so. Their performance deficits are generally considered distinct from those of optic ataxia patients, and the reaching deficits in optic ataxia are thought to be related to damage to the SPL or POJ, while the performance deficits in neglect are thought to be related to damage to the IPL [29].


Hemispatial neglect theoretically provides an interesting comparison to optic ataxia, for it should allow researchers to examine the relationship between attention and visuomotor control without the complicating visuomotor deficits present in optic ataxia. It also allows researchers to examine the impact, or lack thereof, of impaired visual awareness on motor function. However, one of the challenges researchers face when interpreting performance in neglect patients lies in ruling out the possibility that any visuomotor deficits observed in these patients arise from cortical damage that extends into visuomotor areas. For instance, Himmelbach and Karnath [30] have suggested that superior temporal cortex, rather than IPL, is directly responsible for the deficits of perceptual space representation found in neglect, and that the motor deficits found in some patients with neglect might stem from damage that extends to the IPL, a region they argue is involved in spatial coding for motor function, but which is not involved in the cognitive spatial coding that characterizes neglect. More recently, Himmelbach et al. [31] have argued that the neglect-specific effects of space representation are specifically linked to lesion sites at the superior temporal gyrus and temporo-parietal junction. They have also suggested that real-time motor control functions, such as those observed in optic ataxia, are supported by the POJ, an argument that aligns with Pisella et al.'s [27].

As mentioned earlier, the motor deficits of optic ataxia are particularly prominent when participants reach for targets presented in the visual periphery. This contrasts with the visuomotor performance of patients with neglect, who generally exhibit accurate motor performance to objects presented in their neglected field [31]. Although motor deficits have been observed in neglect patients, these tend to be relatively minor compared to the deficits of optic ataxia patients. One of the motor performance deficits that has been found in neglect patients is a delay in the initiation of reaching movements into their neglected visual field. Some studies also indicate minor impairments in online performance, whereas others show an absence of any deficits in online control (see [32] for a review). Himmelbach et al. [30,31] argue that when a proper control group is used (i.e., patients with parietal damage who do not exhibit neglect), hemispatial neglect is not associated with any impairments in movement control. However, a recent study by Rossit et al. [33] suggests that neglect patients may be slower to correct their movements online compared to both healthy controls and right hemisphere patients without neglect.

In their study, Rossit et al. [33] used a target-jump design modeled after Pisella et al.'s [24], in which a participant is tasked with either going to a target when it is displaced at movement onset (location-go) or trying to stop their movement as soon as the target is displaced (location-stop). The location-stop condition allows the researcher to probe the automaticity of the online corrections; any deviations toward the target that occur in this condition can be attributed to automatic online control. In the location-go condition, neglect patients were slower than the control groups, by 80-100 ms, to correct their movements online when the target jumped into their neglected field. Endpoint accuracy, however, was equivalent across the groups. In the location-stop condition, neglect patients exhibited an equivalent number of online corrections to the control groups. The neglect patients, in fact, had greater difficulty stopping their movements than the participants in the control groups. These results suggest that the 'automatic pilot' is intact in neglect patients. However, the results also suggest that visuomotor processing in the neglected field is slowed, perhaps as a result of impaired attention-for-action in that field.
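The correction-latency measure at the heart of such target-jump studies can be sketched in a few lines. The sketch below is purely illustrative: the function name, velocity threshold, and toy trajectory are our own assumptions rather than details of Rossit et al.'s or Pisella et al.'s methods; it shows only how an online-correction latency (the quantity on which an 80-100 ms slowing would be observed) might be extracted from hand-position data.

```python
import numpy as np

def correction_latency(lateral_pos, dt=0.001, jump_time=0.0, threshold=0.05):
    """Latency (s) from target jump to onset of a lateral correction.

    lateral_pos : 1-D array of hand position along the jump axis (m),
                  sampled every `dt` seconds starting at `jump_time`.
    threshold   : lateral speed (m/s) counted as a correction onset
                  (an arbitrary, illustrative value).
    """
    velocity = np.gradient(lateral_pos, dt)     # numerical derivative
    moving = np.abs(velocity) > threshold
    if not moving.any():
        return None                             # no correction detected
    return jump_time + np.argmax(moving) * dt   # first supra-threshold sample

# Toy trajectory: the hand stays on course for 250 ms after the jump, then
# veers toward the displaced target at 0.4 m/s. On this measure, a neglect
# patient's onset would simply occur ~80-100 ms later than a control's.
t = np.arange(0.0, 0.6, 0.001)
lateral = np.where(t < 0.25, 0.0, 0.4 * (t - 0.25))
print(correction_latency(lateral))  # 0.25
```

The same routine applied to location-stop trials would reveal the automatic deviations toward the jumped target that participants produce even while trying to stop.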


Some authors have suggested that optic ataxia and hemispatial neglect represent a double dissociation (e.g. [31,32]). In support of this view, a recent case study showed that real-time grasping was preserved in a neglect patient's neglected field, whereas delayed grasping was dramatically impaired [34]. These outcomes are the inverse of those in optic ataxia patient I.G., who exhibits impaired real-time grasping but actually improves when asked to execute a delayed pantomime grasp [35].

The relative absence of major motor deficits in hemispatial neglect provides a further piece of evidence for action having access to visual information that perception does not. As noted at the outset of this section, patients with hemispatial neglect have a failure in the perceptual representation of part of their visual world. Their visuomotor system's preserved ability to reach to and grasp objects within this neglected field is suggestive of a perception/action dissociation, though it may not fall precisely along ventral/dorsal lines. A recent review by Harvey and Rossit [29] provides a comprehensive overview of visuomotor function in hemispatial neglect, and we direct the interested reader there for a fuller account of the syndrome's complexities and its implications for the functional organization of parietal and temporal cortex.

#### **2.5. Section summary**

We overviewed three theories of cortical organization for visual processing and discussed neuropsychological findings from blindsight, optic ataxia, and hemispatial neglect. The picture that emerges is one of a modularized PPC, with current evidence favouring the POJ and the SPL as key sites for real-time visuomotor computations. Critically, the visuomotor processing carried out by these areas appears to proceed automatically, without mediation by conscious visual processing. These sites are implicated in direct visual-to-motor transformations, and they are areas whose visual functions can only be probed by engaging participants in goal-directed movement tasks.

The visuomotor role of the IPL is somewhat less clear. It is a common lesion site in hemispatial neglect, which may implicate it in the orienting of attention. Glover [6] has argued that the IPL is important for movement planning, and neglect patients with damage to that area do tend to exhibit more motor deficits than neglect patients with undamaged IPL [30], which provides some support for Glover's assertion. At the same time, the breakdown in the cognitive spatial representation that is associated with damage to superior areas of the temporal cortex [31], a spatial deficit that does not appear to undermine action control, is consistent with Milner and Goodale's [5] argument for different spatial representations for perception and action.

The findings from patients with cortical lesions to visual areas support the idea that vision-for-action *can* proceed independently of vision-for-perception, though the possibility remains that the effects observed in patients do not represent the normal function of the preserved cortical areas. In the following section we examine converging evidence from non-patient participants for the idea that vision-for-action can access information that perception does not.

## **3. Action can proceed without perception: Evidence from cortically-intact participants**

In this section we provide further evidence for action processing without visual awareness. We focus on studies in non-patients, which show that even when all areas of visual cortex are intact, visual information that drives action can elude conscious detection. This suggests that action's access to unperceived visual information is part of normal visual processing. We examine evidence from three different paradigms: backward masking, saccadic suppression of target displacement, and motor adaptation. In each case, motor responses to events are not only possible, but do not appear to suffer as a result of suppressed visual awareness.

#### **3.1. Evidence from backward masking**


One of the ways that non-conscious visual processing has been investigated is by masking a response-relevant stimulus and observing its impact on behaviour. When a stimulus is successfully masked, the participant does not report awareness of it. In metacontrast masking, for instance, the stimulus to be masked (the prime) is presented and then, shortly after, a larger stimulus (the mask) is presented around the prime. This sequence of stimuli can eliminate participants' awareness of the prime while influencing motor responses [36].

Taylor and McCloskey [37], for example, used metacontrast masking and showed that the reaction times for a motor task were influenced by the unseen prime. When a light was briefly flashed and then, 50 ms later, four lights that closely surrounded the location of the first light were flashed, thereby producing a metacontrast mask, participants' reaction times were linked to the presentation of the initial stimulus, in spite of their having failed to consciously report its presence. Furthermore, Cressman et al. [38] have shown that movements that have already been initiated can be influenced by an unseen directional prime, such that participants adjust their movement online. In that study, a directional prime (left arrow, right arrow, or neutral stimulus) was presented at movement onset and then quickly masked with a larger arrow. Participants' movement endpoints were dictated by the mask, but the unseen directional primes triggered substantial trajectory deviations ahead of the explicit response to the mask.
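The prime-then-mask control of the response described above can be caricatured as a timeline. Everything in this sketch (the function name, the 50 ms SOA, the 150 ms visuomotor delay) is a hypothetical illustration, not a description of the cited experiments; it shows only the logic by which an unseen prime could drive the early movement before the mask takes over.

```python
def response_drive(t_ms, prime_dir, mask_dir, soa_ms=50, motor_delay_ms=150):
    """Direction (-1 left, +1 right, 0 none) driving the movement at time t.

    The prime appears at t=0 and the mask `soa_ms` later; each begins to
    influence motor output only after a fixed (illustrative) visuomotor delay.
    """
    if t_ms < motor_delay_ms:
        return 0                  # nothing has reached the motor system yet
    if t_ms < soa_ms + motor_delay_ms:
        return prime_dir          # prime-driven deviation, without awareness
    return mask_dir               # mask wins; the endpoint follows the mask

# Incongruent trial: unseen left prime, visible right mask.
drives = [response_drive(t, prime_dir=-1, mask_dir=+1) for t in (100, 175, 300)]
print(drives)  # [0, -1, 1]
```

On this toy account, an incongruent prime produces an early trajectory deviation (-1) that is later corrected toward the mask (+1), mirroring the pattern Cressman et al. report.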

These results suggest that the motor system can respond to visual information that is inaccessible to conscious awareness. However, this does not necessarily imply that the prime information is being processed by the dorsal stream. In fact, when the prime is a symbol that must be translated into a directional response (e.g. [36,38]), it is likely that ventral stream processing is involved. The perceptual representation of the shape may fail to reach awareness, but it is a representation that has the *potential* to be perceived, which, as Milner and Goodale [5] argue, should still be classified as ventrally-mediated.

Visual Processing in the Action-Oriented Brain 149

example, if one were to view the world through displacing prisms that shifted the visual world to the right, one's movements would initially err to the right of a reach target. Visual feedback would allow correction of this error over the course of multiple movements. Subsequent removal of the prisms would then produce motor errors in a leftward direction ('aftereffects') as a result of the newly acquired mapping between vision and motor output.

The interesting effect for the purposes of the current discussion is that people can acquire new visuomotor mappings without any awareness that their visual environment has changed. In fact, learning appears to be more robust if people do not know that the environment has been altered. Michel et al. [44], for instance, showed that gradually incrementing the amount of prism shift, such that participants were unaware of it, led to stronger aftereffects than the introduction of a large, consciously detectable prism shift.

People can also adapt to systematic, imperceptible changes in a target's location between the start and end of their movements. This adaptation can occur when the movement error is presented at the end of a reaching movement [45], but it can also occur when the target is displaced during the reaching movement, allowing for online corrections that eliminate any visual error at the end of the reach [46, cf. 47]. Furthermore, if participants are made aware

The adaptation effect for reaching movements to displaced targets is similar to the adaptation that occurs for eye movements. Saccadic adaptation is a well-documented phenomenon in which the size of saccades gradually increases (or decreases) when people are repeatedly exposed to forward (or backward) displacements of the target [49]. This effect, like the one for reaching movements, is thought to draw upon the natural calibration of our movements that occurs throughout our everyday lives, a process that typically occurs

Conscious perception of changes in the visual environment is required for neither real-time control nor motor learning. In fact, motor learning may even be enhanced if one is unaware that a change has occurred. These findings do not necessarily imply that vision for perception and vision for action rely on separate cortical streams, but they do show that what action sees is not necessarily what perception sees. This is an important point, for it suggests that the principles governing vision for perception may differ from those governing vision for action. By measuring motor responses, not just perceptual reports, we

So far, in discussing topics such as the PAM, blindsight, and masking studies in healthy participants, we have devoted much attention to the phenomenon of acting without consciously seeing. In this section we turn our attention to perception, and examine some of

can tap into a wealth of visual processing that we might otherwise miss.

**4. Action influences visual attention and perception** 

the ways that the intention to act changes what we see.

of the target displacement, the amount of adaptation is considerably diminished [48].

without any awareness of the error in our movements.

**3.4. Section summary** 

However, when the masked stimulus is, itself, the target of the action, direct involvement of the dorsal stream is more likely. In a study by Binsted et al. [22], participants were tasked with making aiming movements directly to a masked target, the size of which was manipulated across trials. The study showed that movement times were scaled to the size of the target (shorter times for larger targets, longer ones for smaller targets), in accordance with Fitts' Law. Thus, even though participants did not consciously perceive any changes in the size of the target, their motor responses were appropriately tuned to it. This study showed that healthy participants could experience a blindsight-like ability to scale their visuomotor response to something they could not consciously see. Because the visual information that action is drawing upon in this instance is presumably the same information that it would be using in the absence of the mask, we can infer that visual processing for immediate action control is not normally mediated by conscious vision. Action may have access to sub-threshold conscious vision or it may draw upon different visual information altogether, as suggested by the PAM. Thus, either as a matter of *degree* of visual input to which they are sensitive or *kind* of visual input upon which they rely, vision-for-action and vision-for-perception clearly differ.

## **3.2. Evidence from reaches to saccadically-suppressed target displacements**

We consider next a very robust dissociation between perception and control that occurs when people make simultaneous eye and hand movements. We will take up the perceptual effects of saccadic eye movements in more detail in a later section. For the current section, one need only know that when a target is displaced during a saccadic eye movement, the displacement is largely invisible. Surprisingly, people fail to notice a change in the target's location even when the displacement is as large as one third of the saccade magnitude [39].

Bridgeman et al. [40] showed that when participants pointed to a target that had been displaced during a saccade, they could accurately acquire it, even though they were unaware of the change in location. Goodale et al. [41] and Pelisson et al. [42] demonstrated that online responses of the motor system were also sensitive to saccadically-displaced targets. They showed that even when participants had initiated a reach towards a target's pre-saccadic location, the reach smoothly updated itself to acquire the displaced post-saccadic target location. This adjustment to the reach occurred in spite of participants having no vision of their hand and no awareness of the target displacement. This effect was also shown for targets that were displaced tangentially to the primary axis of the movement [43]. In sum, awareness of a target displacement is not needed for motor adjustments to the displacement.

## **3.3. Evidence from motor learning in response to unperceived visual changes**

When people encounter an altered visual environment, they adapt their movements over the course of exposure to it, such that initially inaccurate movements gradually improve. For example, if one were to view the world through displacing prisms that shifted the visual world to the right, one's movements would initially err to the right of a reach target. Visual feedback would allow correction of this error over the course of multiple movements. Subsequent removal of the prisms would then produce motor errors in a leftward direction ('aftereffects') as a result of the newly acquired mapping between vision and motor output.

The interesting effect for the purposes of the current discussion is that people can acquire new visuomotor mappings without any awareness that their visual environment has changed. In fact, learning appears to be more robust if people do not know that the environment has been altered. Michel et al. [44], for instance, showed that gradually incrementing the amount of prism shift, such that participants were unaware of it, led to stronger aftereffects than the introduction of a large, consciously detectable prism shift.
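
The advantage of a gradual perturbation can be illustrated with a toy error-corrective model: on each trial, the visible reach error drives a proportional update of the visuomotor mapping. Everything below (the update rule, learning rate, and trial schedules) is an illustrative assumption, not the procedure or model of Michel et al. [44].

```python
def adapt(perturbations, learning_rate=0.2):
    """Toy error-corrective adaptation: the internal correction is
    nudged toward the current perturbation by a fraction of the
    visible endpoint error on each trial."""
    correction = 0.0
    errors = []
    for shift in perturbations:
        error = shift - correction           # endpoint error the subject sees
        errors.append(error)
        correction += learning_rate * error  # update the visuomotor mapping
    return correction, errors

# Gradual schedule: the shift ramps up in 1-degree steps, then holds.
gradual = [float(i + 1) for i in range(10)] + [10.0] * 40
# Abrupt schedule: the full 10-degree shift is present from trial one.
abrupt = [10.0] * 50

c_gradual, e_gradual = adapt(gradual)
c_abrupt, e_abrupt = adapt(abrupt)
# Both schedules end with a comparable correction (the aftereffect),
# but the per-trial error under the gradual schedule never approaches
# the large, easily detected error on the first abrupt trial.
```

In this caricature, the gradual schedule keeps every visible error well below the abrupt schedule's initial 10-degree error, which gives one intuition for how incremental shifts can escape awareness while still producing strong aftereffects.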

People can also adapt to systematic, imperceptible changes in a target's location between the start and end of their movements. This adaptation can occur when the movement error is presented at the end of a reaching movement [45], but it can also occur when the target is displaced during the reaching movement, allowing for online corrections that eliminate any visual error at the end of the reach [46, cf. 47]. Furthermore, if participants are made aware of the target displacement, the amount of adaptation is considerably diminished [48].

The adaptation effect for reaching movements to displaced targets is similar to the adaptation that occurs for eye movements. Saccadic adaptation is a well-documented phenomenon in which the size of saccades gradually increases (or decreases) when people are repeatedly exposed to forward (or backward) displacements of the target [49]. This effect, like the one for reaching movements, is thought to draw upon the natural calibration of our movements that occurs throughout our everyday lives, a process that typically occurs without any awareness of the error in our movements.

## **3.4. Section summary**

148 Visual Cortex – Current Status and Perspectives


Conscious perception of changes in the visual environment is required for neither real-time control nor motor learning. In fact, motor learning may even be enhanced if one is unaware that a change has occurred. These findings do not necessarily imply that vision for perception and vision for action rely on separate cortical streams, but they do show that what action sees is not necessarily what perception sees. This is an important point, for it suggests that the principles governing vision for perception may differ from those governing vision for action. By measuring motor responses, not just perceptual reports, we can tap into a wealth of visual processing that we might otherwise miss.

## **4. Action influences visual attention and perception**

So far, in discussing topics such as the PAM, blindsight, and masking studies in healthy participants, we have devoted much attention to the phenomenon of acting without consciously seeing. In this section we turn our attention to perception, and examine some of the ways that the intention to act changes what we see.

## **4.1. Saccades in action**

Perhaps the most obvious example of the link between action and perception is eye movements. To pick up detailed information about the world around us, we constantly reorient our gaze via movements of our eyes and head. Saccades, which are fast and largely ballistic, are the most common type of eye movement, and much of our internal representation of the visual world is constructed from the detailed snapshots they provide. While it is probably not surprising to many of us that saccades are constantly being used to shift our gaze and thus inform perception, saccades also influence perception in other more subtle ways.

Visual Processing in the Action-Oriented Brain 151

One perceptually subtle (but experimentally dramatic) effect of saccades is their ability to mask large changes in the visual scene. As previously mentioned, saccade targets can be displaced by distances as large as a third of the saccade magnitude without the participant reporting any change [39]. Entire objects can be rotated or even deleted from a scene during a saccade, and participants will fail to notice the change [50]. In short, saccadic eye movements introduce periods of change blindness. This effect is thought to be partly due to our visual system's built-in assumption that the world is stable and that trans-saccadic changes in object locations are more likely to result from eye movement errors than they are to result from actual changes in the scene [51]. Ironically, then, the perceptual effect of saccadic suppression is a no-percept effect; suppression serves to keep the visual world stable and our conscious perception of it unperturbed. This demonstrates not only that oculomotor plans influence perception, but also that our action-driven visual system is carefully tuned to compensate for perturbations that are caused by internally generated movement.

## **4.2. Action goals dictate where our eyes go**

When we reach to, pick up, and use objects to accomplish goals, our eyes precede our manual actions, orienting to the relevant parts of relevant objects. Land et al. [52] tracked people's eye movements as they carried out the actions of brewing a cup of tea in a kitchen, and the researchers observed that people's eye movements were tied to the behavioural goals; the eyes did not jump from one visually salient object to another but, rather, moved deliberately from one task-relevant object to the next. Detailed analysis of eye-hand coordination during object grasping and manipulation tells the same story: the eyes are drawn to contact points between the hand and the object and between the manipulated object and other objects [53,54].

Furthermore, the coupling between the eyes and the hand appears to be quite strong, and will resist conscious attempts to break it. For instance, when people are told to look and point to a target and then move their eyes to a new saccade target that appears while the hand is in flight, they fail to complete the saccade task. The eyes remain locked on the target of the reaching movement until the hand has landed [55]. Thus, the eyes strategically move to pick up relevant information for goal-directed action, and they are tightly bound to this task. The coupling between eye and hand is a perfect example of action's role in dictating where and when we acquire visual information from our environment.

As much as the eyes may want to lead the hand, it is possible to override the coupling by fixating the eyes in one location prior to initiating a reach to a peripheral target. The task requires some effort on the part of the performer, but it can be done (and is, in fact, well employed in laboratory settings when tight control over visual input is desired). You may have noticed, for instance, that you can reach for a cup of coffee while keeping your eyes on the book or screen before you, though at some cost to movement accuracy. As the next sections will show, however, even when the eyes remain locked in place during a manual task, visual attention does not; it is bound to action goals.

## **4.3. On the relationship between action and attention**

Attention is vital to our experience of the visual world. Most of us have probably experienced the frustrating search through a crowded restaurant in which we only see our dinner companions after having already walked past them once or twice, or the search that happens at the open fridge door, where the item we want, and cannot find, has been in front of us all along. Controlled experiments have shown that people will reliably fail to see large objects that disappear and reappear in blinking scenes [56] or even fail to see a person in a gorilla suit walk through the middle of a scene [57]. Attention is the construct used to explain these effects. The idea is that there is far more information in the visual field than our brain can or wants to cope with at any one time. The brain, therefore, relies on attention to select a portion of visual information for analysis. And, as a result, if we do not attend to something, we are blind to it.

That attention is important for conscious perception is clear. When we consider, however, that the purpose of human information processing is not just perception but also action, it is also clear that attention systems should not be examined independently of action systems. One of the first to raise this point was Allport [58], who noted that the important constraint upon visual analysis of a scene may not be central processing limitations, but the need for action coherence. Allport's [58] argument was that motor systems need to be tied to one object at a time; if visual information about multiple objects is permitted access to these systems simultaneously, the action will fail. The hand, for example, cannot successfully grasp a cup if the information guiding the reach is also coming from the apple, the bottle, and pencil sitting next to the cup.

The importance of action to the allocation of attention has also been stressed by Rizzolatti et al. [59], who proposed a premotor theory of attention, in which eye-movement motor programs drive the spatial allocation of visual attention. Tipper et al. [60] have likewise emphasized the role of action in attention, proposing that attention operates within an action-centered representation of visual space. Schneider [61], meanwhile, has proposed the Visual Attention Model (VAM), a framework in which a central attention mechanism binds perceptual and action systems to the same object. Each of these perspectives on the relationship between attention and action will be examined next.

## **4.4. Premotor theory**

The premotor theory of attention has probably been the most influential of the action-based theories of attention. As initially proposed [59], premotor theory attributed the control of attention to oculomotor programming; even when the eyes remained still while attention was shifted (covert orienting), the attention shift was purportedly due to the programming of an eye movement that was subsequently inhibited. Premotor theory was later modified to allow for goal-directed motor programming of any kind (e.g. reaching) to produce attention shifts [62], but the basic premise remained the same. The mechanism underlying this process was, according to Rizzolatti et al. [62], the activation of neurons in spatial pragmatic maps.

These pragmatic maps are proposed to reside in brain areas associated with action (e.g., parietal reach areas; parietal, frontal, or sub-cortical eye movement areas), and they code space only insofar as it is relevant to the action that they are involved in programming. Thus, according to premotor theory, there is no higher-level attention system. Rather, attention shifts simply result from the selective activation of pragmatic map neurons, and this activation only occurs when a movement is programmed to that region of space.

Some of the strongest support for premotor theory can be found in neurophysiological studies. Moore and Fallah [63], for instance, showed a causal link between activation of eye movement cortex and the allocation of attention. Moore and Fallah stimulated monkeys' frontal eye field (FEF), a cortical area involved in the control of voluntary eye movements. They began by stimulating a part of the FEF with enough current to trigger an eye movement. They then reduced the stimulation to a sub-threshold level (i.e., the stimulation was too low to trigger an eye movement). They found that this sub-threshold stimulation improved the monkey's ability to detect a change in the target stimulus when the stimulus fell within the region of the visual field corresponding to the destination of the eye movement that had previously been triggered by supra-threshold stimulation of the FEF. In a similar study investigating the attentional role of the superior colliculus (SC), a subcortical area directly involved in the control of eye movements, Muller, Philiastides, and Newsome [64] found that sub-threshold stimulation of the SC also produced enhanced detection of the target stimulus. Both of these studies demonstrate covert orienting of attention resulting from activation of oculomotor areas of the brain, consistent with premotor theory.

## **4.5. Action-centered attention**

An important step in understanding attention is determining the nature of the spatial representation upon which it operates. Premotor theory suggests that spatial pragmatic maps underlie the allocation of spatial attention (though it also states that attention emerges from these maps, rather than operating upon them). A related view, advanced by Tipper et al. [60], suggests that attention operates upon an action-centered representation. To get a better sense of what such a representation might be, we will first consider other kinds of spatial representation.

Tipper et al. [60] outline 4 possible kinds of spatial representation upon which attention might operate: a 2-D retina-centered representation, a 3-D viewer-centered representation, an environment-centered representation, and an action-centered representation. A 2-D retina-centered representation is one in which the spatial relationships between objects are defined in terms of the objects' relative positions in the 2-D retinal image. Thus, when it comes to attention, a distractor on the far side of a target (with respect to the viewer) would produce more interference than a distractor on the near side of the target, according to Tipper et al., because in the 2-D image the far object is closer to the target than the near object is. A 3-D viewer-centered representation, on the other hand, is one in which the distance of objects from the viewer is a relevant factor. If attention operates within this kind of representation, distractors on the near side of a target would potentially produce greater interference than distractors on the far side of the target. This type of representation differs from an environment-centered representation in that the orientation of the viewer with respect to the objects affects their salience. In the environment-centered representation, viewer orientation is irrelevant. Finally, the action-centered representation is one in which an object's potential for interference depends upon its relationship to a planned action path. Thus, a distractor that resides within the action path will potentially produce greater interference than one that resides beyond the path.

Tipper et al. [60] provided evidence that, during a reaching task, attention operates within an action-centered representation. They had participants reach and press target buttons that were arranged in a 3 x 3 array in the horizontal (transverse) plane. Below each button were a red and a yellow light. Illumination of the red light indicated that the corresponding button was the target; the yellow light was irrelevant to the task, but it would sometimes be illuminated simultaneously at a different location, serving as a distractor. Tipper et al. examined the cost to the total time (TT) of the reaching movement produced by the distractor, and found that TT suffered more (i.e., there was greater interference) when the distractor fell within the same row as the target or in a row between the hand start position and the target row. Furthermore, when the hand start position was moved to the opposite end of the board (i.e., to the far end of the board), the same pattern of results was found, ruling out a 3-D viewer-centered representation. Tipper et al., in discussing the mechanism underlying the action-centered interference, suggest that motor programs are activated, simultaneously, to both the target and the distractor.

A later experiment by Meegan and Tipper [65] investigated whether the pattern of interference observed by Tipper et al. [60] was due to the distractor's relationship to the response path, as Tipper et al. [60] had suggested, or to the distractor's proximity to the start position of the hand. Meegan and Tipper [65] found that proximity to the hand was a better predictor of distractor interference. This finding does not necessarily undermine the actioncentered model; Meegan and Tipper [65], for instance, suggest that objects nearer to the hand might produce greater response competition than objects farther from the hand, a framework consistent with the parallel response activation proposed by Tipper et al. [60]. However, it is also possible that, because information about the location of the hand is important during action preparation [66], attention may initially be oriented to it, leading to greater interference from objects in its vicinity.

#### **4.6. The Visual Attention Model and action-perception coupling**

The Visual Attention Model (VAM), like premotor theory, posits that motor preparation and perceptual selection are coupled [61,67]. However, VAM differs from premotor theory in two major ways. For one, VAM suggests that the coupling between selection-for-action and selection-for-perception is bi-directional. In other words, selecting an object based on perceptual attributes (e.g. colour) also binds action systems to that object, and selecting an object for action (e.g. preparing to grasp an apple) also binds perceptual systems to that object. (Premotor theory only allows for action preparation to bind perceptual attention to an object.) The other way that VAM differs from premotor theory is that VAM posits an independent, higher-level, attention mechanism that binds action and perceptual processes. (Premotor theory argues against an independent attention system.) Much of the research that has been conducted within the VAM framework does not directly test VAM's predictions against those of premotor theory. As a result, the research presented in this section – research that demonstrates the coupling between action and perceptual selection – can be taken as support for either VAM or premotor theory.

Visual Processing in the Action-Oriented Brain 155


Deubel and Schneider [68] provide strong evidence of the coupling between oculomotor preparation and visual attention. In one experiment participants were instructed to make a saccade to a peripheral target based on a central cue (a number specifying the location of the target). After cue presentation, but prior to saccade initiation, a discrimination target (DT) (which was a normal 'E' or a reverse 'E') appeared either at the same location as the saccade target (ST) or at a different location. The DT was present very briefly, and was masked prior to the onset of the saccade. Participants' discrimination performance was the dependent measure, and Deubel and Schneider [68] used this measure to infer the locus of attention. They found that participants' performance was considerably enhanced when the DT was at the ST position. Performance dropped off considerably when the DT was at a different position than the ST, even if by only 1 or 2 degrees of visual angle. Because all perceptual discrimination occurred prior to any movement of the eyes, these results provide evidence of covert orienting resulting from oculomotor preparation. In another experiment, Deubel and Schneider [68] showed the same effect when the ST was specified exogenously. Furthermore, in order to control for the possibility that covert orienting might be occurring independently of saccade preparation rather than being driven by it, Deubel and Schneider [68] conducted an experiment in which participants were told beforehand the upcoming location of the DT. Participants could then try to attend to the DT location while programming a saccade to a different location. Again, however, discrimination performance was best at the ST location, suggesting strong coupling between oculomotor programming and perceptual selection.
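The inferential logic of this paradigm, namely reading the locus of covert attention off discrimination accuracy at different distances from the saccade target, can be illustrated with a toy simulation. Everything quantitative here (the baseline and peak accuracies, the Gaussian fall-off, the 1 degree width) is an illustrative assumption, not Deubel and Schneider's fitted data:

```python
import math
import random

def discrimination_trial(st_pos_deg, dt_pos_deg, rng,
                         base_acc=0.55, peak_acc=0.95, width_deg=1.0):
    """One simulated trial of a Deubel-and-Schneider-style dual task.

    Toy assumption: the probability of correctly discriminating the DT
    falls off as a Gaussian of the distance between the DT and the
    saccade target (ST), because saccade preparation is presumed to
    lock attention onto the ST."""
    d = abs(st_pos_deg - dt_pos_deg)
    p_correct = base_acc + (peak_acc - base_acc) * math.exp(-(d / width_deg) ** 2)
    return rng.random() < p_correct

def accuracy(st, dt, n_trials=20000, seed=1):
    """Proportion correct over many simulated trials."""
    rng = random.Random(seed)
    return sum(discrimination_trial(st, dt, rng) for _ in range(n_trials)) / n_trials

# Discrimination is best when the DT coincides with the ST and drops
# steeply even a couple of degrees away -- the signature used to infer
# the locus of covert attention.
acc_at_st = accuracy(st=5.0, dt=5.0)
acc_2deg_away = accuracy(st=5.0, dt=7.0)
```

Under this sketch, simulated accuracy near the saccade goal far exceeds accuracy two degrees away, reproducing the qualitative pattern described above.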

Deubel, Schneider, and Paprotta [69] extended these findings to show that reaching movements have the same impact as oculomotor programming on perceptual selection. The experiment's design was similar to that of Deubel and Schneider [68], but with manual aiming movements instead of saccades. A central cue indicated which peripheral object was the aiming target (AT), and the participant's task was to rapidly aim his/her finger to it while maintaining central fixation. The DT was always presented in the same location, so participants could attempt to attend to the DT while reaching to the AT. Despite participants' foreknowledge of the DT location, discrimination performance was best when the AT coincided with the DT, suggesting obligatory coupling between reach programming and perceptual selection. A later study by Deubel and Schneider [70], however, showed that the coupling between reaching and perceptual selection persists only for a short period of time. If movement onset was delayed by more than 300ms after the imperative stimulus, attention could be decoupled from the action target and oriented elsewhere. Eye movements, on the other hand, always bound attention to the saccade target, regardless of delay.

Baldauf, Wolf, and Deubel [71] replicated Deubel et al.'s [69] finding that manual preparation orients attention to the aiming target and extended it to show that preparing a multiple component movement can orient attention to multiple targets simultaneously. Participants executed rapid sequential aiming movements to 2 or 3 targets within a circular array of 12 stimuli. Identification of the transiently displayed DT was enhanced when its location coincided with the location of any one of the targets of the movement. Identification of the DT was poor at other locations, even a location falling directly between two target locations. This suggests that action preparation can drive multiple attention 'spotlights' in parallel.

A further example of the link between action intention and visual attention was provided by Bekkering and Neggers [72], who showed that visual target selection was influenced by whether the participant intended to grasp an object or point to it. When participants planned to *grasp* an object within a field of distractor objects, their initial eye movements, which were used as a marker of attention capture, were drawn less often to distractors of the wrong orientation than when participants intended to *point* to the target. That is, the intention to grasp may have allowed a pre-filtering of object orientation (a grasp-relevant, but not pointing-relevant, feature), thereby reducing the effect of the distractors on the initial eye movement.

### **4.7. Section summary**

The studies discussed in this section have provided behavioural evidence that both eye and hand movement preparation produce covert orienting of attention. Furthermore, this binding of action and perception appears to be obligatory; even when participants attempt to orient elsewhere, motor preparation carries attention to the action target. So, although high-level decisions about how to interact with the world rely on perceptual representations, once the decision to act has been made, visual perception becomes yoked to action.

## **5. Conclusion**

We set out to show that a great deal of our daily visual processing is intimately linked with the motor system. Much of that processing, in fact, proceeds without our being aware of it, and it automatically drives our actions. We began the chapter by describing some of the cortical areas that have been shown to provide direct links between incoming visual information and real-time motor output. Investigations of neurological conditions such as visual agnosia and blindsight (impaired visual awareness), optic ataxia (impaired control to peripheral targets), and hemispatial neglect (impaired attention and perceptual representation in one half of visual space) have furthered our understanding of visuomotor control, and many of the findings from these populations are consistent with the idea that visual processing in the PPC is action-related and inaccessible to conscious awareness. Behavioural studies in non-patient participants have also shown that vision-for-action can operate without any awareness on the part of the performer. For instance, masking studies reveal motor responses driven by unperceived stimuli; saccadic suppression studies show automatic responses to unperceived location changes; and motor learning studies show that awareness of a perturbation is not necessary for, and may even be detrimental to, visuomotor adaptation.

Having demonstrated that actions can sometimes access visual information that perception does not, we went on to examine ways in which our actions can also dictate what our perceptual system sees. We discussed the link between eye movements and the pick-up of visual information, and we provided evidence that many of our eye movements are directly driven by our plans for manual action. Moreover, visual attention for perception was shown to be bound to the saccade and/or the reach target. At the risk of overstating our case, we propose that one think of action as a tour guide to the gallery of the visual world; it dictates what the perceptual visitors get to see, and it has access to locked rooms that perception never enters.

## **Author details**

Brendan D. Cameron and Gordon Binsted

*School of Health and Exercise Sciences, University of British Columbia Okanagan, Canada* 

## **6. References**

[1] Norman J (2002) Two visual systems and two theories of perception: An attempt to reconcile the constructivist and ecological approaches. Behavioural and Brain Sciences 25: 73-144.

[2] Gibson JJ (1986) The Ecological Approach to Visual Perception. New Jersey: Lawrence Erlbaum Associates. 333p.

[3] Mishkin M, Ungerleider LG, Macko KA (1983) Object vision and spatial vision: Two cortical pathways. Trends in Neurosciences 6: 414-417.

[4] Goodale MA, Milner AD (1992) Separate visual pathways for perception and action. Trends in Neurosciences 15: 20-25.

[5] Milner AD, Goodale MA (2006) The Visual Brain in Action, 2nd edition. New York: Oxford University Press. 297p.

[6] Glover S (2004) Separate visual representations in the planning and control of action. Behavioral and Brain Sciences 27: 3-24.

[7] Goodale MA, Milner AD, Jakobson LS, Carey DP (1991) A neurological dissociation between perceiving objects and grasping them. Nature 349: 154-156.

[8] James TW, Culham J, Humphrey GK, Milner AD, Goodale MA (2003) Ventral occipital lesions impair object recognition but not object-directed grasping: an fMRI study. Brain 126: 2463-2475.

[9] Aglioti S, DeSouza JFX, Goodale MA (1995) Size-contrast illusions deceive the eye but not the hand. Current Biology 5: 679-685.

[10] Haffenden AM, Goodale MA (1998) The effect of pictorial illusion on prehension and perception. Journal of Cognitive Neuroscience 10: 122-136.

[11] Westwood DA, Goodale MA (2011) Converging evidence for diverging pathways: Neuropsychology and psychophysics tell the same story. Vision Research 51: 804-811.

[12] Schenk T, Franz V, Bruno N (2011) Vision-for-perception and vision-for-action: which model is compatible with the available psychophysical and neuropsychological data? Vision Research 51: 812-818.

[13] Glover S, Dixon P (2002) Dynamic effects of the Ebbinghaus illusion in grasping: Support for a planning/control model of action.

[14] Rizzolatti G, Matelli M (2003) Two different streams form the dorsal visual system: anatomy and function. Experimental Brain Research 153: 146-157.

[15] Pisella L, Binkofski F, Lasek K, Toni I, Rossetti Y (2006) No double-dissociation between optic ataxia and visual agnosia: multiple sub-streams for multiple visuo-manual integrations. Neuropsychologia 44: 2734-2748.

[16] Weiskrantz L, Warrington EK, Sanders MD, Marshall J (1974) Visual capacity in the hemianopic field following a restricted occipital ablation. Brain 97: 709-728.

[17] Weiskrantz L (1996) Blindsight revisited. Current Opinion in Neurobiology 6: 215-220.

[18] Cowey A (2010) The blindsight saga. Experimental Brain Research 200: 3-24.

[19] Poppel E, Held R, Frost D (1973) Residual visual function after brain wounds involving the central visual pathways in man. Nature 243: 295-296.

[20] Perenin MT, Rossetti Y (1996) Grasping without form discrimination in a hemianopic field. Neuroreport 7: 793-797.

[21] Whitwell RL, Striemer CL, Nicolle DA, Goodale MA (2011) Grasping the non-conscious: Preserved grip scaling to unseen objects for immediate but not delayed grasping following a unilateral lesion to primary visual cortex. Vision Research 51: 908-924.

[22] Binsted G, Brownell K, Vorontsova Z, Heath M, Saucier D (2007) Visuomotor system uses target features unavailable to conscious awareness. Proceedings of the National Academy of Sciences 104: 12669-12672.

[23] Perenin MT, Vighetto A (1988) Optic ataxia: a specific disruption in visuomotor mechanisms. I. Different aspects of the deficit in reaching for objects. Brain 111: 643-674.

[24] Pisella L, Grea H, Tilikete C, Vighetto A, Desmurget M, Rode G, Boisson D, Rossetti Y (2000) An 'automatic pilot' for the hand in human posterior parietal cortex: toward reinterpreting optic ataxia. Nature Neuroscience 3: 729-736.

[25] Rossetti Y, Pisella L, Vighetto A (2003) Optic ataxia revisited: visually guided action versus immediate visuomotor control. Experimental Brain Research 153: 171-179.

[26] Milner AD, Goodale MA (2008) Two visual systems re-viewed. Neuropsychologia 46: 774-785.

[27] Pisella L, Sergio L, Blangero A, Torchin H, Vighetto A, Rossetti Y (2009) Optic ataxia and the function of the dorsal stream: Contributions to perception and action. Neuropsychologia 47: 3033-3044.

[28] McIntosh RD, Mulroue A, Blangero A, Pisella L, Rossetti Y (2011) Correlated deficits of perception and action in optic ataxia. Neuropsychologia 49: 131-137.

[29] Harvey M, Rossit S (2011) Visuospatial neglect in action. http://dx.doi.org/10.1016/j.neuropsychologia.2011.09.030

[30] Himmelbach M, Karnath HO (2003) Goal-directed hand movements are not affected by the biased space representation in spatial neglect. Journal of Cognitive Neuroscience 15: 972-980.

[31] Himmelbach M, Karnath HO, Perenin MT (2007) Action control is not affected by spatial neglect: A comment on Coulthard et al. Neuropsychologia 45: 1979-1981.

[32] Coulthard E, Parton A, Husain M (2006) Action control in visual neglect. Neuropsychologia 44: 2717-2733.

[33] Rossit S, McIntosh RD, Malhotra P, Butler SH, Muir K, Harvey M (2011) Attention in action: Evidence from on-line corrections in left visual neglect. http://dx.doi.org/10.1016/j.neuropsychologia.2011.10.004

[34] Rossit S, Fraser JA, Teasell R, Malhotra PA, Goodale MA (2011) Impaired delayed but preserved immediate grasping in a neglect patient with parieto-occipital lesions. Neuropsychologia 49: 2498-2504.

[35] Milner AD, Dijkerman HC, Pisella L, McIntosh RD, Tilikete C, Vighetto A, Rossetti Y (2001) Grasping the past: delay can improve visuomotor performance. Current Biology 23: 1896-1901.

[36] Klotz W, Neumann O (1999) Motor activation without conscious discrimination in metacontrast masking. Journal of Experimental Psychology: Human Perception and Performance 25: 976-992.

[37] Taylor JL, McCloskey DI (1990) Triggering of preprogrammed movements as reactions to masked stimuli. Journal of Neurophysiology 63: 439-446.

[38] Cressman EK, Franks IM, Enns JT, Chua R (2007) On-line control of pointing is modified by unseen visual shapes. Consciousness and Cognition 16: 265-275.

[39] Bridgeman B, Hendry D, Stark L (1975) Failure to detect displacement of the visual world due to saccadic eye movements. Vision Research 15: 719-722.

[40] Bridgeman B, Lewis S, Heit G, Nagle M (1979) Relation between cognitive and motor-oriented systems of visual position perception. Journal of Experimental Psychology: Human Perception and Performance 5: 692-700.

[41] Goodale MA, Pelisson D, Prablanc C (1986) Large adjustments in visually guided reaching do not depend on vision of the hand or perception of target displacement. Nature 320: 748-750.

[42] Pelisson D, Prablanc C, Goodale MA, Jeannerod M (1986) Visual control of reaching movements without vision of the limb. Experimental Brain Research 62: 303-311.

[43] Prablanc C, Martin O (1992) Automatic control during hand reaching at undetected two-dimensional target displacements. Journal of Neurophysiology 67: 455-469.

[44] Michel C, Pisella L, Prablanc C, Rode G, Rossetti Y (2007) Enhancing visuomotor adaptation by reducing error signals: single-step (aware) versus multiple-step (unaware) exposure to wedge prisms. Journal of Cognitive Neuroscience 19: 341-350.

[45] Magescas F, Prablanc C (2006) Automatic drive of limb motor plasticity. Journal of Cognitive Neuroscience 18: 75-83.

[46] Cameron BD, Franks IM, Inglis JT, Chua R (2011) Reach adaptation to online target error. Experimental Brain Research 209: 171-180.

[47] Magescas F, Urquizar C, Prablanc C (2009) Two modes of error processing in reaching. Experimental Brain Research 193: 337-350.

[48] Cameron BD, Franks IM, Inglis JT, Chua R (2010) Reach adaptation to explicit vs. implicit target error. Experimental Brain Research 203: 367-380.

[49] Hopp JJ, Fuchs AF (2004) The characteristics and neuronal substrate of saccadic eye movement plasticity. Progress in Neurobiology 72: 27-53.

[50] Henderson JM, Hollingworth A (1999) The role of fixation position in detecting scene changes across saccades. Psychological Science 10: 438-443.

[51] Deubel H, Schneider WX, Bridgeman B (1996) Postsaccadic target blanking prevents saccadic suppression of image displacement. Vision Research 36: 985-996.

[52] Land M, Mennie N, Rusted J (1999) The roles of vision and eye movements in the control of activities of daily living. Perception 28: 1311-1328.

[53] Johansson RS, Westling G, Backstrom A, Flanagan JR (2001) Eye-hand coordination in object manipulation. The Journal of Neuroscience 21: 6917-6932.

[54] Brouwer AM, Franz VH, Gegenfurtner KR (2009) Differences in fixations between grasping and viewing objects. Journal of Vision 9(1): 18.

[55] Neggers SFW, Bekkering H (2001) Gaze anchoring to a pointing target is present during the entire pointing movement and is driven by a non-visual signal. Journal of Neurophysiology 86: 961-970.

[56] Rensink RA, O'Regan JK, Clark JJ (1997) To see or not to see: The need for attention to perceive changes in scenes. Psychological Science 8: 368-373.

[57] Simons DJ, Chabris CF (1999) Gorillas in our midst: sustained inattentional blindness for dynamic events. Perception 28: 1059-1074.

[58] Allport A (1987) Selection for action: some behavioural and neurophysiological considerations of attention and action. In: Heuer H, Sanders AF, editors. Perspectives on Perception and Action. Hillsdale, NJ: Lawrence Erlbaum Associates. pp. 395-420.

[59] Rizzolatti G, Riggio L, Dascola I, Umilta C (1987) Reorienting attention across the horizontal and vertical meridians: evidence in favor of a premotor theory of attention. Neuropsychologia 25: 31-40.

[60] Tipper SP, Lortie C, Baylis GC (1992) Selective reaching: evidence for action-centered attention. Journal of Experimental Psychology: Human Perception and Performance 18: 891-905.

[61] Schneider WX (1995) VAM: a neuro-cognitive model for visual attention control of segmentation, object recognition, and space-based motor action. Visual Cognition 2: 331-375.

**Chapter 7**

**Linking Neural Activity to Visual Perception: Separating Sensory and Attentional Contributions**

Jackson E.T. Smith, Nicolas Y. Masse, Chang'an A. Zhan and Erik P. Cook

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/47270

© 2012 Cook et al., licensee InTech. This is an open access chapter distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

**1. Introduction**

For each of the five basic senses, information about the external world begins as a physical representation in the brain. This representation exists in the structure of sensory neural activity, such as the flow of ions across neural membranes and the action potentials (or spikes) that neurons produce. At some point the brain achieves a transition – from tangible electrophysiology to something more. In other words, neural activity becomes a basic sensation that we are aware of and that we can name. For example, sensations like 'slow' or 'fast', 'far' or 'near', are some of the simplest features that we can assign to a visual stimulus and are some of the basic attributes that we can perceive.

But the transition from neural activity to perception is not simple and remains largely unknown. This process is not intractable, however, and enormous effort has been made by neuroscientists to solve it. In particular, much progress has been made to reveal how small fluctuations in cortical activity are correlated with perceptual behavior. We refer to this correlation as 'behavioral sensitivity'. New observations suggest that both bottom-up sensory mechanisms (such as neural noise) and top-down processes (such as attention) have a role to play in establishing behavioral sensitivity. How do we separate these two contributions?
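One common way to quantify this kind of behavioral sensitivity (not necessarily the measure the authors use) is a 'choice probability': the area under the ROC curve separating a neuron's firing-rate distributions on trials grouped by the subject's perceptual choice. A value of 0.5 means the neuron's trial-to-trial fluctuations carry no information about the upcoming choice. A minimal sketch, in which the function name and the input convention are illustrative assumptions:

```python
import numpy as np

def choice_probability(rates_choice1, rates_choice2):
    """ROC area separating firing rates on 'choice 1' trials from
    'choice 2' trials; 0.5 indicates no trial-by-trial link between
    this neuron's rate and the subject's choice (illustrative sketch)."""
    r1 = np.asarray(rates_choice1, dtype=float)
    r2 = np.asarray(rates_choice2, dtype=float)
    # Fraction of (choice-1, choice-2) trial pairs in which the
    # choice-1 rate is higher; tied pairs count one half.
    greater = (r1[:, None] > r2[None, :]).sum()
    ties = (r1[:, None] == r2[None, :]).sum()
    return (greater + 0.5 * ties) / (r1.size * r2.size)
```

Note that a choice probability far from 0.5 establishes covariation but not its origin: as the text emphasizes, it cannot by itself distinguish a bottom-up contribution (e.g., shared sensory noise) from a top-down one (e.g., attention).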

Figure 1 illustrates the problem of untangling the link between a visual cortical neuron's activity and a subject's perceptual behavior. In the simplest model (Figure 1A), a visual cortical neuron contributes in a bottom-up manner to downstream networks that underlie perceptual behavior. In this case, a neuron is behaviorally sensitive because its activity is directly linked to the perception of the visual stimulus. In the alternative extreme (Figure 1B), a visual cortical neuron has no direct influence on the perceptual decision, but is


