**1. Introduction**

In this review we outline the range of functional processes involved in language comprehension and their anatomical underpinnings, including recent data on neural connectivity specifically wired for language, using magnetic resonance imaging (MRI) as the main tool. A review of this type implies such a large number of potential references that, for the sake of concision, we have selected only the most outstanding and representative studies and reviews. Our interest in identifying possible cues to the evolutionary origins of language partially guided this selection; the review is ultimately intended as a contribution to a better understanding of human language.

To start with, a description of language and its components appears necessary. In this regard, we will follow the proposal by Ray Jackendoff (2002), who provides one of the most comprehensive and valuable current accounts from linguistics. Jackendoff proposes at least three structural layers in language, all of them working simultaneously in the processing of every utterance. These layers are a *phonological structure*, a *syntactic structure*, and a *semantic/conceptual structure*. Additionally, a number of processes -or subprocesses- coexist within each of these three structures, all of them again working simultaneously.

The phonological structure, which roughly refers to the "sounds" of language, is probably the most complex one, containing the largest number of subprocesses. The auditory-verbal nature of human language is probably not alien to this complexity. The phonological structure is actually subdivided into a *prosodic* structure -referring to the different intonations along the course of a general envelope covering an entire utterance- and more local processes referring to *syllabic*, *segmental*, and *morpho-phonological* structures. These latter three refer to what most people would call "phonology" as such, and roughly cover the sounds of single syllables, larger word segments, and complete words, respectively.

Syntax refers to the structure of a sentence; that is, the way in which the different words or morphemes constituting a sentence are organized -most often hierarchically-, determining their mutual relationships and dependencies. The hierarchical structure achieved by syntax establishes what the main information is and its relationships with other, secondary items of information; that is, the concrete state of affairs described in an utterance, in which the meanings of individual words and morphemes combine. This structure appears "desemantized", i.e., it can be entirely independent of the individual meanings of its constituents, as in the classical example by Chomsky: "*Colorless green ideas sleep furiously*".

The semantic/conceptual structure of a linguistic utterance is probably the most central one. Indeed, the main aim of processing any linguistic message, regardless of its syntactic structure and transmission modality, is the realization of this semantic structure. This basically consists of the "meaning" of a whole sentence, that is, what it specifically means, or the idea in the mind of the speaker that she wants to elicit in the mind of the hearer. Although this information largely relies on syntax and phonology, the semantic/conceptual structure is completely independent of them -the same idea can actually be transmitted using the two other structures in many ways-. Although single words or morphemes in isolation convey semantic/conceptual information, the combination of these individual meanings by means of syntax, which in turn is achieved by means of phonology, gives rise to a different, very specific meaning or semantic structure describing a concrete and detailed situation. It is not clear, however, to what extent the semantic/conceptual structure belongs to language as such, or whether it is a general process, common to other input options such as the non-linguistic interactions between the individual and her environment. In this regard, several authors still distinguish between semantic aspects specific to language and general semantic aspects common to any domain, and this distinction is particularly applicable at the level of the meaning of single words or morphemes. However, the distinction between semantics for language and general semantics appears difficult to maintain from the neural perspective, as we will see. Whatever the case, the semantic structure taps into reality -"*spatial* structure" in Jackendoff's terms-, i.e., the events in the real world a linguistic message refers to.

Semantics also applies to a layer not explicitly highlighted in Jackendoff's proposal but playing a significant role in language comprehension: the *discourse* level. This level refers to the situation in which two or more sentences are comprehended together, i.e., it is the semantic analysis beyond single sentences. Indeed, many of the phenomena involved at this level are even less language-specific than those at the other layers or structures. In a discourse, although the hearer is attempting to achieve full comprehension of a longer message, the final picture does not depend for the most part on what is actually heard or read but, rather, on inferences and logical relationships between the ideas transmitted linguistically. These relationships are extra information added by the hearer and based on her previous knowledge of the world. Although this might not be "language" as such, language would be useless if this level were not achieved.

All the processes described so far, i.e., the phonological, prosodic, syntactic, semantic, and discourse structures, may participate in sequential order -actually following this same order- or occur largely in parallel -mostly within the first 250 ms after stimulus onset (Pulvermüller et al., 2009b)-. Both of these opposing views remain in the literature. Whatever the case, the high degree of specialization and efficiency of the human brain for speech processing at all these levels is acknowledged by most authors.

The fact that language can be transmitted using modalities other than the auditory-verbal one, as in the sign languages of deaf people or, more frequently, in written form, also deserves some consideration. Consequently, a few lines in this review will be devoted to written language. Overall, most authors would agree that the linguistic machinery in the brain is largely common to any modality, with notable exceptions appearing only when specific peripheral mechanisms are engaged during the emission or decoding of a given message.

**2. The sounds of language**

Phonology has been less extensively studied using neuroimaging techniques than any other aspect of language. The view that phonology may not be as crucial as other aspects of language, such as semantics or, particularly, syntax, in distinguishing human language from non-human forms of communication (Hauser et al., 2002) has probably steered researchers' interest away from this structure. However, human language is primarily an auditory-verbal process, which, in turn, implies cerebral specializations at this level. Phonological aspects seem to be processed in specialized brain areas located within and around primary auditory ones (Brodmann Areas -BA- 41/42, Heschl's gyrus). In this regard, there is evidence of extensive regions within the superior temporal gyrus largely specialized for these functions. These regions are mostly bilateral, though some degree of left-lateralization also emerges. Accordingly, a very first step in the processing of phonological information seems to be localized very dorsally in the temporal lobe, in Heschl's gyrus, where phonology would already be distinguished from non-linguistic sounds (Price, 2000). Thereafter, an antero-lateral functional gradient starting in Heschl's gyrus and progressing toward the temporal pole seems involved in further integrating heard sounds, identifying and distinguishing concrete phonological sounds such as familiar vowels against single formants (Leff et al., 2009). Additional data complete this picture by adding more ventral -middle temporal gyrus- and posterior areas of the left temporal lobe as also involved in these functions (Specht et al., 2009).

An additional specialization for auditory language processing refers to whole words. This is known as "word-form" analysis, meaning that, rather than single phonemes or longer auditory segments, what is processed and identified at this level is the overall specific sound of an entire word; a holistic analysis. There seem to be specialized cortical regions for the integration of phonological sounds into these larger, unitary sound chains, corresponding to auditory association areas in the left hemisphere. A plausible candidate for this process is Wernicke's area, whose location next to primary auditory areas would favor such specialization. Wernicke's area is normally located in the posterior part of BA 22 within the superior temporal gyrus and sulcus (Wise et al., 2001). There are other alternatives for the location of Wernicke's area, however. Some extend the posterior part of BA 22 to also cover parts of BA 39 and 40 in the parietal lobe (Mesulam, 1998), whereas others locate Wernicke's area in the unimodal auditory association areas of the superior temporal gyrus just anterior to the primary auditory cortex (Démonet et al., 1992) -thus covering portions already mentioned here as participating in lower-level phonological analyses-. Indeed, irrespective of whether these more anterior regions can be considered as belonging to Wernicke's area or not, they have actually been claimed as the precise location of the "auditory word-form area" (Cohen et al., 2004). Interestingly, however, it has also been claimed that there are no such specific cortical sites devoted to auditory word-form processing (Price et al., 2003; these authors also argue against a "visual word-form area" -see below-).

In any event, the systemic nature of the brain becomes patent even at these very early stages of language comprehension. In other words, the perception of speech sounds would not be limited to the temporal auditory and surrounding cortical areas, but also significantly involves frontal cortical regions and subcortical nuclei normally engaged in production (i.e., motor) processes. Accordingly, in addition to the superior temporal cortex, the most posterior portions of the left inferior frontal regions -comprising parts of Broca's area-, the left basal ganglia, and even the (right) cerebellum seem to play a crucial role in identifying the phonemes and sounds used during speech processing (Bozic et al., 2010). Although the specific roles of these neural circuits have still to be elucidated, their involvement has been proposed as a mechanism to better process speech sounds regardless of large variability in the input, a way to internally produce those sounds as if the hearer herself were the emitter (Lieberman, 2000). Kotz and Schwartze (2010) stress that these regions, particularly the basal ganglia and the cerebellum, process timing variables crucial for speech. Overall, this is an example of the conjoint action of perceptual and motor brain systems in cognitive processing, as supported by direct evidence such as mirror neurons (Rizzolatti & Craighero, 2004).

Fig. 1. Approximate locations of the phonological system

If, overall, phonology has been scarcely studied by means of MRI, the situation is even worse for prosody specifically, even though this type of auditory information may be relevant enough to determine the syntactic structure of a linguistic message (Snedeker, 2008). There is evidence of the involvement of right fronto-lateral cortical areas (fronto-opercular portions of the right inferior frontal gyrus) and right superior temporal regions in the main analyses of prosody, as found when comparing normal speech and pseudo-speech (i.e., speech with normal prosodic intonations but devoid of known words) with degraded speech (e.g., Meyer et al., 2004). Even so, the role of the counterpart regions in the left hemisphere in the processing of prosodic information cannot be dismissed. A common circuit for language, music, and song perception comprising the mid and superior temporal gyri as well as the inferior and middle frontal gyri, all bilaterally, has been described (Schön et al., 2010). It is true, nonetheless, that the main implication of either hemisphere appears to be a function of the phonological vs. melodic nature of the input material (corresponding to the left vs. right side, respectively).

An analogous specialization has been sought for written words: a "visual word-form area". The angular gyrus was originally proposed as playing this role by the very first (historical) neurolinguistic models, and indeed it has appeared as such occasionally in recent functional MRI (fMRI) studies (e.g., Bookheimer et al., 1995). However, the fact that this activation is not consistent, while this region seems better characterized as semantic, has encouraged researchers to look elsewhere. A number of studies locate this functional region in Wernicke's area. But this activation is common to both visual and auditory words (Price et al., 2003) and, indeed, the most plausible functional characterization of Wernicke's area as auditory-associative is difficult to reconcile with a visual word-form area. Some portions of the occipito-temporal cortex appear as better candidates for this function. Specifically, the most outstanding in this regard is located within the fusiform gyrus and surrounding areas -such as the lingual gyrus- in the basal temporal cortex (Dehaene et al., 2002). Interestingly, these areas would be genetically prepared for the processing of faces and objects, these functions emerging as a result of natural selection. However, by virtue of education, a portion of these regions could become specifically devoted to the processing of letters and visual word-forms (Dehaene, 2009).

**4. The structure of language**

Common to any input modality, there are processes involved in understanding linguistic messages that are of the highest interest. Syntactic processes may be among the most outstanding of these. As outlined above, syntax makes it possible to determine the hierarchical structure of a sentence composed of a sequence of words (word-forms and their meanings). Studies in this regard have usually approached the brain areas involved in syntactic processing using either of two procedures. On the one hand, the comparison between syntactically incorrect and correct material would enhance the activity of brain areas specialized in detecting grammatical errors. As an example, the activation during a sentence like "*the cake was eat*" is compared with that during its corresponding correct version. On the other hand, comparing grammatically complex sentences with simpler sentences would yield activations in areas particularly handling the complexity of syntactic structures and, hence, areas presumably involved in the hierarchical organization of sentences. Complexity is usually increased either by embedding material within (e.g.) a main clause, rendering what is called a "recursive" structure, or by changing canonical order (usually, SVO: subject-verb-object) to a non-canonical one, as in the case of passive sentences. Examples of these situations imply comparing "*the child that my mother saw was small*" or "*the cake is being eaten by the children*", respectively, with their corresponding simpler versions (i.e., "*my mother saw a child; the child was small*", and "*the children are eating a cake*"). The case of complexity poses the problem of whether it is actually syntax that is being measured or, instead, working or short-term memory activations necessary to hold information active until the corresponding structural assignments are completed. However, it is also possible to accept that the brain areas specifically involved in working memory for syntactic structures in fact pertain to syntactic processing proper, as it can be assumed that working memory for syntax implies the transient activation of circuits actually devoted to syntactic processing (e.g., Fuster, 1999; MacDonald & Christiansen, 2002).
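Both procedures ultimately reduce to a subtraction between experimental conditions in a general linear model (GLM). The following is a minimal, purely illustrative sketch of such a contrast using the nilearn library; the file names, event timings, and condition labels ("complex" vs. "simple") are hypothetical assumptions, not taken from the studies reviewed here.

```python
# Hypothetical first-level sketch of the "complexity" contrast described
# above: complex (embedded or non-canonical) vs. simple (canonical)
# sentences. Assumes a preprocessed 4D BOLD run and nilearn installed.
import pandas as pd
from nilearn.glm.first_level import FirstLevelModel

# Illustrative event table: onset/duration (in seconds) of each sentence.
events = pd.DataFrame({
    "onset":      [0.0, 12.0, 24.0, 36.0],
    "duration":   [6.0, 6.0, 6.0, 6.0],
    "trial_type": ["complex", "simple", "complex", "simple"],
})

# Standard GLM: each condition is convolved with a haemodynamic response.
model = FirstLevelModel(t_r=2.0, hrf_model="spm", smoothing_fwhm=6.0)
model = model.fit("sub-01_task-syntax_bold.nii.gz", events=events)

# The subtraction itself: voxels responding more to complex than to simple
# sentences, i.e., candidate syntax (or verbal working-memory) regions.
z_map = model.compute_contrast("complex - simple", output_type="z_score")
z_map.to_filename("complex_minus_simple_zmap.nii.gz")
```

The violation paradigm fits the same skeleton by simply relabeling the conditions (e.g., a contrast of incorrect minus correct sentences).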

Overall, both types of approaches to the study of human syntax have been comparable, yielding largely similar results. As one of the most consistent findings, the left inferior frontal gyrus (IFG) emerges as a central site involved in the detection of syntactic errors, the processing of grammatical complexity, and verbal working memory (e.g., Bornkessel-

