**5.1. Joint attention**

The ability to share somebody else's focus of attention, known as joint attention, is critical for successful social interaction, setting up a context in which a child can learn from others. For example, a child needs to know what a person is looking at to understand what a new label might refer to. Indeed, joint attention has been studied extensively in relation to children's word learning (e.g., Baldwin, 1995; Mundy & Newell, 2007; Carpenter, Nagell, & Tomasello, 1998; Tomasello, 1995; see also Flom, Lee, & Muir, 2007). A common task involves presenting children with a set of objects while an adult visibly looks at the one being named. Both the amount of time the child follows the adult's eye gaze and the degree of label learning are taken to reflect the extent of joint attention between them.

Children with ASD have demonstrated difficulty following the gaze of an adult in joint attention tasks (for a review, see Meindl & Cannella-Malone, 2011). This deficit is observed alongside difficulties with learning new object names (Baron-Cohen, Baldwin, & Crowson, 1997; McDuffie, Yoder, & Stone, 2006; Parish-Morris, Hennon, Hirsh-Pasek, Golinkoff, & Tager-Flusberg, 2007; Preissler & Carey, 2005). For example, Preissler and Carey (2005) found a pronounced learning difference between children with ASD and typically developing children when the labeled object was held by the experimenter rather than by the child. This difference cannot be attributed to general word-learning deficits, because word learning did not differ between diagnostic groups when the labeled object was in the child's hand. Similarly, learning did not differ between diagnostic groups when the labeled object was the only novel object.

Yet, despite strong evidence in favor of ASD impairments in joint attention, findings from other research complicate the picture: participants with ASD appear perfectly capable of joint attention in some contexts, if not more skilled than their typically developing counterparts (Chawarska, Klin, & Volkmar, 2003; Kylliainen & Hietanen, 2004; Vlamings, Stauder, van Son, & Mottron, 2005). Consider, for example, a task in which participants press a corresponding button as soon as a target appears at either the top left or the bottom right of a monitor. A face is shown in the center of the monitor, and its gaze is straight ahead, averted to the top left, or averted to the bottom right, 200 ms before the target appears. Reaction times are faster on trials in which the target appears on the same side of the screen as the face's gaze, with no difference between diagnostic groups (Kylliainen & Hietanen, 2004).
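The design of this gaze-cueing task can be made concrete with a small sketch that builds a balanced trial list crossing gaze direction with target location. The names here (`make_trials`, `soa_ms`, the side labels) are illustrative assumptions, not taken from the original studies; the 200 ms cue-to-target interval and the congruent/incongruent contrast are the ones described above.

```python
import random

SIDES = ("top_left", "bottom_right")

def make_trials(n_per_cell, seed=0):
    """Build a shuffled, balanced trial list for a gaze-cueing task.

    Each trial shows a central face whose gaze is straight ahead or
    averted to one side 200 ms before a target appears on one side of
    the screen; a trial is 'congruent' when gaze and target side match.
    Hypothetical sketch; parameter names are not from the source.
    """
    rng = random.Random(seed)
    trials = []
    for gaze in ("straight",) + SIDES:       # 3 gaze conditions
        for target in SIDES:                 # 2 target locations
            for _ in range(n_per_cell):
                trials.append({
                    "gaze": gaze,
                    "target": target,
                    "soa_ms": 200,           # cue-to-target interval from the text
                    "congruent": gaze == target,
                })
    rng.shuffle(trials)
    return trials

trials = make_trials(n_per_cell=10)
# 3 x 2 cells of 10 trials each; only the 2 gaze-matches-target
# cells are congruent, so 20 of the 60 trials are congruent.
```

The prediction tested by Kylliainen and Hietanen (2004) is then simply that mean reaction time is lower on the congruent trials than on the incongruent ones, in both diagnostic groups.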

One could argue that different joint-attention tasks are not equally suited to capturing the construct of joint attention; perhaps the reaction-time task reflects joint-attention processes better than a word-learning task does. Debates about which task best reflects a stable underlying factor are common in the broader literature on cognition and cognitive development. However, this line of argument quickly becomes untenable as more context effects accumulate.
