The DSM-IV-TR (American Psychiatric Association, 2000) provided diagnostic criteria for the cluster of disabilities comprising autism spectrum disorders (ASD). However, heterogeneity in the expression and severity of core social and communication deficits, both within and across these subgroup clusters, has posed significant challenges for those attempting to identify their etiology and development. One striking example of variability among individuals with a diagnosis of ASD is in the extent to which language onset and language competence are disturbed. There has been considerable discussion about whether Asperger syndrome (AS), the ASD subcategory for which significantly delayed language onset is not specified, should be retained in the DSM-5. A much-cited reason for removing AS from the DSM-5 was that marked differences distinguishing those with AS and high-functioning autism at early stages of development may lessen over time (Howlin, 2003). However, an intriguing question, given universal deficits in social and communication skills in ASD, is why some individuals with normative IQ levels do not experience marked delays in language onset and development while others do. Neuroconstructivist models of development (e.g., Karmiloff-Smith, 2009) propose that diverse behavioral abnormalities, characteristic in many neurodevelopmental disorders, may in part originate from basic sensory or attentional abnormalities present in early infancy. Given this premise, an important research goal should be to determine which factors contribute to the pattern of heterogeneity in language onset and subsequent language development in ASD.
In typical development, basic attentional mechanisms are thought to orientate the infant toward faces and voices, allowing for the maturation of the social brain and the development of language (Johnson, Dziurawiec, Ellis, & Morton, 1991; Nelson, de Haan, & Thomas, 2006; Pascalis, de Haan, & Nelson, 2002). Social deficit models of development (Dawson et al., 2004; Klin, Jones, Schultz, Volkmar, & Cohen, 2005; Morton & Johnson, 1991) propose that early, reduced exposure to social stimuli leads to delayed and/or atypical development in areas related to social adaptation and language skills. This reduced exposure may result from an impoverished social environment, as evidenced in studies of institutionalized infants (Rutter et al., 1999), or from the infant's own inability to attend selectively to relevant social stimuli. For example, children born blind have been shown to manifest both language and social delays in early childhood (Tadić, Pring, & Dale, 2010), and atypical patterns of face processing have been observed in adults born with congenital cataracts that were surgically removed during the first year of life (Le Grand, Mondloch, Maurer, & Brent, 2001; Mondloch, Le Grand, Maurer, Pascalis, & Slater, 2003).
Institutionalized sighted infants who are deprived of social contact display social deficits (Rutter et al., 1999), and early face processing abnormalities have been recorded in such children (Moulson, Westerlund, Fox, Zeanah, & Nelson, 2009). Taken together, these studies suggest that impoverished social interactions, at an early stage of development, are associated with the types of social and communication deficits observed in ASD.
The findings from several studies show that individuals with ASD manifest atypical patterns of social orienting (Pelphrey et al., 2002; Senju, Tojo, Dairoku, & Hasegawa, 2004; Volkmar & Mayes, 1990), and it has been suggested that these may be particularly pronounced at the early stages of development (Chawarska, Klin, & Volkmar, 2003; Dawson et al., 2004; Elsabbagh et al., 2012, 2013; Maestro et al., 2005). Abnormalities in visual fixations and dwell patterns to social stimuli have been observed in adolescents and adults with ASD. For example, in an eye-tracking study in which adolescents and young adults with ASD viewed a scene from the film Who's Afraid of Virginia Woolf? a high number of fixations to the body rather than to the head regions of the actors was observed (Klin, Jones, Schultz, Volkmar, & Cohen, 2002). When fixations were made to the head, they were directed to the mouth rather than the eyes, and this pattern of fixation was positively associated with social skills. Norbury et al. (2009) used a similar eye-tracking paradigm to investigate social attention in teenagers with ASD, and they found that patterns of fixation were specifically associated with language phenotypes. These results, showing that typical patterns of attention to the eye region were associated with language impairment rather than with age-appropriate language skills, raise important questions about the consequences of atypical social attention in ASD.
The study reported in this paper also used an eye-tracking methodology to test the hypothesis that social attention in ASD would be associated with language ability and social adaptation in childhood. However, the main aims of this study differ from those of previous eye-tracking studies in several ways. Here we investigated visual attention to interacting and noninteracting groupings in perceptually matched competing static stimuli in children with ASD with and without a history of language delay. Thus, rather than addressing questions about the salience of specific facial features, or object versus human preferences, we aimed to determine whether an interacting grouping would elicit more attention than a noninteracting grouping and whether this would vary over different language phenotypes. In daily life, people are encountered in any number of settings and situations, and our ability to understand their attitudes and intentions relies greatly on the extent to which we can allocate visual attention to specific relevant cues. Studies investigating low-level visual attention have demonstrated that paired objects containing related properties elicit attention (Gilchrist, Humphreys, & Riddoch, 1996). Working from this basis, we investigated whether the pairing of social objects would also elicit increased saliency. We reasoned that an image of two people facing each other would be perceived as a pair and that this effect would be lessened when the people stood with their backs toward each other. Data from a pilot study testing this new paradigm with neurotypical adults, showing increased viewing time in response to interacting over noninteracting figures, are described in the Methods section.
A main aim of our study was to draw links between atypical attention to social stimuli and early language history, and we subgrouped our ASD participants based on language onset age, while closely matching our two groups for chronological age, symptom severity, and nonverbal intelligence. For simplicity, our participants are described as high-functioning autistic with language delay (HFA-LD) and HFA with normal language onset (HFA-LN). Language and social attention are strongly linked at early stages of development in typical infants, and atypical patterns of attention have been associated with social and communication impairments in ASD. We therefore hypothesized that the HFA-LN group would spend longer looking at the human figures, and especially the head region of the figures, than would the HFA-LD group. We further hypothesized that when given a choice between interacting and noninteracting human figures, HFA-LN participants would look longer at the more socially salient interacting figures and would show a stronger preference for them than would HFA-LD participants. Our final prediction was that time spent viewing the interacting figures would be negatively associated with social deficits in the ASD sample as a whole and positively associated with current language competence.
Method
Pilot study
For the pilot, 23 adults took part (9 male, 13 female). This group had no history of cognitive or language delay and had an average age of 22 years (SD = 5.0, range = 18–39 years).
Apparatus
Participants sat on a height-adjustable chair placed approximately 1 m from the display screen. Stimuli were displayed on a 21-in. CRT computer monitor, with each display comprising 800 × 600 pixels. Movements of the left eye were recorded at a sample rate of 500 Hz using a head-mounted EyeLink II tracking device (SR Research).
Stimulus material
Stimulus materials were created from photographs of real people transformed using Photoshop software to produce color images of cartoonlike figures that were displayed on a gray background. It was hoped that these images would be appealing to children. Previous research has demonstrated that children with ASD spend longer looking at cartoonlike figures than at objects (van der Geest, Kemner, Camfferman, Verbaten, & van Engeland, 2002). Each stimulus contained two pairs of these cartoonlike figures, one pair standing face-to-face and the other back-to-back. The average height of a figure measured 4.1 degrees of visual angle, and an example stimulus is shown in Figure 1.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160921131146-11158-mediumThumb-S0954579414000108_fig1g.jpg?pub-status=live)
Figure 1. An example of experimental stimuli (originally presented in color).
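As a rough illustration of what a 4.1-degree figure height means physically, the following sketch converts visual angle to on-screen size. It assumes the approximate 1 m viewing distance given in the Apparatus section; the function name `visual_angle_to_size` is illustrative and does not come from the study.

```python
import math


def visual_angle_to_size(angle_deg: float, distance_mm: float) -> float:
    """Return the extent (mm) that subtends angle_deg at distance_mm."""
    return 2.0 * distance_mm * math.tan(math.radians(angle_deg) / 2.0)


# 4.1 degrees is the reported average figure height; 1000 mm is the
# approximate viewing distance (an assumption taken from the Apparatus text).
figure_height_mm = visual_angle_to_size(4.1, 1000.0)
print(f"Approximate figure height: {figure_height_mm:.1f} mm")
```

Under these assumptions each figure would be roughly 7 cm tall on the screen.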
There was an equal probability, in any stimulus, of a figure pair, in either of the two configurations, occurring in any of the four quadrants of the screen. The only constraint was that the two pairs presented in any stimulus occurred in diagonally opposite quadrants. There were 60 trials, in which a set of 30 stimuli was presented twice. In the second presentation, the sets of figures appeared in different quadrants than in the first.
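The placement rule described above can be sketched as follows. This is a minimal illustration, not the software used in the study: the quadrant numbering, the function name `place_pairs`, and the fixed random seed are all assumptions made for the example.

```python
import random

# Quadrants numbered row by row: 0 = top-left, 1 = top-right,
# 2 = bottom-left, 3 = bottom-right. The two diagonals are (0, 3) and (1, 2).
DIAGONALS = [(0, 3), (1, 2)]


def place_pairs(rng: random.Random) -> dict:
    """Assign the two figure pairs to diagonally opposite quadrants."""
    diagonal = list(rng.choice(DIAGONALS))  # pick one diagonal at random
    rng.shuffle(diagonal)                   # randomize which pair goes where
    return {"face_to_face": diagonal[0], "back_to_back": diagonal[1]}


rng = random.Random(0)  # fixed seed only so the sketch is reproducible
layout = place_pairs(rng)
print(layout)
```

Choosing a diagonal and then shuffling its two ends gives each configuration a probability of 1/4 of landing in any given quadrant while keeping the two pairs diagonally opposite.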
Procedure
Before beginning the pilot study, each participant's eye movements were calibrated using a 9-point calibration procedure (EyeLink II). Because specific viewing instructions have been shown to influence looking patterns in eye-tracking studies (Birmingham, Bischof, & Kingstone, 2008), participants were told that they would be presented with images of people that they were free to look at as they wished. Each stimulus was presented for 10 s.
Pilot study results
Scores were calculated for fixations to whole figures in face-to-face and back-to-back configurations. We also analyzed fixations to the head and shoulder areas of the two figures in each of these configurations. Table 1 shows average “dwell time,” defined here as the cumulative duration of all fixations within the 10-s interval for each area of interest for the adult participants.
Table 1. Pilot study results (ms)
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20160920233211879-0234:S0954579414000108:S0954579414000108_tab1.gif?pub-status=live)
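The dwell-time measure defined above can be sketched in a few lines. The example assumes rectangular areas of interest and a simple list of fixation records; the study does not specify the shape of its areas of interest or its data format, so `Fixation`, `Rect`, and `dwell_time` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Fixation:
    x: float            # horizontal gaze position (pixels)
    y: float            # vertical gaze position (pixels)
    duration_ms: float  # fixation duration (ms)


@dataclass
class Rect:
    """A rectangular area of interest (an assumption for this sketch)."""
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, x: float, y: float) -> bool:
        return self.left <= x <= self.right and self.top <= y <= self.bottom


def dwell_time(fixations, aoi: Rect) -> float:
    """Sum the durations of all fixations that fall inside the area of interest."""
    return sum(f.duration_ms for f in fixations if aoi.contains(f.x, f.y))


# Hypothetical fixations from one 10-s trial.
trial = [Fixation(100, 100, 250), Fixation(120, 110, 300), Fixation(600, 400, 200)]
face_to_face_aoi = Rect(left=50, top=50, right=200, bottom=200)
total_ms = dwell_time(trial, face_to_face_aoi)
print(total_ms)
```

Averaging this per-trial total over trials and participants would give the kind of mean dwell time reported in the tables.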
Paired-sample t tests revealed that the adult participants looked significantly longer at the face-to-face figures than at the back-to-back figures, t (22) = 3.83, p < .001, d = 1.40. A similar pattern of results was noted when only the head region of the figures was analyzed. The adult participants looked significantly longer at the face-to-face head regions than at the back-to-back head regions, t (22) = 4.41, p < .001, d = 0.93. For neurotypical adults, the face-to-face figures proved to be more salient, and were viewed for longer, than the back-to-back figures. The effect size was large, suggesting the need for eight participants per group in order to achieve power of 0.80.
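For readers unfamiliar with the paired-sample comparison used here, the sketch below computes a paired t statistic and one common standardized effect size from difference scores. The dwell times are invented for illustration, and the effect-size variant shown (mean difference divided by the standard deviation of the differences, often written d_z) is an assumption: the text does not state which formula was used for the reported d values.

```python
import math
from statistics import mean, stdev


def paired_t(condition_a, condition_b):
    """Return (t, df, d_z) for paired observations from the same participants."""
    diffs = [a - b for a, b in zip(condition_a, condition_b)]
    n = len(diffs)
    sd_diff = stdev(diffs)                        # sample SD of the differences
    t = mean(diffs) / (sd_diff / math.sqrt(n))    # paired t statistic
    d_z = mean(diffs) / sd_diff                   # one possible effect size
    return t, n - 1, d_z


# Hypothetical dwell times (ms) for five participants, not data from the study.
face_to_face = [3600, 3700, 3500, 3900, 3650]
back_to_back = [3300, 3500, 3450, 3600, 3400]
t_value, df, d_z = paired_t(face_to_face, back_to_back)
print(f"t({df}) = {t_value:.2f}, d_z = {d_z:.2f}")
```

With this definition the two quantities are linked by t = d_z × √n, so other effect-size conventions will yield different d values for the same t.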
Experimental study
Participants
The participant groups in the experimental study included 23 children with a diagnosis of ASD and 16 typically developing (TD) children. All were recruited from local schools and through ASD support groups in South London, United Kingdom. The clinical participants had all previously received a diagnosis from a trained clinician, and their language histories were verified with reference to baby books in which their language onset histories had been documented. Criteria for inclusion in the HFA-LN group included a prior diagnosis of ASD with no language delay by a trained clinician, which was then confirmed using the Developmental, Dimensional, and Diagnostic Interview (3di; Skuse et al., 2004). This meant that the participants in this group had used single words before the age of 24 months and phrase speech before the age of 36 months. The participants in the HFA-LD group had received a prior diagnosis of autism disorder, and this was confirmed using the 3di. In total, 12 children met the criteria for HFA-LD and 11 children met the criteria for HFA-LN. Age and nonverbal IQ, measured by the Raven's Progressive Matrices (RPM; Raven, 1941), were used as matching criteria for the three groups. The HFA-LD and HFA-LN groups were further matched on symptom severity, assessed using the 3di. The socialization composite score from the Vineland Adaptive Behavior Scales (VABS; Sparrow & Cicchetti, 1985) was used to investigate associations between social skills and measures on the social attention task.
The British Picture Vocabulary Scale, Second Edition (BPVS-II; Dunn, Dunn, Whetton, & Burley, 1997) and the communication composite of the VABS (which is a composite score of receptive, expressive, and written socially orientated language skills) were used to assess whether or not performance in the experimental tasks would be associated with current levels of receptive language used within a social domain. Relevant psychometric data are shown in Table 2.
Table 2. Mean (standard deviation) matching criteria and language scores
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160921131146-04185-mediumThumb-S0954579414000108_tab2.jpg?pub-status=live)
Note: TD, typically developing; HFA, high-functioning autistic; LD, language delay; LN, language normal; BPVS, British Picture Vocabulary Scale; 3di, Developmental, Dimensional, and Diagnostic Interview; VABS, Vineland Adaptive Behavior Scales.
*p ≤ .05. **p ≤ .001.
Measures
The 3di
The 3di (Skuse et al., 2004) is a computer-based diagnostic tool increasingly used to diagnose cases of ASD. Outcome measures on the 3di correlate highly with the Autism Diagnostic Interview (Lord, Rutter, & Le Couteur, 1994) equivalent scores. It achieves an interrater reliability of 0.9 and a test–retest reliability of 0.9.
The VABS
The VABS (Sparrow & Cicchetti, 1985) is a parent-administered questionnaire used to assess children and adults on a range of practical social tasks. The VABS has been extensively used to compare behavioral measures of social functioning in autism and other developmental disorders (Volkmar & Mayes, 1990; Wishart, Cebula, Willis, & Pitcairn, 2007).
BPVS-II
The BPVS-II (Dunn et al., 1997) assesses receptive vocabulary through verbal comprehension and provides a measure of verbal mental age. It is a commonly used tool to determine intelligence in autism research (Mottron, 2004). Scores on the BPVS-II are highly correlated with mental age and IQ derived from the Wechsler Intelligence Scale (BPVS-II manual, pp. 35–36; Dunn et al., 1997).
The RPM
The RPM (Raven, 1941) assesses nonverbal cognitive ability through a set of tasks in which the participant completes the missing part of a puzzle. The RPM is commonly used to test nonverbal cognitive ability in autism (Mottron, 2004).
Experimental study results
Analyses of variance (ANOVAs)
In order to check for any general abnormalities in gaze control, data for the mean number and duration of fixations and saccades are presented in Table 3. There was no significant difference between the three groups on any of these measures, suggesting that the groups displayed similar ocular control when viewing the experimental stimuli.
Table 3. Details of eye movement data for the three groups
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160921131146-88766-mediumThumb-S0954579414000108_tab3.jpg?pub-status=live)
Note: TD, typically developing; HFA, high-functioning autistic; LD, language delay; LN, language normal.
Table 4. Intercorrelations between dwell time and measures of age, socialization, current language levels, and language onset for the combined autism spectrum disorder group
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160921131146-90137-mediumThumb-S0954579414000108_tab4.jpg?pub-status=live)
Note: BPVS-VMA, British Picture Vocabulary Scale verbal mental age; VABS, Vineland Adaptive Behavior Scales; F-F, face-to-face; B-B, back-to-back.
Table 3 shows average “dwell time,” defined here as the cumulative duration of all fixations within the 10-s interval for each area of interest identified in the pilot study. Dwell time was then analyzed using a 2 × 3 mixed ANOVA with configuration (face-to-face vs. back-to-back) as the within-group factor and group (HFA-LD, HFA-LN, or TD) as the between-group factor. Data were normally distributed, and assumptions for ANOVA were met.
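One way to see what the Group × Configuration term of such a design tests is to note that, when the within-group factor has only two levels, the interaction F is equivalent to a one-way between-groups ANOVA on each child's difference score (face-to-face minus back-to-back). The sketch below illustrates that equivalence with invented dwell times; it is not the authors' analysis script, and the function names and data are hypothetical.

```python
from statistics import mean


def one_way_f(groups):
    """Return (F, df_between, df_within) for a one-way between-groups ANOVA."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


def interaction_f(dwell):
    """Group x Configuration F for a two-level within factor, via difference scores."""
    diffs = [[f2f - b2b for f2f, b2b in children] for children in dwell.values()]
    return one_way_f(diffs)


# Hypothetical dwell times (ms) as (face-to-face, back-to-back) per child.
dwell = {
    "TD": [(3900, 3500), (3800, 3450), (4000, 3600)],
    "HFA-LN": [(4100, 3500), (3950, 3400), (4050, 3550)],
    "HFA-LD": [(3000, 3100), (2900, 3050), (3100, 3150)],
}
f_value, df_between, df_within = interaction_f(dwell)
print(f"F({df_between}, {df_within}) = {f_value:.2f}")
```

With three groups and 39 children in total, the same calculation would have 2 and 36 degrees of freedom.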
Analysis of the whole figures
With respect to dwell time for the whole figures, regardless of their configuration, the main effect of group was significant, F (1, 36) = 11.97, p < .01, η2 = 0.040. This effect is qualified by a significant Group × Configuration interaction reported below. The main effect of configuration was significant, F (1, 36) = 8.69, p < .01, η2 = 0.019, with the face-to-face figures receiving longer dwell times (M = 3578 ms) than the back-to-back figures (M = 3348 ms). This result further supported our pilot study in showing that interacting human figures elicit longer viewing times than noninteracting figures.
There was a significant Group × Configuration effect, F (2, 36) = 4.02, p < .05, η2 = 0.018. Figure 2a clearly shows that the interaction reflects a difference in the length of time the groups spent viewing the interacting figures. In relation to dwell times for the back-to-back figures, there was no significant difference among the three groups. However, there was a significant difference in dwell time to the face-to-face figures, F (2, 36) = 15.04, p < .001, η2 = 0.046. Simple comparisons revealed that the TD group looked significantly longer at the face-to-face figures than did the HFA-LD group (p < .01). This result was repeated when the HFA-LD group was compared with the HFA-LN group (p < .001). There was no significant difference between the TD and HFA-LN groups. Figure 2a suggests that while the TD and HFA-LN groups spent more time looking at the interacting figures than at the back-to-back figures, the HFA-LD group showed the reverse pattern.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary-alt:20160921131146-08497-mediumThumb-S0954579414000108_fig2g.jpg?pub-status=live)
Figure 2. (a) Dwell times (ms) and standard error of the mean (SEM) to the human figures for the high-functioning autistic with language delay (HFA-LD), HFA with normal language onset (HFA-LN), and typically developing (TD) groups. (b) Dwell times (ms) and SEM to the head regions of the figures for the HFA-LN, HFA-LD, and TD groups.
Within-group comparisons using Bonferroni corrections showed no significant difference in dwell time between the two experimental conditions for the TD group (p = .10), the HFA-LD group (p = .81), or the HFA-LN group (p = .06).
Analysis of head region
With respect to dwell time for the head region of the figures, the main effect of group was not significant. This result did not support our hypothesis that the TD and HFA-LN groups would spend more time viewing the head region than would the HFA-LD group. The main effect of configuration was significant, F (1, 36) = 8.14, p < .01, η2 = 0.019, with the head region of face-to-face figures receiving longer dwell times (M = 1526 ms) than that of the back-to-back figures (M = 1430 ms). There was a significant Group × Configuration effect, F (1, 36) = 6.92, p < .01, η2 = 0.028. Similar to the results for the whole figures, only the face-to-face orientation differentiated the groups. Simple comparisons revealed that the TD group looked significantly longer at the face-to-face figures than did the HFA-LD group (p < .05). This result was repeated when the HFA-LD group was compared with the HFA-LN group (p < .05). These effects are shown in Figure 2b.
Within-group comparisons showed that there was a significant difference in dwell time between the two experimental conditions for the TD group (p = .02), but comparisons were not significant for the HFA-LD group (p = .09) or the HFA-LN group (p = .22).
The outcome of the study suggests that the TD and HFA-LN groups performed similarly to each other. However, the pattern of the data suggests that the HFA-LN group appears to be overcompensating (when compared to the TD group) through the amount of time spent viewing the interacting figures. Table 3 compares the results in the form of a ratio of time spent viewing the back-to-back figures to time spent viewing the face-to-face figures. There was no significant difference for the figures among the three groups, F (2, 39) = 2.94, p = .07. For the faces, there was a significant difference among the three groups, F (2, 39) = 5.75, p = .006. Post hoc tests with Bonferroni corrections revealed a significant difference between the TD and HFA-LD groups (p = .005) and between the HFA-LN and HFA-LD groups (p = .01), but there was no significant difference between the TD and HFA-LN groups (p = .88).
Analysis of first fixations
The number of trials on which the first fixation was directed toward the face-to-face figures was analyzed in order to determine if the groups demonstrated a difference in initial attention allocation. A one-way ANOVA demonstrated that there was a significant difference in allocation of the first fixation between the groups, F (2, 36) = 4.58, p = .02. Post hoc analysis showed a significant difference between the HFA-LN and HFA-LD groups (p < .02) but no significant differences between the HFA-LN and TD groups, or the HFA-LD and TD groups. The HFA-LN group directed more initial fixations toward the face-to-face figures (M = 32.0, SD = 3.67) than did the HFA-LD group (M = 28.0, SD = 3.56).
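The first-fixation measure can be sketched as a simple count over trials. The example assumes that each trial has already been reduced to an ordered list of area-of-interest labels, and that the "first fixation" is the first one landing on either figure pair; both are assumptions, since the text does not describe how fixations outside the figures were handled. The function name and labels are hypothetical.

```python
def first_fixation_count(trials, target="face_to_face"):
    """Count trials whose first fixation on any figure pair lands on the target pair."""
    count = 0
    for labels in trials:
        # Skip fixations outside both areas of interest (labelled None).
        first = next((label for label in labels if label is not None), None)
        if first == target:
            count += 1
    return count


# Hypothetical fixation sequences for four trials.
trials = [
    [None, "face_to_face", "back_to_back"],
    ["back_to_back", "face_to_face"],
    ["face_to_face"],
    [None, None],
]
n_first = first_fixation_count(trials)
print(n_first)
```

Applied to all 60 trials for each child, this count would give the per-participant scores that enter the one-way ANOVA.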
Correlational analysis
The small sample size and the number of correlations conducted meant that separate analysis of the HFA-LD and HFA-LN groups would be of little statistical benefit. Therefore, for the purpose of this analysis, we assumed the continuum model of ASD used in the DSM-5. All results are reported as one-tailed tests. Due to the number of correlations conducted, only those meeting an alpha level below .01 are reported as being statistically significant.
Significant positive correlations were observed between the face-to-face configuration and receptive vocabulary mental age scores (BPVS-II r = .62, p < .01). Correlations carried out on the face-to-face configurations of the whole figures and receptive vocabulary mental age scores (BPVS-II r = .53, p < .01) were statistically significant. We did not find significant correlations between dwell times to the figures and communication or socialization scores on the VABS, although dwell time to the face-to-face head configuration and scores on the VABS socialization scale demonstrated a moderate effect size (r = .51, p = .02). Our hypothesis that time spent viewing the figures would be associated with improved social skills was not fully supported through the correlational analysis, but an increased sample size might help to confirm this preliminary finding. Age and verbal mental age were highly correlated. In order to determine the impact of age on the correlations, a partial correlation was performed controlling for age. In these analyses, the correlations between verbal mental age and dwell time to the face-to-face configuration were reduced and no longer reached the .01 alpha value for either the head region (r = .51, p = .02) or the body region (r = .50, p < .02).
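The effect of controlling for age can be illustrated with the standard first-order partial correlation formula, which removes the variance that two measures share with a third. The input correlations below are invented for illustration and are not values from Table 4; `partial_r` is a hypothetical helper name.

```python
import math


def partial_r(r_xy: float, r_xz: float, r_yz: float) -> float:
    """Correlation between x and y with z partialled out of both."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical values: x = dwell time, y = verbal mental age, z = age.
r_dwell_vma = 0.62
r_dwell_age = 0.50
r_vma_age = 0.60
adjusted = partial_r(r_dwell_vma, r_dwell_age, r_vma_age)
print(f"partial r = {adjusted:.2f}")
```

When age correlates positively with both measures, as in this example, the partial correlation is typically smaller than the zero-order correlation, which is the pattern described above.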
Discussion
The main aim of this study was to investigate social attention in groups of ASD children with and without a history of significant language delay. The most striking finding from the study was that both the pattern and the duration of attentional allocation to human figures distinguished the two groups of individuals with ASD. Regardless of configuration, the HFA-LD group spent less time viewing the sets of human figures than either the TD group or the HFA-LN group. In relation to the configuration of these figures, group differences emerged on time allocated to the face-to-face condition.
Current debate in ASD research focuses on the extent and nature of social deficits in high-functioning individuals. Previous studies have shown that some individuals with ASD demonstrate preferential attention to social stimuli when the stimuli are not dynamic (Sigman, Mundy, Sherman, & Ungerer, Reference Sigman, Mundy, Sherman and Ungerer1986; van der Geest et al., Reference van der Geest, Kemner, Camfferman, Verbaten and van Engeland2002; Willemsen-Swinkels, Buitelaar, Weijnen, & van Engeland, Reference Willemsen-Swinkels, Buitelaar, Weijnen and van Engeland1998). However, these studies typically present participants with a simple discrimination task, namely, object versus person. Such tasks may be too simple for high-functioning individuals with ASD. Fletcher-Watson, Leekam, Benson, Frank, and Findlay (Reference Fletcher-Watson, Leekam, Benson, Frank and Findlay2009) demonstrated that although adults with ASD preferred viewing people-present to people-absent images, their viewing patterns were nevertheless atypical. Our results support and extend previous findings by demonstrating that language ability is associated with specific patterns of social attention in ASD. Our paradigm extends the object versus person studies and suggests that social interactions have differing levels of saliency for viewers. In this respect, language-related differences in attention to social stimuli in intellectually able individuals with ASD may only become apparent when the saliency of competing social objects is the main component of analysis.
Our results, showing that patterns of visual attention are associated with language history, are consistent with those of Norbury et al. (Reference Norbury, Brock, Cragg, Einav, Griffiths and Nation2009). In response to their findings, showing that preserved language skills were not associated with attention to eyes, Norbury et al. proposed that an integration of different social cues might be more important in supporting communication than a reliance on one social cue, for example, the eyes. Further work suggests that social attention in autism may be intricately related to cognitive profiles, with verbally able children relying on different social cues than less able children (Rice, Moriuchi, Jones, & Klin, Reference Rice, Moriuchi, Jones and Klin2012). Our paradigm utilized positioning (interacting or noninteracting) as the primary social cue, and our results showed an association between good language skills and increased attention to the most socially salient configurations. Although strong conclusions about the causal relationship between attention and language development cannot be made on the basis of the current study, the results are consistent with the view that patterns of attention to social stimuli continue to be associated with language development beyond infancy (Dawson et al., Reference Dawson, Toth, Abbott, Osterling, Munson and Estes2004; Klin et al., Reference Klin, Jones, Schultz and Volkmar2005; Morton & Johnson, Reference Morton and Johnson1991). While language onset history may have small effects in adulthood in ASD (see Howlin, Reference Howlin2003), our findings suggest that language delay is associated with language skills and patterns of attention in childhood.
Across our ASD groups, we observed a positive correlation between verbal mental age and time spent viewing the interacting figures, but no such correlation was observed with the back-to-back figures. Inflexible or arbitrarily increased allocation of attention to any social stimuli would not be an effective way to gain pertinent social information. Research shows that social saliency serves to direct attention in neurotypical adults (Crosby, Monin, & Richardson, Reference Crosby, Monin and Richardson2008). Our study demonstrates that some children with ASD are capable of making these types of discrimination when shown simple examples of social situations and that this ability is associated with language competency. An interesting observation from the study was that the HFA-LN group viewed the interacting figures (as compared to the back-to-back figures) proportionally longer than did the TD group, although this comparison was not statistically significant. One theory for these results is that individuals with ASD who show milder language deficits may need to overcompensate for their social deficits in order to improve language performance. Such overcompensation has also been noted in siblings of children with ASD (Belmonte, Gomot, & Baron-Cohen, Reference Belmonte, Gomot and Baron-Cohen2010).
We have argued that the social saliency of the interacting figures was not recognized by the HFA-LD group. Recent research has suggested that time spent viewing social stimuli is positively associated with arousal (Dalton et al., Reference Dalton, Nacewicz, Johnstone, Schaefer, Gernsbacher and Goldsmith2005) and that social stimuli may be less arousing for ASD children with a history of significant language delay (Stagg, Davis, & Heaton, Reference Stagg, Davis and Heaton2013). Interpreted within a developmental framework (Dawson et al., 2005), our results are consistent with the suggestion that some infants with ASD may not find faces stimulating and may fail to develop brain circuitry that enhances the rewarding nature of social stimuli. A more parsimonious explanation of attentional difficulties characterizing ASD is offered by the enhanced perceptual functioning model (Mottron & Burack, Reference Mottron, Burack, Burack, Charman, Yirmiya and Zelazo2001; Mottron, Dawson, Soulières, Hubert, & Burack, Reference Mottron, Dawson, Soulières, Hubert and Burack2006). According to this model, the low-level perceptual qualities of our stimuli may have increased salience for some individuals with ASD, and this may limit the extent to which these individuals would attend to the “story” depicted in the visual image. In our study, attentional control may have been guided by bottom-up processes in the HFA-LD group. In support of this interpretation are results showing that language-delayed individuals with ASD demonstrate enhanced perceptual functioning, for example, superior pure-tone pitch discrimination (Bonnel et al., Reference Bonnel, McAdams, Smith, Berthiaume, Bertone and Ciocca2010) and superior frequency discrimination skills (Jones et al., Reference Jones, Happé, Baird, Simonoff, Marsden and Tregay2009). If this interpretation is correct, our stimuli may act as a visual analog to auditory tasks that are able to discriminate late and typical language onset in ASD.
The results from our study and that of Norbury et al. (2009) suggest that attention to the face region and attention to salient social configurations are both associated with language skills in individuals with ASD. Taken together, these studies suggest that there may be two distinct elements related to social attention and language development. First, attention to the invariant features of the face such as the mouth region may serve very specific functions in language development, such as phoneme discrimination. Here, inattention to faces would directly result in delayed acquisition of this aspect of language. Second, as Klin et al. (2002) proposed, individuals need to embed communication within a social framework. Thus, for individuals with ASD who demonstrate reduced attention to socially relevant information, communication will not become socially contextualized. In our study, language skills correlated with increased attention to the socially relevant parts of scenes rather than with increased attention to the head region or to the figures in general, and this suggests that for the ASD participants without language delay, communication is more socially contextualized.
It may then be expected that improved social attention skills would relate to a general advantage in social competency. However, like Norbury et al. (2009), we did not find strong evidence for an association between dwell times and scores on the VABS socialization scale. These results may be due to the relatively small sample size used in our study. In studies that have found a relationship between looking patterns and social ability, language and social context have been highly salient in the stimuli (e.g., Klin et al., 2002). In contrast, our stimuli specifically depicted social and nonsocial interactions, and we analyzed gross looking patterns rather than fixation on finer details such as the mouth and eye regions.
Limitations
Sample size was small, and this limits the extent to which the results can be generalized to the wider ASD population. However, because pilot data indicated that we needed a minimum of eight participants to achieve a power of 0.80, we consider that our null results can be interpreted with some degree of confidence. In order to hold the attention of the child participants, we manipulated the stimuli in Photoshop to make them more cartoonlike. While this removed some naturalistic elements from the stimuli, they retained their social characteristics. Many studies have utilized artificial and schematic social stimuli (Kuhn et al., 2010; Ristic, Friesen, & Kingstone, 2002; Ruffman, Garnham, & Rideout, 2001), and these would appear to be processed in a similar manner to real-life social stimuli (Britton, Shin, Barrett, Rauch, & Wright, 2008; Miall, Gowen, & Tchalenko, 2009). The next step will be to replace the figures with actors embedded within social situations. The groups used in this study were matched on symptom severity and nonverbal intelligence; however, they differed significantly on verbal mental age scores. Despite this difference, the HFA-LD group's receptive language was at an age-appropriate level. A positive association between current language skills and language onset history has been demonstrated in a number of studies employing larger sample sizes (Koyama et al., 2004; Szatmari et al., 2009), and our results both confirm such an association and highlight the value of including both language onset and current language in research designs.
Conclusion
The study reported in this paper demonstrated an association among language onset timing, current language skills, and patterns of social attention in children with ASD. While the criteria used for subtyping ASDs have not been retained in the DSM-5, our results show that individuals with and without a history of significant language delay can be distinguished on the basis of their attentional deficits in childhood. Current research would appear to confirm that attentional abnormalities are present within the first year of development (Elsabbagh et al., 2013), and our results suggest that they persist into late childhood and are associated with language development. Thus, while the DSM-5 offers a more unified definition of ASD, it would be a concern if the causes and correlates of delayed language development were not fully investigated. Advances in this area of research may increase our understanding of the extraordinary heterogeneity characterizing ASD. Whether a continued association, after infancy, between patterns of social attention and language development is specific to ASD or generalizes to other types of language delay is a question that warrants further investigation.