INTRODUCTION
Immediate Post-Concussion Assessment and Cognitive Testing (ImPACT; Lovell, Reference Lovell2018) is the most commonly utilized computerized battery for the assessment of cognitive abilities pre- and post-concussion (Kerr et al., Reference Kerr, Snook, Lynall, Dompier, Sales, Parsons and Hainline2015). ImPACT is used to track the severity of sports concussion and monitor recovery to make return-to-play decisions following concussive injury. ImPACT includes the assessment of symptoms associated with concussion using self-report, as well as performance-based neurocognitive abilities. These neurocognitive abilities are evaluated using six different cognitive subtests that contribute to five composite scores (i.e., Verbal Memory, Visual Memory, Visual Motor Speed, Reaction Time, and Impulse Control). Clinical use of ImPACT has become increasingly common in all levels of sport participation, including professional (e.g., NFL, NHL), collegiate (e.g., NCAA), and high school sports, which are the focus of this study.
Baseline (pre-concussion) and post-concussion assessment are used for within-athlete comparisons in the event of a concussion. Because baseline testing establishes a data point for interpreting the magnitude of decline in cognitive functioning following a concussion, it is important that baseline scores are valid. There is much evidence that these scores are susceptible to response sets, with the (albeit relatively uncommon) dissimulation of cognitive deficit or sandbagging being of particular concern (Erdal, Reference Erdal2012; Higgins et al., Reference Higgins, Denney and Maerlender2017). For this reason, ImPACT includes a number of embedded validity indicators that are used to determine whether athletes exerted sufficient effort to produce valid baseline testing scores. In addition to sandbagging, contributors to variable effort identified in the literature include sleep disturbance the night before testing, general lack of appreciation of the importance of baseline testing, and unintentional fluctuations of effort due to other factors (Erdal, Reference Erdal2012; Higgins et al., Reference Higgins, Denney and Maerlender2017; McClure et al., Reference McClure, Zuckerman, Kutscher, Gregory and Solomon2013; Rabinowitz et al., Reference Rabinowitz, Merritt and Arnett2015; Schatz et al., Reference Schatz, Elbin, Anderson, Savage and Covassin2017; Walton et al., Reference Walton, Broshek, Freeman, Cullum and Resch2018).
Recent evidence also suggests that high school and collegiate athletes with a history of attention-deficit/hyperactivity disorder (ADHD) and/or academic difficulties (e.g., special education, learning disorder; LD) perform more poorly on ImPACT neurocognitive composite scores than their peers without neurodevelopmental diagnoses. For example, Elbin et al. (Reference Elbin, Kontos, Kegel, Johnson, Burkhart and Schatz2013) reported that a large sample (n = 938) of high school and collegiate athletes with ADHD, LD, or both (ADHD/LD) demonstrated lower performance on all ImPACT composite scores compared to athletes without these diagnoses. Kaye et al. (Reference Kaye, Sundman, Hall, Williams, Patel and Ketcham2019) also found that Division I and club sport athletes with a history of ADHD performed significantly worse on ImPACT Verbal Memory and Visual Memory Composites compared to athletes without neurodevelopmental diagnoses. Gardner et al. (Reference Gardner, Yengo-Kahn, Bonfield and Solomon2017) also reported that athletes (aged 10–21 years old) with ADHD performed significantly worse than matched controls on ImPACT Verbal Memory, Visual Memory, Visual Motor Speed, and Reaction Time composite scores at baseline and on post-concussion testing. Others report similar findings for athletes with ADHD and LD (Kaye et al., Reference Kaye, Sundman, Hall, Williams, Patel and Ketcham2019; Manderino & Gunstad, Reference Manderino and Gunstad2018a; Peltonen et al., Reference Peltonen, Vartiainen, Laitala-Leinonen, Koskinen, Luoto, Pertab and Hokkanen2019; Poysophon & Rao, Reference Poysophon and Rao2018; Salinas et al., Reference Salinas, Dean, LoGalbo, Dougherty, Field and Webbe2016).
Some suggest that athletes with neurodevelopmental disorders (NDs) demonstrate lower performance possibly due to the underlying attention and reading requirements of computerized neurocognitive testing (Lovell, Reference Lovell2018; Schatz et al., Reference Schatz, Scolaro Moser, Solomon, Ott and Karpf2012). This is important because current embedded validity indicators use cut scores that identify lower than expected levels of performance based on healthy, non-ADHD/LD samples. Therefore, use of these cut scores may result in increased estimates of invalid baseline performance when applied to those with neurodevelopmental conditions. Little research is available that examines this matter, but the studies that are available support this concern (Manderino & Gunstad, Reference Manderino and Gunstad2018a; Manderino et al., Reference Manderino, Zachman and Gunstad2019).
Manderino and Gunstad (Reference Manderino and Gunstad2018a) examined ImPACT standard embedded validity indicators in NCAA collegiate athletes with ADHD (n = 65), academic difficulties (n = 53), or ADHD and academic difficulty (n = 19). When compared to a control sample with no history of ADHD or academic difficulty, the ADHD groups were more likely to have invalid baseline scores. In a follow-up study examining NCAA athletes, Manderino et al. (Reference Manderino, Zachman and Gunstad2019) found that lowering the criterion scores for novel embedded validity indicators (Schatz & Glatts, Reference Schatz and Glatts2013) resulted in lower rates of invalid baseline scores for those with ADHD, academic difficulty, and ADHD with academic difficulty. They provide adjusted cut scores for these athletes that might be used in clinical practice to help decrease the identification of valid profiles as invalid. In addition to Schatz and Glatts (Reference Schatz and Glatts2013), other authors have proposed novel validity indicators (e.g., Higgins et al., Reference Higgins, Denney and Maerlender2017) all of which are designed to improve the detection of invalid scores. A recent systematic review by Gaudet and Weyandt (Reference Gaudet and Weyandt2017) indicates that these and other novel embedded indicators do increase the sensitivity of ImPACT to underperformance in analogue studies. The research suggesting increased invalid baseline performance associated with neurodevelopmental diagnosis history (ND) is concerning for obvious reasons, possibly most importantly, because it limits the accuracy of return-to-play decisions by increasing the likelihood of a valid baseline being incorrectly interpreted as unusable (invalid). 
In these instances, valid information which could be used is not considered in making return-to-play decisions, so an athlete may be held out of play for longer than necessary or returned to play too soon, which could result in a number of negative consequences such as another injury (i.e., second impact) or more severe post-concussive symptoms (Castile et al., Reference Castile, Collins, McIlvain and Comstock2012; Chrisman et al., Reference Chrisman, Rivara, Schiff, Zhou and Comstock2013; Covassin et al., Reference Covassin, Stearne and Elbin2008).
Based on these findings, the current study aims to investigate the following research questions: (1) Do the standard and novel embedded validity indicators identify higher rates of invalid baseline performance in high school athletes with ND compared to those with no ND history? (2) Based on previous literature suggesting age (Abeare et al., Reference Abeare, Messa, Zuccato, Merker and Erdodi2018), gender (Schatz et al., Reference Schatz, Scolaro Moser, Solomon, Ott and Karpf2012), and sport-based differences in invalid baseline performance, does the addition of ND predict invalid performance above and beyond these other demographic variables? In addition to these questions, multivariate base rates of criterion failure on validity indicators will be presented for athletes with neurodevelopmental history based on the current ImPACT embedded indicators. Currently, there is no research reporting multivariate base rates for ImPACT validity indicators in athletes with a history of ND. Multivariate base rates allow for consideration of the clinical significance of low scores across multiple tests. Multivariate analysis, rather than analysis of a single test score, aids in clinical interpretation of neuropsychological test results by helping to protect against overinterpretation of one or two low scores within a battery (e.g., Brooks et al., Reference Brooks, Iverson, Holdnack, Holdnack, Drozdick, Weiss and Iverson2013; Cook et al., Reference Cook, Karr, Brooks, Garcia-Barrera, Holdnack and Iverson2019; Holdnack, Reference Holdnack, Goldstein, Allen and DeLuca2019), since normal non-impaired populations often obtain one score in the impaired range (Binder et al., Reference Binder, Iverson and Brooks2009). We examine multivariate base rates in this study because they may be particularly useful in protecting against overinterpretation of ImPACT validity indicators in athletes with a history of ND. Table 1 provides an overview of the indicators examined in this study.
Note. % = percentage.
* Immediate and delayed conditions are summed for this cut score.
METHODS
Participants
Participants included 33,772 high school athletes aged 13–19 (mean age = 15.0, SD = 1.2; 43.7% female; mean education = 9.0, SD = 1.4) from a larger longitudinal, statewide sample of athletes who completed ImPACT to establish a pre-concussion baseline prior to beginning their respective sport season between 2008 and 2016. Athletes from the larger longitudinal sample were not included in this study if they reported a history of epilepsy, brain surgery, meningitis, treatment for substance and/or alcohol use, or treatment for a psychiatric disorder (e.g., depression, anxiety). The athletes were then separated into the following distinct categories based on self-reported diagnostic history: ADHD only (3.8%), LD only (1.5%), comorbid ADHD/LD (0.7%), autism (inclusive of comorbid autism with ADHD/LD; 0.2%), special education history with no diagnosis reported (SpED; 1.3%), and healthy athletes with no diagnosis (92.5%). Overall, 7.1% of the baselines were judged to be invalid based on the standard embedded ImPACT indicators that flag an athlete’s report as “Baseline++”. Sport type was classified as collision, contact, limited contact, or noncontact based on previous research denoting these categories for each sport reported (Brett & Solomon, Reference Brett and Solomon2017; Rice, Reference Rice2008). Collision sports (e.g., football) are those with a purposeful collision with other players/objects and constituted 33.3% of the sample. Contact sports (e.g., soccer) involve routine contact with other players/objects but no purposeful collision (31.5% of the sample). Limited contact sports (e.g., volleyball) have less frequent contact with other players/objects (18.9%), and noncontact sports (e.g., golf) have rare or unexpected contact with other players/objects (16.4%).
Overall, 8.7% of the sample had a self-reported concussion history. Analyses were performed with and without athletes who reported a concussion history, and the same substantive conclusions were reached both ways. Given this, the results reported here are from the full sample (including those with concussion history).
Measure
ImPACT utilizes six subtests – Word Memory, Design Memory, X’s and O’s, Symbol Match, Color Match, and Three Letters – to assess various aspects of cognitive performance. Scores from these subtests are averaged to form the five composite scores – Verbal Memory, Visual Memory, Visual Motor Speed, Reaction Time, and Impulse Control. In clinical settings, four of these composite scores (Verbal Memory, Visual Memory, Visual Motor Speed, and Reaction Time) are used to track the acute effects of the concussion as well as the speed of recovery following concussive injury. ImPACT has demonstrated adequate test–retest reliability, with intraclass correlation coefficients between .62 and .85 (Elbin et al., Reference Elbin, Schatz and Covassin2011).
Athlete performance is considered to be invalid if the athlete’s score falls above (or below, depending on the indicator) the cut score for any of the five standard ImPACT embedded validity indicators. See Table 1 for an overview of the indicators used in this study.
Procedure
ImPACT was administered by trained school personnel in group settings at the high schools where the athletes participated in their sports. School personnel were trained in how to properly administer ImPACT (e.g., keeping a quiet testing environment), and standard instructions were utilized prior to the administration of the test (ImPACT includes instructions on the screen for the athlete). Baseline assessment was conducted in groups with spaces between each athlete completing testing. This study utilized de-identified archival data and was deemed exempt by the local social/behavioral institutional review board for the protection of human subjects.
Statistical Analyses
To answer the first research question (Are baseline scores for athletes with a history of ND more likely to be flagged as invalid?), chi-square analysis was used to compare the rate of baseline performance that is deemed invalid between groups using standard ImPACT embedded indicators as well as the two additional novel indicators (Higgins et al., Reference Higgins, Denney and Maerlender2017; Manderino et al., Reference Manderino, Zachman and Gunstad2019). Post hoc comparisons were made using Z tests of two proportions with Benjamini and Hochberg (Reference Benjamini and Hochberg1995) correction to reduce the likelihood of Type 1 error due to multiple comparisons (the false discovery rate was set at .05). Odds ratios were computed for each of the diagnoses that were significantly different (p < .05) on post hoc testing.
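The analysis pipeline described above (omnibus chi-square, post hoc Z tests of two proportions, Benjamini–Hochberg correction, and odds ratios for significant comparisons) can be sketched as follows. This is a minimal illustration using only the Python standard library, not the software used in the study. The function names and the group counts are hypothetical and were chosen purely for demonstration; they are not the study's data.

```python
import math

def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof

def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided Z test of two proportions using the pooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided standard normal tail area
    return z, p_value

def benjamini_hochberg(p_values, fdr=0.05):
    """Return booleans marking which hypotheses are rejected at the given false discovery rate."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    largest_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            largest_rank = rank  # step-up rule: keep the largest rank that passes
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        rejected[idx] = rank <= largest_rank
    return rejected

def odds_ratio(invalid_a, valid_a, invalid_b, valid_b):
    """Odds of an invalid baseline in group A relative to group B."""
    return (invalid_a / valid_a) / (invalid_b / valid_b)

# Hypothetical (invalid, valid) counts per group; NOT the study's data.
counts = {"healthy": (700, 9300), "ADHD": (60, 440), "LD": (40, 160)}

stat, dof = chi_square_statistic([list(pair) for pair in counts.values()])
print(f"omnibus chi-square = {stat:.2f}, df = {dof}")

comparisons = [("ADHD", "healthy"), ("LD", "healthy"), ("LD", "ADHD")]
p_values = []
for a, b in comparisons:
    _, p = two_proportion_z_test(counts[a][0], sum(counts[a]), counts[b][0], sum(counts[b]))
    p_values.append(p)

for (a, b), p, significant in zip(comparisons, p_values, benjamini_hochberg(p_values)):
    if significant:  # odds ratios are reported only for significant comparisons
        print(a, "vs", b, f"p = {p:.4f}", f"OR = {odds_ratio(*counts[a], *counts[b]):.2f}")
```

The Benjamini–Hochberg step is applied to the whole family of post hoc p values at once, which is why the comparisons are collected before any odds ratio is computed.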
To answer the second research question (Do demographic characteristics such as age, sex, sport, and neurodevelopmental history predict performance that is flagged as invalid?), a binomial logistic regression was conducted to determine if demographic characteristics (age, sex, sport, and neurodevelopmental history) predicted performance deemed invalid based on the standard ImPACT indicators.
Additionally, multivariate base rates were computed for the standard ImPACT indicators and were stratified by age, sex, and neurodevelopmental history. Multivariate base rates were computed following the dichotomization of clinical diagnosis (i.e., those with or without a history of ADHD or non-ADHD neurodevelopmental condition).
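As a rough illustration of how multivariate base rates of this kind can be tabulated, the sketch below counts, within each stratum, the percentage of athletes who failed a given number of validity indicators. It assumes that each cell is the percentage failing exactly that number of indicators, with a top bucket of "four or more"; the text does not state whether the rates are exact or cumulative, so this is an assumption. The function name, the stratum labels, and the records are hypothetical.

```python
from collections import Counter, defaultdict

def multivariate_base_rates(records, top_bucket=4):
    """Percentage of each stratum failing 0, 1, ..., top_bucket-or-more indicators.

    records: iterable of (stratum, number_of_indicators_failed) pairs.
    """
    counts = defaultdict(Counter)
    for stratum, n_failed in records:
        counts[stratum][min(n_failed, top_bucket)] += 1  # collapse the tail into the top bucket
    rates = {}
    for stratum, counter in counts.items():
        total = sum(counter.values())
        rates[stratum] = {
            k: 100.0 * counter.get(k, 0) / total for k in range(top_bucket + 1)
        }
    return rates

# Hypothetical records: ((age band, sex, ND history), indicators failed). NOT study data.
records = [
    (("13-14", "F", "healthy"), 0),
    (("13-14", "F", "healthy"), 0),
    (("13-14", "F", "healthy"), 0),
    (("13-14", "F", "healthy"), 1),
    (("13-14", "F", "ADHD"), 0),
    (("13-14", "F", "ADHD"), 2),
]

for stratum, row in multivariate_base_rates(records).items():
    print(stratum, {k: round(v, 1) for k, v in row.items()})
```

Small strata produce unstable percentages under this kind of tabulation, which is the practical reason for pooling diagnostic groups before stratifying.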
RESULTS
Frequency of Invalid Neurodevelopmental Baselines
Table 2 presents the results of the chi-square tests. Post hoc analyses and the odds ratios for each significant comparison are presented in Table 3.
Abbreviations: ADHD = attention-deficit/hyperactivity disorder; LD = learning disorder; ADHD/LD = comorbid ADHD and LD; SpED = special education history with no other neurodevelopmental disorder reported.
* p < .001.
Note. Odds ratios are only presented for those post hoc comparisons that were significant. Significance levels were adjusted based on the Benjamini–Hochberg correction for multiple comparisons.
Abbreviations: ADHD = attention-deficit/hyperactivity disorder; LD = learning disorder; ADHD/LD = comorbid ADHD and LD; SpED = special education history with no other neurodevelopmental disorder reported.
*p < .05. **p < .01. ***p < .001.
Standard ImPACT indicators
The rate of baseline performance that was flagged as invalid differed significantly based on diagnosis history (χ² = 128.95, p < .001). Post hoc analyses revealed that when compared to athletes with no ND history, the baseline scores of athletes with a history of ADHD, LD, comorbid ADHD/LD, or SpED were significantly more likely to be flagged as invalid based on standard embedded ImPACT validity indicators (i.e., one or more cut score failures). As seen in Table 3, athletes with ADHD had more invalid baselines than did healthy athletes, but they also displayed significantly fewer invalid baselines than those with LD or ADHD/LD. No other groups differed significantly from each other.
Higgins et al. (Reference Higgins, Denney and Maerlender2017) indicators
The rate of baseline performance that was flagged as invalid differed significantly based on diagnosis history (χ² = 206.34, p < .001). Post hoc analyses revealed that, when compared to athletes without ND history, athletes with ADHD, LD, comorbid ADHD/LD, or SpED were significantly more likely to be flagged as invalid based on the failure of one or more of the Higgins et al. (Reference Higgins, Denney and Maerlender2017) cut score indicators. With the exception of the autism group, all of the ND groups had significantly higher rates of invalid baseline performance compared to controls, with the ADHD group having significantly more invalid baseline scores than healthy athletes, but significantly fewer invalid baselines than the LD, ADHD/LD, or SpED groups (see Table 3). The autism group had significantly fewer invalid baselines than the ADHD/LD group, but did not differ significantly from any other groups. No other groups differed significantly from each other.
Manderino et al. (Reference Manderino, Zachman and Gunstad2019) indicators
The rate of baseline performance that was flagged as invalid differed significantly based on diagnosis history (χ² = 128.95, p < .001). Post hoc analyses revealed that when compared to non-ND athletes, those with ADHD, LD, comorbid ADHD/LD, or SpED were significantly more likely to be flagged as invalid based on failure of one or more of the Manderino et al. (Reference Manderino, Zachman and Gunstad2019) cut score indicators. The athletes with ADHD had significantly more invalid baselines than did healthy athletes, but they also displayed significantly fewer invalid baselines than the LD, ADHD/LD, or SpED groups (see Table 3). No other groups differed significantly from each other.
Prediction of Invalid Baselines Using Demographic Variables
To assess the effect of demographic factors (i.e., age, sex, sport, and having a neurodevelopmental diagnosis) on the prediction of performance that is deemed invalid (using standard ImPACT indicators), a binomial logistic regression was conducted. The full model (using age, sex, sport, and neurodevelopmental history to predict invalid performance) was statistically significant (χ²(10) = 155.75, p < .001), but explained only 1.2% of the variance (Nagelkerke R²). All four predictor variables were significant (see Table 4). Compared to females, males had 1.14 times the odds of having performance flagged as invalid. Decreasing age was associated with increased odds of an invalid baseline. Collision sports had 1.25 times the odds of invalid performance relative to noncontact sports. Logistic regression indicated neurodevelopmental diagnoses were associated with decreased rates of invalid performance; however, this result is based on the number of invalid tests rather than the proportion of invalid tests within each diagnostic group (which a chi-square test would assess, as we have above). Specifically, each neurodevelopmental group’s sample size gets progressively smaller (as does the number of invalid cases), but the relative proportion of invalid to valid cases increases as described above in the chi-square results. Irrespective of this specific interpretation of the Wald test and the odds ratio of receiving invalid performance, neurodevelopmental diagnosis, overall, was still predictive in the binary logistic regression model (p < .001).
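To show how a significant model chi-square can coexist with a very small Nagelkerke R² in a sample of this size, the sketch below computes Nagelkerke R² from the model chi-square, the sample size, and the outcome base rate. It rests on two assumptions that the text does not confirm: that all 33,772 athletes entered the regression, and that the outcome base rate equals the 7.1% overall invalid rate reported for the sample. The function name is hypothetical.

```python
import math

def nagelkerke_r2(model_chi_square, n, base_rate):
    """Nagelkerke R^2 for a binary logistic model.

    model_chi_square: likelihood-ratio chi-square, 2 * (LL_model - LL_null).
    n: number of cases in the model (assumed here, not reported for the regression).
    base_rate: proportion of cases with the outcome (assumed here to be 7.1%).
    """
    # Log-likelihood of the intercept-only model for a binary outcome.
    ll_null = n * (
        base_rate * math.log(base_rate) + (1 - base_rate) * math.log(1 - base_rate)
    )
    cox_snell = 1 - math.exp(-model_chi_square / n)
    max_cox_snell = 1 - math.exp(2 * ll_null / n)  # upper bound of Cox-Snell R^2
    return cox_snell / max_cox_snell

r2 = nagelkerke_r2(model_chi_square=155.75, n=33772, base_rate=0.071)
print(f"Nagelkerke R^2 is approximately {r2:.4f}")
```

Under these assumptions the value comes out near 0.011, the same order as the 1.2% reported. The small difference may reflect rounding of the base rate or a slightly different analytic sample.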
Note. Gender is for males compared to females. For sport category, noncontact sports are the reference group. ND history = neurodevelopmental diagnosis. For ND history, healthy controls are the reference group.
Abbreviations: ADHD = attention-deficit/hyperactivity disorder; LD = learning disorder; ADHD/LD = comorbid ADHD and LD; SpED = special education history with no other neurodevelopmental disorder reported.
Multivariate Base Rates
Multivariate base rates are presented for the standard embedded ImPACT validity indicators. Currently, ImPACT automatically flags athletes who fall below or above the cut-off point for one or more indicators (Table 1 specifies these cut scores). In order to maximize sample size to increase the stability of multivariate base rates for the athletes with ND, we combined athletes across non-ADHD ND diagnoses and stratified rates for non-ADHD ND and healthy, non-ND athletes (see Table 5). Due to differences between ADHD and other ND groups, multivariate base rates for ADHD athletes are presented separately in Table 6. To determine stratification of age for multivariate base rates, chi-square tests (with post hoc analyses using the Benjamini–Hochberg correction) were conducted to assess significant differences between age groups (e.g., the proportion of failed cut scores in 13- vs. 14-year-olds, 14- vs. 15-year-olds, etc.). Results revealed that the rate of cut score failure differed between age groups (χ² = 213.08, p < .001). Post hoc analyses indicated that rates of cut score failure did not differ for the following age groups – 13- and 14-year-old athletes (p > .05), 15- and 16-year-old athletes (p > .05), and 17- and 18-year-old athletes (p > .05). The following age groups did differ significantly in terms of frequency of cut score failure – 14- and 15-year-old athletes (p < .001) and 16- and 17-year-old athletes (p < .05). A chi-square test (with Bonferroni correction for Type 1 error) also indicated that stratification for multivariate base rates was needed for male versus female athletes (χ² = 39.30, p < .001). Due to a low number of athletes (n = 34), the 19-year-old age group was excluded from these analyses. The 17- to 18-year-old females with non-ADHD ND history (n = 54) and ADHD-only history (n = 29) had small sample sizes. For this reason, the multivariate base rates for these groups are reported but should be interpreted with caution.
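One way to read the age-stratification decision described above is as a merging rule: adjacent ages are pooled into a single band unless the post hoc comparison between them is significant. The sketch below applies that rule to the pattern of adjacent-age results reported in the text. It is an illustrative reading rather than the authors' documented procedure, and the function name is hypothetical.

```python
def merge_adjacent_groups(levels, adjacent_differs):
    """Pool neighbouring levels into bands, starting a new band at each significant difference.

    levels: ordered group labels, e.g. ages.
    adjacent_differs: adjacent_differs[i] is True when levels[i] and levels[i + 1]
        differed significantly on post hoc testing.
    """
    bands = [[levels[0]]]
    for level, differs in zip(levels[1:], adjacent_differs):
        if differs:
            bands.append([level])   # significant difference: open a new band
        else:
            bands[-1].append(level)  # no difference: pool with the previous age
    return bands

ages = [13, 14, 15, 16, 17, 18]
# Adjacent comparisons as reported: 13 vs 14 n.s., 14 vs 15 sig., 15 vs 16 n.s.,
# 16 vs 17 sig., 17 vs 18 n.s.
differs = [False, True, False, True, False]

print(merge_adjacent_groups(ages, differs))  # [[13, 14], [15, 16], [17, 18]]
```

This rule looks only at neighbouring ages, so it would not detect a case in which two non-adjacent ages within the same band differed from each other.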
In some cases, for the neurodevelopmental groups, cells were empty or values did not conform to the expectation of decreasing frequency with an increasing number of validity indicators. In these cases, the data was smoothed and cell values were replaced with estimates based on the average decrease expected based on adjacent cells.
Note. Multivariate base rates are presented as a percentage of the subsample that displayed a particular number of invalid embedded validity indicators. Frequencies presented are based on the failure of standard ImPACT validity criteria. ND = neurodevelopmental history including LD, ADHD/LD, or special education history with no other neurodevelopmental disorder reported (ADHD is presented separately in Table 6). NV IF = number of validity indicators failed. In some cases, for the neurodevelopmental groups, cells were empty or values did not conform to the expectation of decreasing frequency with an increasing number of validity indicators. In these cases, the data was smoothed and cell values were replaced with estimates based on the average decrease expected based on adjacent cells.
a 0.3 replaced with 1.1.
b 0.0 replaced with 2.0.
c 0.0 replaced with 3.8.
Note. Multivariate base rates are presented as a percentage of the subsample that displayed a particular number of invalid embedded validity indicators. Frequencies presented are based on the failure of standard ImPACT validity criteria. NV IF = number of validity indicators failed. In some cases, for the neurodevelopmental groups, cells were empty or values did not conform to the expectation of decreasing frequency with an increasing number of validity indicators. In these cases, the data was smoothed and cell values were replaced with estimates based on the average decrease expected based on adjacent cells.
a 0.0 replaced with 6.9.
Table 5 presents the multivariate base rates for the failure of standard embedded ImPACT validity indicators stratified by age, gender (where appropriate), and non-ADHD ND history (separate multivariate base rates are presented for athletes with ADHD only in Table 6). For healthy athletes across all age stratifications and genders, the percentage of failed validity cut scores ranged from 0.1% (four or more cut scores failed) to 6.0% (one cut score failed). For athletes with LD, ADHD/LD, or SpED history across the groups, the percentage of failed validity cut scores ranged from 0.0% (four or more cut scores failed) to 12.4% (one cut score failed). For athletes with ADHD, the percentage of failed validity cut scores ranged from 0.0% (four or more cut scores failed) to 10.3% (one cut score failed). On average (across all NDs), having a neurodevelopmental diagnosis was associated with a 146% increase in the frequency of failed cut scores compared to healthy athletes.
DISCUSSION
Results of the current study demonstrate that athletes who have a history of ADHD, LD, comorbid ADHD/LD, or SpED are significantly more likely than healthy athletes to obtain ImPACT baseline scores that are flagged as invalid by the standard ImPACT validity indicators as well as two novel indicators proposed in the literature (Higgins et al., Reference Higgins, Denney and Maerlender2017; Manderino et al., Reference Manderino, Zachman and Gunstad2019). Athletes with ADHD have more baselines flagged as invalid compared to healthy controls, but fewer when compared to those with LD, ADHD/LD, or SpED history, suggesting that there may be factors associated with ADHD that resulted in somewhat spared performance across all of the validity indicators we examined. For example, individuals with ADHD commonly take medications to ameliorate their cognitive and behavioral symptoms, which may result in fewer interfering effects on ImPACT validity indicators when compared to the cognitive deficits (e.g., reading impacts) for the other neurodevelopmental groups. Prior studies report mixed findings regarding the effects of medication use on ImPACT performance (Cook et al., Reference Cook, Karr, Brooks, Garcia-Barrera, Holdnack and Iverson2019; Gardner et al., Reference Gardner, Yengo-Kahn, Bonfield and Solomon2017; Poysophon & Rao, Reference Poysophon and Rao2018). Although medication data was not available in this study, it is expected that most athletes who reported a history of ADHD were also taking medications to improve attention, which contrasts with the untreated cognitive deficits associated with LD. These deficits may have suppressed overall performance on the ImPACT, which may in turn have contributed to the increased frequency with which baseline performance was flagged as invalid in the ADHD/LD and LD groups.
It is also important to note that we did not know the type of learning disability experienced by the athletes in this study, although language-based LDs are the most common in the general population (American Psychiatric Association, 2013) and in children diagnosed with ADHD (Mayes & Calhoun, Reference Mayes and Calhoun2006; Parke et al., Reference Parke, Thaler, Etcoff and Allen2015). Future research comparing athletes with different types of confirmed LDs would provide more definitive evidence regarding the influence of specific deficits on rates of invalid baseline performance.
While the overall sample in this study had an invalid baseline rate of 7.1%, when assessed by diagnostic group, athletes with NDs were between 1.2 and 2.7 times as likely as healthy athletes to have their baseline performance flagged as invalid, with between 9.7% and 54.3% flagged. Comorbid ADHD/LD consistently demonstrated the highest rates of invalid baselines across the standard ImPACT, Higgins, and Manderino indicators. The large discrepancies found between the validity indices in this study are consistent with a prior investigation that found the Higgins indicators to identify substantially more baselines as invalid than the standard ImPACT indicators in naturalistic samples (Abeare et al., Reference Abeare, Messa, Zuccato, Merker and Erdodi2018). Because the standard ImPACT and Higgins indicators were created using healthy athletes, it may not be particularly surprising that athletes with NDs are more likely to have performance that is flagged as invalid, not necessarily because of performance validity issues, but possibly due to underlying attention and reading requirements of ImPACT (Lovell, Reference Lovell2018; Schatz et al., Reference Schatz, Scolaro Moser, Solomon, Ott and Karpf2012). Given that the Manderino indicators were created specifically for people with ADHD and those with academic difficulty, it is surprising, however, that athletes with these problems are still more likely than healthy athletes to have baseline performance flagged as invalid. Although these indicators (Higgins et al., Reference Higgins, Denney and Maerlender2017; Manderino et al., Reference Manderino, Zachman and Gunstad2019) have demonstrated higher positive predictive value than the standard ImPACT embedded validity indicators (Higgins et al., Reference Higgins, Denney and Maerlender2017; Manderino & Gunstad, Reference Manderino and Gunstad2018b), it is concerning that the baselines for athletes with ND are still disproportionately identified as invalid.
These results are consistent with recent literature suggesting high rates of invalid baselines among even athletes without ND in naturalistic and coached (i.e., when athletes are instructed to “fake bad” in laboratory settings) samples. Some of these studies report that between 28% and 83% of athletes demonstrate baseline performance that is flagged as invalid (Abeare et al., Reference Abeare, Messa, Zuccato, Merker and Erdodi2018; Gaudet & Weyandt, Reference Gaudet and Weyandt2017; Raab et al., Reference Raab, Peak and Knoderer2020). It has also been suggested that current standard embedded validity indicators may miss up to 20% or more of invalid baselines, which is much higher than is ideal for effective management of sport concussions and well-founded return-to-play decisions (Gaudet & Weyandt, Reference Gaudet and Weyandt2017; Raab et al., Reference Raab, Peak and Knoderer2020). These rates are concerning given the potential complicating implications for clinical practice (including the need to reassess athletes who have an initial invalid performance; Schatz et al., Reference Schatz, Kelley, Ott, Solomon, Elbin, Higgins and Scolaro Moser2014) because they diminish the utility of baseline comparison scores. The current study, in conjunction with previous literature (Manderino & Gunstad, Reference Manderino and Gunstad2018a; Manderino et al., Reference Manderino, Zachman and Gunstad2019), suggests that the current method of identifying invalid baselines may not be appropriate for athletes with ND history. It is also possible that rates of invalid performance may be inherently increased in ND populations; however, additional research is needed using coached samples to examine the root of these issues.
Indeed, while athletes with ND are more likely to have performance flagged as invalid, the factors that contribute to this increase are not well understood. Our examination of key demographic factors indicated that age, sex, sport, and ND were predictive of invalid performance in a binomial logistic regression model. Although these predictors were significant, in combination they accounted for a very small portion of the variance (1.2%) in invalid baseline performance, suggesting that there are other more substantive factors to consider that have not yet been identified. One of the most obvious factors that has not been systematically investigated is the influence of cognitive deficits associated with ADHD and ND on increased rates of invalid baseline performance. Other investigators have suggested that performance validity is a multifaceted phenomenon and that factors such as overall cognitive abilities, gender, fatigue, level of supervision during testing, group versus individual testing, return-to-play incentives, and sport season can all impact baseline performances (Alsalaheen et al., Reference Alsalaheen, Stockdale, Pechumer and Broglio2016). Even given this, it is still concerning that neurodevelopmental history was predictive of invalid baseline performance.
Currently, invalidity on ImPACT is based on the failure of one or more of the embedded validity indicators (for both the standard and novel criteria). In the broader literature on performance validity, there has been extensive discussion about the number of failures that should constitute invalid performance (for a comprehensive review, the interested reader is referred to Lippa, Reference Lippa2018). A number of authors have suggested that failure of one or more tests of performance validity is sufficient to deem a case invalid (Inman & Berry, Reference Inman and Berry2002; Iverson & Franzen, Reference Iverson and Franzen1996; Vickery et al., Reference Vickery, Berry, Dearth, Vagnini, Baser, Cragar and Orey2004), while other authors argue that one failure is not enough and that two or more failures are required for a determination of invalid responding (Binder et al., Reference Binder, Iverson and Brooks2009; Brooks et al., Reference Brooks, Iverson, Holdnack, Holdnack, Drozdick, Weiss and Iverson2013). Indeed, the positive predictive values and specificity associated with requiring two or more failures result in greater confidence in the designation of a profile as invalid (Chafetz, Reference Chafetz2011; Larrabee, Reference Larrabee2014; Lippa, Reference Lippa2018). Because of normal intraindividual variability in test performance, the more tests that are administered, the more likely it is that an individual will obtain at least one low score by chance alone, so relying on a single failed cut score would likely produce false-positive errors (Binder et al., Reference Binder, Iverson and Brooks2009; Brooks et al., Reference Brooks, Iverson, Holdnack, Holdnack, Drozdick, Weiss and Iverson2013).
In this way, a criterion of two or more performance validity test failures (rather than a single failure alone) certainly makes more sense for identifying an invalid profile (Binder et al., Reference Binder, Iverson and Brooks2009; Brooks et al., Reference Brooks, Iverson, Holdnack, Holdnack, Drozdick, Weiss and Iverson2013).
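The arithmetic behind this concern can be illustrated with a minimal sketch. It assumes, purely for illustration, that every validity indicator has the same false-positive rate and that the indicators are statistically independent; neither assumption is claimed here for the ImPACT indicators. The function name, the 5% rate, and the count of five indicators are hypothetical values, not results from the current study.

```python
from math import comb


def prob_at_least(k_failures, n_indicators, fp_rate):
    """Chance that a genuinely valid performer fails at least
    k_failures of n_indicators, under a simple binomial model
    (equal false-positive rate, independent indicators: both
    are simplifying assumptions)."""
    return sum(
        comb(n_indicators, j) * fp_rate**j * (1 - fp_rate) ** (n_indicators - j)
        for j in range(k_failures, n_indicators + 1)
    )


if __name__ == "__main__":
    # Hypothetical illustration: 5 indicators, each with a 5% false-positive rate.
    for k in (1, 2):
        p = prob_at_least(k, n_indicators=5, fp_rate=0.05)
        print(f"P(at least {k} chance failure(s)) = {p:.3f}")
```

Under these illustrative assumptions, roughly 23% of valid performers would fail at least one indicator by chance, whereas only about 2% would fail two or more, which is the logic behind preferring a two-failure criterion.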
Results of the current study are also consistent with the recommendation that two or more failures may be necessary for invalid performance designation. To our knowledge, our use of multivariate base rates of invalid classification is the first reported in the literature for athletes with ND. Our results demonstrated that (for standard ImPACT indicators) between 4.9% and 5.8% of healthy controls had one score that fell outside the range of cut-off thresholds (see Table 5). When athletes with ADHD only were considered (see Table 6), these rates were between 5.2% and 10.3%, and when other non-ADHD ND groups (i.e., LD, ADHD/LD, or SpED history) were considered, these rates rose to between 5.6% and 12.4%. Multivariate base rates can be used in clinical settings to protect against overinterpretation of a single low score, which commonly occurs in unimpaired individuals within a battery of tests (Brooks et al., Reference Brooks, Iverson, Holdnack, Holdnack, Drozdick, Weiss and Iverson2013; Cook et al., Reference Cook, Karr, Brooks, Garcia-Barrera, Holdnack and Iverson2019; Holdnack, Reference Holdnack, Goldstein, Allen and DeLuca2019). As an example, if a 15-year-old male athlete who has comorbid ADHD/LD takes ImPACT and his profile is designated as invalid based on one of the standard indicators, a neuropsychologist could use Table 5 to see that 10.4% of athletes in this demographic group had one invalid indicator score. This suggests that a single failure is a somewhat common occurrence in this group and, based on the results of the current study, that the standard embedded validity indicators are biased towards identifying this group as invalid more frequently. In this case, it is not clear whether this profile is truly invalid. On the other hand, if this 15-year-old male with comorbid ADHD/LD obtained two invalid indicators (base rate: 4.0%), it is much more likely that his performance is invalid, because this level of failure occurs infrequently.
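The decision logic in this worked example can be sketched in a few lines. The two base rates are the ones quoted in the example above; the function name, the dictionary layout, and the 5% cut-off for calling a result "uncommon" are assumptions introduced only for illustration, not a validated clinical rule.

```python
# Base rates (percent of the demographic group) quoted in the worked
# example above, keyed by the number of failed validity indicators.
EXAMPLE_BASE_RATES = {1: 10.4, 2: 4.0}


def is_uncommon(n_failed, base_rates, cutoff_percent=5.0):
    """Return True when the base rate for n_failed indicator failures
    falls below cutoff_percent. The 5% default is a hypothetical
    threshold chosen for illustration."""
    if n_failed not in base_rates:
        raise KeyError(f"no base rate available for {n_failed} failure(s)")
    return base_rates[n_failed] < cutoff_percent


if __name__ == "__main__":
    for n in (1, 2):
        label = "uncommon" if is_uncommon(n, EXAMPLE_BASE_RATES) else "relatively common"
        print(f"{n} failed indicator(s): base rate {EXAMPLE_BASE_RATES[n]}% ({label})")
```

A single failure (10.4%) is treated as relatively common and therefore ambiguous, whereas two failures (4.0%) fall below the illustrative cut-off and would warrant greater confidence in an invalid designation.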
Using a criterion of two or more failures for ImPACT scores, in conjunction with the current multivariate base rates, particularly for those with neurodevelopmental conditions, allows for increased clinical confidence in invalid profile designation given that, in most cases, less than 5% of the sample had more than two validity indicators flagged. The use of this two-or-more criterion (vs. the current one-or-more failure designation) still needs to be tested in a rigorous experimental investigation so that sensitivity, specificity, and positive predictive values can be provided. However, at this point, it is clear that athletes with neurodevelopmental conditions are much more likely to be flagged as invalid by the current ImPACT validity indicators, which may reflect the underlying reading and/or attentional difficulties in this population rather than true performance validity concerns. Our multivariate base rate analyses utilized a combined ND sample in order to maximize sample size and increase the stability of the rates for athletes with ND. Given differences in invalid baselines within the ND groups, particularly the ADHD group compared to the other groups, future research with large samples would be useful in determining whether multivariate base rates (MVBRs) should be calculated separately for each diagnostic category.
Given that those with NDs make up a large minority of the population of athletes around the country who are tested with ImPACT, it is critical that measurement and interpretation of baseline performance accurately portray cognitive functioning and performance validity in this population. The current study sheds light on the difficulties of invalidity classification in this population; however, future research is still needed to elucidate the differences in cognitive performance in athletes with neurodevelopmental conditions and to update clinical recommendations for baseline interpretation in this population. As is the case with the provision of separate ImPACT normative data for athletes with LD and ADHD, separate cut scores may need to be developed for more accurate classification of performance validity in groups with these and other NDs. The current study, along with others (Elbin et al., Reference Elbin, Kontos, Kegel, Johnson, Burkhart and Schatz2013; Gardner et al., Reference Gardner, Yengo-Kahn, Bonfield and Solomon2017; Manderino et al., Reference Manderino, Zachman and Gunstad2019; Manderino & Gunstad, Reference Manderino and Gunstad2018a; Zuckerman et al., Reference Zuckerman, Lee, Odom, Solomon and Sills2013), serves to spark future endeavors that can continue to enhance measurement and outcomes for athletes with neurodevelopmental conditions worldwide.
ACKNOWLEDGMENTS
No funding was received for this research.
CONFLICT OF INTEREST
The authors have nothing to disclose.