
Multivariate Base Rates of Low Scores and Reliable Decline on ImPACT in Healthy Collegiate Athletes Using CARE Consortium Norms

Published online by Cambridge University Press:  05 July 2019

Zac M. Houck*
Affiliation:
Department of Clinical and Health Psychology, University of Florida, P.O. Box 100165, Gainesville, Florida 32610, USA
Breton M. Asken
Affiliation:
Department of Clinical and Health Psychology, University of Florida, P.O. Box 100165, Gainesville, Florida 32610, USA Department of Psychiatry and Human Behavior, Warren Alpert Medical School of Brown University, Providence, Rhode Island 02903, USA
Russell M. Bauer
Affiliation:
Department of Clinical and Health Psychology, University of Florida, P.O. Box 100165, Gainesville, Florida 32610, USA
Anthony P. Kontos
Affiliation:
Department of Orthopaedic Surgery, University of Pittsburgh, Pittsburgh, Pennsylvania 15213, USA
Michael A. McCrea
Affiliation:
Department of Neurology, Medical College of Wisconsin, Milwaukee, Wisconsin 53266, USA
Thomas W. McAllister
Affiliation:
Department of Psychiatry, Indiana University School of Medicine, Indianapolis, Indiana 46202, USA
Steven P. Broglio
Affiliation:
Department of Kinesiology, NeuroTrauma Research Laboratory, University of Michigan Injury Center, University of Michigan, Michigan 48109, USA
James R. Clugston
Affiliation:
Department of Community Health and Family Medicine, University of Florida, Gainesville, Florida 32610, USA
Care Consortium Investigators
Affiliation:
Department of Clinical and Health Psychology, University of Florida, P.O. Box 100165, Gainesville, Florida 32610, USA Department of Psychiatry and Human Behavior, Warren Alpert Medical School of Brown University, Providence, Rhode Island 02903, USA Department of Orthopaedic Surgery, University of Pittsburgh, Pittsburgh, Pennsylvania 15213, USA Department of Neurology, Medical College of Wisconsin, Milwaukee, Wisconsin 53266, USA Department of Psychiatry, Indiana University School of Medicine, Indianapolis, Indiana 46202, USA Department of Kinesiology, NeuroTrauma Research Laboratory, University of Michigan Injury Center, University of Michigan, Michigan 48109, USA Department of Community Health and Family Medicine, University of Florida, Gainesville, Florida 32610, USA
*Correspondence and reprint requests to: Zac M. Houck, Department of Clinical and Health Psychology, University of Florida, P.O. Box 100165, Gainesville, FL 32610, USA. E-mail: zhouck@phhp.ufl.edu

Abstract

Objectives: To describe multivariate base rates (MBRs) of low scores and reliable change (decline) scores on Immediate Post-Concussion Assessment and Cognitive Testing (ImPACT) in college athletes at baseline, as well as to assess MBR differences among demographic and medical history subpopulations. Methods: Data were reported on 15,909 participants (46.5% female) from the NCAA/DoD CARE Consortium. MBRs of ImPACT composite scores were derived using published CARE normative data and reliability metrics. MBRs of sex-corrected low scores were reported at <25th percentile (Low Average), <10th percentile (Borderline), and ≤2nd percentile (Impaired). MBRs of reliable decline scores were reported at the 75%, 90%, 95%, and 99% confidence intervals. We analyzed subgroups by sex, race, attention-deficit/hyperactivity disorder and/or learning disability (ADHD/LD), anxiety/depression, and concussion history using chi-square analyses. Results: Base rates of low scores and reliable decline scores on individual composites approximated the normative distribution. Athletes obtained ≥1 low score with frequencies of 63.4% (Low Average), 32.0% (Borderline), and 9.1% (Impaired). Athletes obtained ≥1 reliable decline score with frequencies of 66.8%, 32.2%, 18%, and 3.8%, respectively. Comparatively few athletes had low scores or reliable decline on ≥2 composite scores. Black/African American athletes and athletes with ADHD/LD had higher rates of low scores, while greater concussion history was associated with lower MBRs (p < .01). MBRs of reliable decline were not associated with demographic or medical factors. Conclusions: Clinical interpretation of low scores and reliable decline on ImPACT depends on the strictness of the low score cutoff, the reliable change criterion, and the number of scores exceeding these cutoffs. Race and ADHD influence the frequency of low scores at all cutoffs cross-sectionally.

Type
Regular Research
Copyright
Copyright © INS. Published by Cambridge University Press, 2019. 

INTRODUCTION

Sport concussion management routinely involves administering neurocognitive testing to groups of healthy athletes in a baseline setting and often repeating testing multiple times post concussion. Proper interpretation of both cross-sectional and longitudinal test–retest score variability requires a clear understanding of performance base rates. This is particularly important for neuropsychological test batteries that produce multiple composite scores, which bring issues related to multivariate base rates (MBRs) to the forefront (Binder et al., 2009; Brooks et al., 2009). MBR methodology was initially applied to intelligence tests. Findings indicated that the frequency of low percentile scores increased as a function of the number of subtests administered, such that one or more low scores is common in a healthy sample (Binder et al., 2009; Brooks et al., 2009). As it relates to baseline testing in athletes, the predominant concern is an inflated rate of false positive “impairment” associated with administration of more tests within a given battery.

Immediate Post-Concussion Assessment and Cognitive Testing (ImPACT) is a widely used computerized neurocognitive measure that produces four composite scores, which are converted to percentiles based on a normative distribution to allow for direct performance comparisons across domains. Importantly, the expected rate of high or low percentile scores is a direct function of the number of tests administered (i.e., MBRs) (Binder et al., 2009; Brooks et al., 2009). For example, Iverson and colleagues found, as expected, that approximately 15% of healthy subjects will score worse than 1 standard deviation below the mean (<16th percentile) on any single ImPACT composite score. However, 40% will obtain at least one <16th percentile score out of the four composite scores (Iverson & Schatz, 2015). Obtaining two or more scores <16th percentile occurs far less frequently (15.2%) and three or more scores <16th percentile is relatively rare (4.8%) (Iverson & Schatz, 2015). Put simply, the probability of low scores – Low Average (<25th percentile), Borderline (<10th percentile), and Impaired (≤2nd percentile) – in nonclinical, healthy populations increases as the number of tests administered increases. Understanding base rates of low scores equips clinicians with valuable data that guard against misattribution and false positive diagnoses. There is also a need for describing test–retest performance variability using similar MBR approaches.
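The dependence of low-score rates on battery size can be illustrated with a small simulation. The sketch below is illustrative only and is not part of the study's analysis. It assumes four standard-normal scores that share a common pairwise correlation r; the function name, the value of r, and the simulated sample size are all hypothetical choices, not ImPACT estimates.

```python
import random
from statistics import NormalDist


def simulate_mbr(cutoff_pct, n_scores=4, r=0.0, n_people=20000, seed=0):
    """Estimate the share of people with at least one score below a cutoff.

    cutoff_pct is a proportion (0.16 = 16th percentile). Scores are standard
    normal with an ASSUMED common pairwise correlation r (0 <= r <= 1), built
    from one shared factor plus independent noise.
    """
    rng = random.Random(seed)
    z_cut = NormalDist().inv_cdf(cutoff_pct)  # z-score at the cutoff percentile
    shared_w, unique_w = r ** 0.5, (1.0 - r) ** 0.5
    hits = 0
    for _ in range(n_people):
        shared = rng.gauss(0.0, 1.0)
        # each score = shared factor + unique noise, so corr(score_i, score_j) = r
        if any(shared_w * shared + unique_w * rng.gauss(0.0, 1.0) < z_cut
               for _ in range(n_scores)):
            hits += 1
    return hits / n_people


independent = simulate_mbr(0.16, r=0.0)  # close to 1 - 0.84**4, about 0.50
correlated = simulate_mbr(0.16, r=0.5)   # lower, but still well above 0.16
```

With uncorrelated scores the simulated rate approaches 1 − 0.84^4 (about 50%), while positively correlated scores yield a lower rate that still exceeds the single-score rate. Correlation among composites is therefore one plausible reason why an empirical figure such as the 40% cited above sits below the independence value.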

Reliable change indices (RCIs) inform the interpretation of score fluctuations in serial assessments, such as annual baseline testing or from baseline to post injury (Barr & McCrea, 2001; Broglio et al., 2018; Hinton-Bayre et al., 1999; Iverson, Lovell, & Collins, 2003). Statistically derived RCIs indicate whether performance changes (improvement or decline) exceed the estimated range of measurement error surrounding the test–retest difference scores. An RCI provides a test-specific indicator of meaningful performance change, but the meaning of that change depends on the number of tests administered (Iverson & Schatz, 2015). Iverson and colleagues previously reported that the rate of healthy athletes demonstrating reliable decline within individual composite scores aligned with the expected rate based on the RCI confidence interval (e.g., 2.5% should reliably decline by chance alone using a two-tailed 95% confidence interval). A slightly higher rate (6.5%) of athletes show reliable decline on at least one score out of the four ImPACT composites (Iverson & Schatz, 2015).

MBRs of low scores and reliable decline may additionally vary as a function of factors associated with normal test performance variability. Sex (Covassin et al., 2006; Mormile et al., 2018), African American race (Houck et al., 2018; Kontos et al., 2010), previous concussions (Covassin et al., 2010), neurodevelopmental disorders such as learning disability (LD) and attention-deficit/hyperactivity disorder (ADHD) (Elbin et al., 2013; Houck et al., 2018), as well as psychological distress such as anxiety and depression (Bailey et al., 2010; Weber et al., 2018) have all been associated with lower baseline neurocognitive performance. Understanding MBRs of low scores and reliable decline within these subgroups is important since “general population” base rate estimates may not generalize well to them.

Given this background, this study had two aims. First, we describe the MBRs of Low Average, Borderline, and Impaired scores on ImPACT among a large, diverse sample of collegiate athletes at baseline using recently updated ImPACT normative data from the Concussion Assessment, Research, and Education (CARE) Consortium (Katz et al., 2018). Second, we describe the frequency of reliably declined test scores in healthy athletes using CARE Consortium test–retest reliability data (Broglio et al., 2018). We further investigated possible demographic and medical history factors that may influence these base rates. These data may support improved clinical decision making that reduces the risk of over-interpreting individual low scores or single-domain score changes, thus increasing clinical certainty that “real” change has occurred.

METHODS

Data were obtained from the National Collegiate Athletic Association and Department of Defense CARE Consortium, which began prospectively collecting data in 2014. CARE currently consists of 26 colleges and universities as well as four US military service academies, where data on all consenting student-athletes and cadets are collected. The data for the current study consist only of student-athlete data from the 24 participating institutions that used ImPACT as their neurocognitive battery (N = 16,512).

Participating sites were expected to enroll at least 90% of their eligible student-athletes and/or cadets and to obtain baseline assessments on at least 75% of enrolled individuals (Broglio et al., 2018). The vast majority of consenting student-athletes at participating institutions completed preseason baseline assessments, which include, at a minimum, an assessment of concussion-associated symptoms, neurologic status, neurocognitive functioning, and postural stability. Data from across sites can be aggregated into a robust and diverse normative sample of college athletes. Baseline assessments were completed annually for each year the participant was eligible for the study until 2016, at which point baseline assessments were completed only in the first year of participant eligibility. Participating student-athletes also self-reported information on demographics, current and previous sport history, concussion history, family and personal information, and pre-existing medical history.

Detailed descriptions of the CARE Consortium methods (Broglio et al., 2017), as well as the normative (Katz et al., 2018) and test–retest samples (Broglio et al., 2018), are provided elsewhere. The sample size in the current study varies from the above-referenced CARE Consortium studies for two reasons: 1) data collection is ongoing and we used an updated dataset provided upon approval of our data request and 2) exclusion criteria differed in the current study. IRB approval was obtained from all research sites, including the lead study site, with additional approval obtained from the Human Research Protection Office of the Department of Defense.

Measures

Our primary dependent measures, base rates of low scores and reliable decline, were derived from ImPACT composite scores. ImPACT is a computerized neurocognitive testing tool composed of six subtests that provide the underlying data for the four composite scores: Verbal Memory, Visual Memory, Visual Motor Speed, and Reaction Time (Lovell, Collins, Podell, Powell, & Maroon, 2000). Participating institutions provided only valid ImPACT data.

Aim 1: Base Rates of Low Scores in Year 1

Sex-specific percentile scores for each composite score were taken from the CARE Consortium normative study (Katz et al., 2018). We present the frequency at which collegiate student-athletes obtain 0, ≥1, ≥2, ≥3, and 4 “low” composite scores using the following low score cutoffs: <25th percentile (“Low Average”), <10th percentile (“Borderline”), and ≤2nd percentile (“Impaired”) (Katz et al., 2018). Descriptive results of the MBRs of low scores are presented for the total sample (N = 15,909) as well as stratified by demographic and medical covariates: attention-deficit/hyperactivity disorder and/or learning disability (ADHD/LD), anxiety/depression, previous concussions, and race. Concussion history was grouped as follows: no history of concussion, one concussion, or two or more concussions. Race was defined as White/Caucasian, Black/African American (AA), or “Other.” The “Other” category consisted of Asian, Indian-Alaskan, and student-athletes of multiple races, which were combined due to relatively small numbers per individual group (Katz et al., 2018). Participants with missing covariate data were excluded.
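The tallying step described above can be sketched in a few lines. This is a minimal illustration rather than the authors' code: the function name, the dictionary of cutoff rules, and the four example athletes are hypothetical, and each row is assumed to hold the four sex-corrected composite percentiles for one athlete.

```python
# Low score rules as stated in the text: <25th, <10th, and <=2nd percentile.
IS_LOW = {
    "Low Average": lambda pct: pct < 25,
    "Borderline": lambda pct: pct < 10,
    "Impaired": lambda pct: pct <= 2,
}


def low_score_base_rates(percentile_rows, cutoff):
    """Return the proportion of athletes with 0, >=1, >=2, >=3, and 4 low scores.

    percentile_rows: one sequence of four composite percentiles per athlete
    (hypothetical input layout). cutoff: a key of IS_LOW.
    """
    is_low = IS_LOW[cutoff]
    n_low = [sum(is_low(pct) for pct in row) for row in percentile_rows]
    n = len(n_low)
    rates = {"0": sum(c == 0 for c in n_low) / n}
    for k in (1, 2, 3):
        rates[f">={k}"] = sum(c >= k for c in n_low) / n
    rates["4"] = sum(c == 4 for c in n_low) / n
    return rates


# Made-up percentiles for four athletes (Verbal, Visual, Motor Speed, Reaction Time).
example_rows = [(50, 30, 8, 60), (20, 24, 5, 1), (80, 90, 70, 65), (9, 40, 2, 55)]
example_rates = low_score_base_rates(example_rows, "Low Average")
```

Because the categories ≥1, ≥2, and ≥3 are cumulative, an athlete with four low scores is counted in every one of them, which is why the reported percentages do not sum to 100.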

Aim 2: Base Rates of Reliable Decline from Year 1 to Year 2

Similar methods were used for reporting the base rate of reliable decline in a test–retest setting (N = 3,576). Broglio and colleagues derived reliable decline scores by applying one-tailed nonparametric confidence intervals (75%, 90%, 95%, and 99%) based on the assumption of non-normal distributions to estimate the degree of certainty of change on each domain (Broglio et al., 2018): Verbal Memory (−5, −12, −17, −27), Visual Memory (−6, −14, −18, −28), Visual Motor Speed (−2.1, −4.8, −6.8, −11.2), and Reaction Time (.04, .08, .12, .22). For example, using the 75% confidence interval, a decline of 5 or more points in verbal memory is expected in 25% of the sample. We report the frequency of healthy collegiate athletes exhibiting reliable decline in 0, ≥1, ≥2, ≥3, and 4 ImPACT composite scores, using 75%, 90%, 95%, and 99% confidence intervals.
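Applying these cutoffs to a pair of test sessions amounts to a simple comparison per composite. The sketch below uses the cutoff values listed above; the function name, dictionary keys, and example scores are hypothetical. It assumes that change is computed as year 2 minus year 1, that the cutoffs are inclusive ("5 or more points"), and that the positive Reaction Time cutoffs mean a slower (larger) time counts as decline.

```python
# Cutoffs from the text, keyed by one-tailed confidence level (percent).
RELIABLE_DECLINE = {
    "verbal_memory": {75: -5, 90: -12, 95: -17, 99: -27},
    "visual_memory": {75: -6, 90: -14, 95: -18, 99: -28},
    "visual_motor_speed": {75: -2.1, 90: -4.8, 95: -6.8, 99: -11.2},
    "reaction_time": {75: 0.04, 90: 0.08, 95: 0.12, 99: 0.22},
}
# ASSUMPTION: for Reaction Time an increase is the direction of decline.
HIGHER_IS_WORSE = {"reaction_time"}


def count_reliable_declines(year1, year2, ci):
    """Count composites whose year 2 minus year 1 change meets the decline cutoff."""
    n_declined = 0
    for name, cutoffs in RELIABLE_DECLINE.items():
        change = year2[name] - year1[name]
        if name in HIGHER_IS_WORSE:
            declined = change >= cutoffs[ci]
        else:
            declined = change <= cutoffs[ci]
        n_declined += int(declined)
    return n_declined


# Made-up scores for one athlete at two annual baselines.
y1 = {"verbal_memory": 90, "visual_memory": 80, "visual_motor_speed": 40.0, "reaction_time": 0.58}
y2 = {"verbal_memory": 84, "visual_memory": 75, "visual_motor_speed": 39.0, "reaction_time": 0.71}
declines_by_ci = {ci: count_reliable_declines(y1, y2, ci) for ci in (75, 90, 95, 99)}
```

In this made-up profile the number of reliably declined composites shrinks as the confidence level becomes stricter, which mirrors how the choice of interval drives the base rates reported in this study.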

Statistical Methods

We calculated the percentage with 0, ≥1, ≥2, ≥3, and 4 low scores for sex-specific percentile scores: <25th percentile (“Low Average”), <10th percentile (“Borderline”), and ≤2nd percentile (“Impaired”). Chi-square analyses were used to assess low score frequency differences across subpopulations. A p-value <.01 was used to determine statistical significance across analyses in an effort to reduce family-wise Type I error. Cramer’s V values of <.1, .1, .3, and .5 were considered very small, small, medium, and large effect sizes, respectively (Fritz et al., 2012). We also used descriptive statistics to show the frequency of scores that reliably decline in 0, ≥1, ≥2, ≥3, and 4 domains at the 75%, 90%, 95%, and 99% confidence intervals.
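For readers who want to check an effect size by hand, the following sketch computes a Pearson chi-square statistic and Cramer's V in plain Python and applies the effect size labels given above. It is an illustration under stated assumptions, not the software used in the study: the function names are hypothetical, and the 2 × 5 layout (two groups by 0 to 4 low scores) is inferred from the reported 4 degrees of freedom.

```python
from math import sqrt


def chi_square(table):
    """Pearson chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


def cramers_v(chi2, n, n_rows, n_cols):
    """Cramer's V = sqrt(chi2 / (n * (min(rows, cols) - 1)))."""
    return sqrt(chi2 / (n * (min(n_rows, n_cols) - 1)))


def effect_size_label(v):
    """Labels follow the thresholds stated in the text (<.1, .1, .3, .5)."""
    if v < 0.1:
        return "very small"
    if v < 0.3:
        return "small"
    if v < 0.5:
        return "medium"
    return "large"


# Sex comparison as reported: chi-square(4) = 18.775; n assumed to be the full
# Aim 1 sample (15,909) arranged as a 2 x 5 table.
v_sex = cramers_v(18.775, 15909, 2, 5)
```

Under those assumptions the computed value rounds to .034, which agrees with the effect size reported for the sex comparison in the Results.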

RESULTS

The initial dataset contained complete ImPACT baseline assessments from 16,512 student-athletes. See Figure 1 for a description of the case exclusion process. ImPACT performance did not differ between excluded and included participants. Our first aim, evaluating base rates of low scores, included 15,909 baseline assessments across 24 sports at 24 institutions (96.4% of the initial sample): 8,546 male (53.5%) and 7,393 female (46.5%). The most common sports were football (19.2%), cross country/track and field (12.3%), and soccer (11.2%). Athletes with multiple baseline assessments had only their first assessment included for the first aim. Table 1 shows the sample distribution of covariates.

Fig. 1. Process of case removal. ADHD, attention-deficit/hyperactivity disorder; Hx, history; LD, learning disability.

Table 1. Sample descriptive statistics for covariates and outcome variables

AA, African American; ADHD, attention-deficit/hyperactivity disorder; ImPACT, Immediate Post-Concussion Assessment and Cognitive Testing; IQR, interquartile range; LD, learning disability.

Our second aim evaluating base rates of reliable decline across two test sessions included athletes with complete ImPACT baseline assessments in consecutive years (N = 3,576; 51.2% male; median days between testing = 361; IQR, 312.5–371). This sample contained data from 14 institutions and 22 sports, of which the most common were football (16.3%), cross country/track and field (15.4%), and soccer (9.8%).

A priori, approximately 25% of athletes are expected to score in the Low Average range, ~10% are expected to score in the Borderline range, and ~2% are expected to score in the Impaired range for each individual composite score. As seen in Table 2, the percentage of athletes with Low Average, Borderline, and Impaired scores within individual composite scores approximated this expected distribution.

Table 2. Prevalence of low scores on individual composite scores in healthy, uninjured college athletes

AA, African American; ADHD, attention-deficit/hyperactivity disorder; Hx, history; LD, learning disability; %ile, percentile.

Aim 1: Overall Base Rates of Low Scores in Year 1

Base rates below each cutoff score for the total sample and subgroups are presented in Table 3. Overall, athletes obtained ≥1 low composite score with frequencies of 63.4% (Low Average cutoff), 32.0% (Borderline cutoff), and 9.1% (Impaired cutoff). Obtaining ≥2 low scores occurred with frequencies of 34.6% (Low Average), 11.1% (Borderline), and 1.7% (Impaired). Obtaining ≥3 low scores occurred with frequencies of 15.0% (Low Average), 3.3% (Borderline), and .3% (Impaired). Obtaining 4 out of 4 low scores was rare, occurring with frequencies of 4.8% (Low Average), .7% (Borderline), and .0% (Impaired).
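One way to put these overall figures to practical use is as a small lookup of how common a given score profile is among healthy athletes. The sketch below simply encodes the total-sample percentages stated in the paragraph above; the dictionary layout and function name are hypothetical, and subgroup-specific rates from Table 3 are not included.

```python
# Total-sample base rates (percent) of obtaining at least k low scores,
# copied from the text; k = 4 means all four composites were low.
OVERALL_BASE_RATES = {
    "Low Average": {1: 63.4, 2: 34.6, 3: 15.0, 4: 4.8},
    "Borderline": {1: 32.0, 2: 11.1, 3: 3.3, 4: 0.7},
    "Impaired": {1: 9.1, 2: 1.7, 3: 0.3, 4: 0.0},
}


def base_rate(cutoff, n_low):
    """Percent of healthy athletes with at least n_low scores below the cutoff."""
    if n_low <= 0:
        return 100.0  # every profile has at least zero low scores
    return OVERALL_BASE_RATES[cutoff][min(n_low, 4)]


# Three or more Low Average scores are more common than a single Impaired score.
more_common = base_rate("Low Average", 3) > base_rate("Impaired", 1)
```

A lookup of this kind makes it easy to compare profiles defined by different cutoffs and counts on the same scale, for example three Low Average scores against one Impaired score.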

Table 3. Prevalence of low scores across subpopulations of healthy, uninjured college athletes

ADHD, attention-deficit/hyperactivity disorder; Hx, history; LD, learning disability.

a Very small effect size (Cramer’s V < .1).

b Small effect size (Cramer’s V = .1–.3).

Aim 1: Base Rates of Low Scores in Demographic and Medical History Subgroups in Year 1

Base rates of low scores varied by subpopulation (Table 3). Males had a slightly higher proportion of Low Average scores relative to females (χ²(4) = 18.775, Cramer’s V = .034, p = .001). This pattern also held across individual composites. The largest effect pertained to race, in which Black/AA athletes had a higher proportion of low scores across all low score cutoffs relative to White/Caucasian athletes (Low Average – χ²(4) = 614.605, Cramer’s V = .211, p < .001; Borderline – χ²(4) = 501.417, Cramer’s V = .191, p < .001; Impaired – χ²(4) = 276.347, Cramer’s V = .142, p < .001) and athletes of other races (Low Average – χ²(4) = 164.297, Cramer’s V = .187, p < .001; Borderline – χ²(4) = 116.112, Cramer’s V = .157, p < .001; Impaired – χ²(4) = 56.098, Cramer’s V = .109, p < .001). White/Caucasian athletes had fewer low scores across cutoffs compared to athletes of other races (Low Average – χ²(4) = 58.103, Cramer’s V = .066, p < .001; Borderline – χ²(4) = 44.009, Cramer’s V = .057, p < .001; Impaired – χ²(4) = 30.882, Cramer’s V = .048, p < .001). Black/AA athletes exhibited higher rates of low speed scores compared to memory scores across cutoffs. Athletes with a history of ADHD/LD obtained a higher proportion of low scores across cutoffs than did athletes without a history of ADHD/LD (Low Average – χ²(4) = 143.112, Cramer’s V = .095, p < .001; Borderline – χ²(4) = 110.741, Cramer’s V = .083, p < .001; Impaired – χ²(4) = 60.455, Cramer’s V = .062, p < .001). Athletes with ADHD/LD exhibited a higher proportion of low scores across individual composite scores than did athletes without ADHD/LD. Lastly, the proportion of low scores across cutoffs unexpectedly decreased as the number of previous concussions increased. Athletes with a history of two or more concussions obtained a significantly lower proportion of Low Average (χ²(8) = 143.112, Cramer’s V = .095, p < .001) and Borderline scores (χ²(8) = 108.686, Cramer’s V = .083, p < .001) relative to athletes with fewer previous concussions. A higher proportion of White/Caucasian athletes had a history of two or more concussions (7.0%) relative to Black/AA athletes (3.6%) and athletes of combined races (5.9%). A higher proportion of athletes with a history of ADHD/LD reported a history of two or more concussions (9.3%) relative to athletes without a history of ADHD/LD (6.0%).

Aim 2: Overall Base Rates of Reliable Decline from Year 1 to Year 2

Data on the amount of score change (year 2 scores – year 1 scores) required for statistically reliable decline at one-tailed 75%, 90%, 95%, and 99% confidence intervals have been published previously using CARE Consortium data. Data in Table 4 demonstrate that the current sample showed a frequency of reliable decline in each individual composite score that generally agrees with expected frequencies.

Table 4. Frequency of reliable decline on individual domains in healthy, uninjured college athletes

Table 5 shows the base rates at which healthy athletes exhibit reliable declines across the four composite scores at each confidence interval. The rate of profiles with ≥1 reliably declined score was 66.8%, 32.2%, 18%, and 3.8% at the 75%, 90%, 95%, and 99% confidence intervals, respectively. Profiles with ≥2 reliably declined scores were rare at the 90% (6.6%), 95% (2.9%), and 99% (.3%) confidence intervals. Profiles with ≥3 or 4 reliably declined scores were exceedingly rare or non-existent across all confidence intervals. Interestingly, the proportion of athletes with scores that reliably declined did not vary as a function of medical or demographic factors.

Table 5. Prevalence of reliable decline in healthy, uninjured college athletes tested at a 1-year interval

ADHD, attention-deficit/hyperactivity disorder; CI, confidence interval; Hx, history; LD, learning disability.

DISCUSSION

The purpose of this study was to describe base rates of low scores and reliable decline in healthy college athletes using updated ImPACT normative ranges and reliability data from the CARE Consortium. Rarely will neuropsychologists base clinical decisions on a single score or pair of retest scores. Instead, clinicians use data from across a test battery, which more likely includes sporadic low scores. Awareness of low score frequencies for a given battery provides essential context for interpreting scores as evidence for cognitive decline versus normal performance variability (i.e., it reduces Type I error risk). The current study demonstrates that, when considering all scores simultaneously, a single low score or reliable decline on at least one composite score is quite common among healthy test-takers. The frequency of one or more Impaired scores was considerably lower (9.1%) than that of Borderline (32.0%) and Low Average scores (63.4%). Comparatively few athletes exhibit low scores and reliable decline on 2, 3, and 4 composites.

Awareness of low score base rates, independent of clinical or demographic status, prevents over-interpretation of an isolated low but valid score. The percentage of athletes with a low score on an individual ImPACT composite score approximated the normative distribution for each percentile cutoff. However, rates increase dramatically when examining low score frequencies on ≥1 out of the four possible scores. Previous work by Iverson and colleagues demonstrated that approximately 56%, 27%, and 7% of adolescent athletes had one or more ImPACT scores in the Low Average, Borderline, and Impaired ranges, respectively (Iverson & Schatz, 2015). The current study, using updated ImPACT percentile ranges provided by CARE, demonstrated a similar trend with slightly higher rates of ≥1 Low Average (63.4%), Borderline (32%), and Impaired scores (9.1%). Low scores on ≥2 domains were less common, and low scores on ≥3 domains were rare across cutoffs.

We also examined low score base rates in demographic and medical history subgroups. Black/AA athletes had a higher proportion of ≥1 low score across all cutoffs relative to White/Caucasian athletes and athletes of other races. Athletes with a history of ADHD/LD had a higher proportion of ≥1 low score across all cutoffs compared to those without a history of ADHD/LD. These results are consistent with previous work showing Black/AA collegiate athletes and those with ADHD/LD score lower on ImPACT (Elbin et al., 2013; Houck et al., 2018). Together these findings suggest that race-specific normative references are necessary if the goal is to reduce the risk of over-interpreting isolated low scores. This would closely align with traditional neuropsychological assessments that incorporate multiple demographic-specific adjustments when indicated (Heaton et al., 2004). Specific normative references for clinical populations, like ADHD/LD athletes, may similarly improve characterization of the “expected” distribution of baseline scores.

An unexpected finding was that athletes with a history of ≥2 concussions had a lower proportion of ≥1 low score across all cutoffs compared to athletes with a history of 1 or 0 concussion(s). This inverse relationship between number of concussions and proportion of low scores should not be interpreted as suggesting that greater concussion history improves cognitive functioning. Instead, it may be that athletes with more previous concussions have more experience with computerized testing like ImPACT, thus potentially introducing a spurious practice effect. Further, a higher proportion of White/Caucasian athletes reported a history of two or more concussions, so the generally higher scores obtained by White/Caucasian athletes likely also contributed to this unexpected finding.

These findings highlight the importance of considering MBRs along with an athlete’s demographic and medical history when interpreting cross-sectional test scores (either in a baseline or post-concussion setting). Baseline neurocognitive testing is not considered necessary for appropriate sport concussion management (McCrory et al., 2017). Without baselines, clinicians can utilize standard deficit-measurement logic by comparing post-concussion ImPACT scores to normative references. This applies whether using computerized testing to inform diagnosis or clinical recovery during return-to-sport protocols. There are no consensually agreed-upon rules dictating which low score cutoff or number of low scores should guide clinical decision making, and clinicians might employ several patient-specific considerations. As further discussed below, erring towards minimizing false positives (i.e., emphasizing specificity) shifts the definition of “low scores” towards those ≤2nd %ile, whereas erring towards minimizing false negatives (i.e., emphasizing sensitivity) shifts it towards scores <25th %ile. Knowing, for example, that an individual achieving ≥3 scores <25th %ile occurs more frequently (~15% overall) than an individual achieving ≥1 score ≤2nd %ile (~9% overall) provides empirically derived support for clinicians implementing patient-specific decisions without a baseline comparison. When a baseline comparison is available, MBR methodology should be considered within a test–retest context.

The current study demonstrated that reliable decline commonly occurs by chance alone on ≥1 composite score, particularly when using liberal confidence intervals (e.g., 75% vs. 95% or 99%). However, having ≥2 scores that reliably decline is less likely, especially at 90% or greater confidence intervals. Iverson and colleagues reported that 6.5% of athletes tested at a one-year interval had ≥1 score that reliably declined at the 95% confidence interval (Iverson & Schatz, 2015) compared to 18.0% of athletes in the current study. This may be explained by methodological differences in deriving reliable change cutoffs. Iverson and colleagues based their reliable change scores on a theoretically normal two-tailed distribution in which 2.5% of athletes would be expected to have scores that reliably decline and 2.5% of athletes to have scores that reliably improve within the 95% confidence interval (Iverson & Schatz, 2015). Reliable decline scores in the current study were based on a one-tailed distribution in which 5% of athletes would be expected to have scores that reliably decline using a 95% confidence interval (Broglio et al., 2018). Widening the percentage of athletes that were included at the lower end of the distribution likely contributed to the higher percentage of athletes with at least one score that reliably declined.

Based on the one-tailed confidence intervals, between 1% (99% confidence interval) and 25% (75% confidence interval) of healthy athletes are expected to show a statistically reliable decline on a single composite for no identifiable clinical reason. This directly informs false positive rates, or the percentage of individuals demonstrating reliable decline unrelated to concussion effects. For example, applying the 90% confidence interval for each individual composite score will result in a 10% false positive rate for each score. Considering multiple test–retest comparisons at the 90% confidence interval (i.e., the 10% false positive rate across all four scores), the presumed false positive rate ballooned to 32.2% in our study. Reliance on either one low score relative to a normative reference or one isolated reliable decline from baseline increases risk of an incorrect clinical inference of concussion-induced change.
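The inflation described above can be checked against a simple benchmark. The sketch below computes the chance of at least k reliable declines out of four composites if each composite declined independently with the per-score false positive rate. Independence is an assumption made here purely for illustration, and the function name is hypothetical; the observed percentages are the ≥1 decline rates reported in this study.

```python
from math import comb


def p_at_least(k, n_tests, alpha):
    """Binomial probability of k or more false positives among n_tests
    independent comparisons, each with false positive rate alpha."""
    return sum(
        comb(n_tests, j) * alpha ** j * (1 - alpha) ** (n_tests - j)
        for j in range(k, n_tests + 1)
    )


# Per-score false positive rate (1 - confidence level) mapped to the observed
# percentage of athletes with >=1 reliable decline, as reported in the text.
OBSERVED_AT_LEAST_ONE = {0.25: 66.8, 0.10: 32.2, 0.05: 18.0, 0.01: 3.8}

# Expected percentages under independence: about 68.4, 34.4, 18.5, and 3.9.
expected_at_least_one = {
    alpha: round(100 * p_at_least(1, 4, alpha), 1)
    for alpha in OBSERVED_AT_LEAST_ONE
}
```

The observed rates fall slightly below these independence values at every confidence level, so treating the four comparisons as independent gives a reasonable, mildly conservative approximation of the multivariate false positive rate in this sample.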

Our data add to the ongoing debate about baseline testing versus normative comparisons. Interestingly, while race and ADHD/LD history increased rates of low scores based on sex-specific normative percentile cutoffs, these factors did not influence rates of reliable decline in a test–retest setting. This seemingly supports baseline testing in that longitudinal performance variability is consistent across subgroups, while inflated rates of low scores in certain populations likely reflect reliance on imperfect normative references. If, for example, the normative reference for Black/AA athletes in this study had been other Black/AA athletes, rather than the combined sample including all races, then MBRs of low scores may have been more comparable to those of other race groups. Similarly, the use of an ADHD/LD-specific normative reference group would very likely reduce the rates of low scores in this population. We therefore propose that normative references are most appropriate, and likely equivalent to a baseline, when they are specific to all relevant demographic and medical history factors. Accumulating data indicate that, in addition to sex, both race/ethnicity- and ADHD/LD-specific normative references are necessary.

There are several limitations to consider when interpreting these findings. The reliable decline scores and normative ranges used in the current study were derived from the CARE Consortium. Findings are likely generalizable primarily to Division 1 college athletes, and these base rates may not be appropriate for use in other populations, such as adolescent athletes or non-athlete military cadets. Reliable decline methodology does not account for regression to the mean, so it can be less accurate when applied to athletes who scored unusually low at the initial baseline evaluation. In addition, groups that have higher rates of low scores at the initial baseline (Black/AA or ADHD/LD) may be less likely to demonstrate reliable decline due to floor effects. Other general limitations include the use of self-report data for covariates such as race, ADHD/LD, anxiety/depression, and concussion history. The potential bias of missing data, while presumably minimal in the present study, may also affect the magnitude or directionality of certain results.

CONCLUSIONS

The MBRs of low scores and reliable decline of a single ImPACT composite score are in accordance with expectations based on the normative distribution. However, the base rates of low scores and reliable decline increase when considering all scores simultaneously. Black/AA race and history of ADHD/LD were associated with higher rates of low scores across cutoffs. The use of these advanced psychometric methods can reduce false positives and strengthen clinical judgements.

SUPPLEMENTARY MATERIALS

To view supplementary material for this article, please visit https://doi.org/10.1017/S1355617719000729

ACKNOWLEDGMENTS

This publication was made possible, in part, with support from the Grand Alliance Concussion Assessment, Research, and Education (CARE) Consortium, funded, in part, by the National Collegiate Athletic Association (NCAA) and the Department of Defense (DOD). The U.S. Army Medical Research Acquisition Activity, 820 Chandler Street, Fort Detrick MD 21702-5014 is the awarding and administering acquisition office. This work was supported by the Office of the Assistant Secretary of Defense for Health Affairs through the Psychological Health and Traumatic Brain Injury Program under Award No. W81XWH-14-2-0151. Opinions, interpretations, conclusions and recommendations are those of the author(s) and are not necessarily endorsed by the Department of Defense (DHP funds).

CONFLICTS OF INTEREST

The authors have nothing to disclose.

REFERENCES

Bailey, C.M., Samples, H.L., Broshek, D.K., Freeman, J.R., & Barth, J.T. (2010). The relationship between psychological distress and baseline sports-related concussion testing. Clinical Journal of Sport Medicine, 20(4), 272–277. doi: 10.1097/JSM.0b013e3181e8f8d8
Barr, W.B., & McCrea, M. (2001). Sensitivity and specificity of standardized neurocognitive testing immediately following sports concussion. Journal of the International Neuropsychological Society, 7(6), 693–702.
Binder, L.M., Iverson, G.L., & Brooks, B.L. (2009). To err is human: “Abnormal” neuropsychological scores and variability are common in healthy adults. Archives of Clinical Neuropsychology, 24(1), 31–46.
Broglio, S.P., Katz, B.P., Zhao, S., McCrea, M., McAllister, T., & CARE Consortium Investigators. (2018). Test–Retest Reliability and Interpretation of Common Concussion Assessment Tools: Findings from the NCAA-DoD CARE Consortium. Sports Medicine, 48(5), 1255–1268.
Broglio, S.P., McCrea, M., McAllister, T., Harezlak, J., Katz, B., Hack, D., Hainline, B., & CARE Consortium Investigators. (2017). A National Study on the Effects of Concussion in Collegiate Athletes and US Military Service Academy Members: The NCAA–DoD Concussion Assessment, Research and Education (CARE) Consortium Structure and Methods. Sports Medicine, 47(7), 1437–1451.
Brooks, B.L., Strauss, E., Sherman, E., Iverson, G.L., & Slick, D.J. (2009). Developments in neuropsychological assessment: Refining psychometric and clinical interpretive methods. Canadian Psychology/Psychologie Canadienne, 50(3), 196.
Covassin, T., Elbin, R., Kontos, A., & Larson, E. (2010). Investigating baseline neurocognitive performance between male and female athletes with a history of multiple concussion. Journal of Neurology, Neurosurgery and Psychiatry, 81(6), 597–601. doi: 10.1136/jnnp.2009.193797
Covassin, T., Swanik, C.B., Sachs, M., Kendrick, Z., Schatz, P., Zillmer, E., & Kaminaris, C. (2006). Sex differences in baseline neuropsychological function and concussion symptoms of collegiate athletes. British Journal of Sports Medicine, 40(11), 923–927; discussion 927. doi: 10.1136/bjsm.2006.029496
Elbin, R.J., Kontos, A.P., Kegel, N., Johnson, E., Burkhart, S., & Schatz, P. (2013). Individual and combined effects of LD and ADHD on computerized neurocognitive concussion test performance: evidence for separate norms. Archives of Clinical Neuropsychology, 28(5), 476–484. doi: 10.1093/arclin/act024
Fritz, C.O., Morris, P.E., & Richler, J.J. (2012). Effect size estimates: Current use, calculations, and interpretation. Journal of Experimental Psychology: General, 141(1), 2.
Heaton, R., Miller, S.W., Taylor, M.J., & Grant, I. (2004). Revised comprehensive norms for an expanded Halstead–Reitan Battery: Demographically adjusted neuropsychological norms for African American and Caucasian adults. Lutz, FL: Psychological Assessment Resources.
Hinton-Bayre, A.D., Geffen, G.M., Geffen, L.B., McFarland, K.A., & Frijs, P. (1999). Concussion in contact sports: Reliable change indices of impairment and recovery. Journal of Clinical and Experimental Neuropsychology, 21(1), 70–86.
Houck, Z., Asken, B., Clugston, J., Perlstein, W., & Bauer, R. (2018). Socioeconomic status and race outperform concussion history and sport participation in predicting collegiate athlete baseline neurocognitive scores. Journal of the International Neuropsychological Society, 24(1), 1–10.
Iverson, G.L., Lovell, M.R., & Collins, M.W. (2003). Interpreting change on ImPACT following sport concussion. The Clinical Neuropsychologist, 17(4), 460–467.
Iverson, G.L., & Schatz, P. (2015). Advanced topics in neuropsychological assessment following sport-related concussion. Brain Injury, 29(2), 263–275.
Katz, B.P., Kudela, M., Harezlak, J., McCrea, M., McAllister, T., Broglio, S.P., & CARE Consortium Investigators. (2018). Baseline performance of NCAA athletes on a concussion assessment battery: A report from the CARE consortium. Sports Medicine, 48(8), 1–15.
Kontos, A.P., Elbin, R.J., Covassin, T., & Larson, E. (2010). Exploring differences in computerized neurocognitive concussion testing between African American and White athletes. Archives of Clinical Neuropsychology, 25(8), acq068.
Lovell, M., Collins, M., Podell, K., Powell, J., & Maroon, J. (2000). ImPACT: Immediate post-concussion assessment and cognitive testing. Pittsburgh, PA: NeuroHealth Systems, LLC.
McCrory, P., Meeuwisse, W., Dvorak, J., Aubry, M., Bailes, J., Broglio, S., Cantu, R.C., Cassidy, D., Echemendia, R.J., Castellani, R.J., & Davis, G.A. (2017). Consensus statement on concussion in sport—the 5th international conference on concussion in sport held in Berlin, October 2016. British Journal of Sports Medicine, 51(11), 838–847.
Mormile, M.E.E., Langdon, J.L., & Hunt, T.N. (2018). The role of gender in neuropsychological assessment in healthy adolescents. Journal of Sport Rehabilitation, 27(1), 16–21.
Weber, M.L., Dean, J.-H.L., Hoffman, N.L., Broglio, S.P., McCrea, M., McAllister, T.W., Schmidt, J.D., CARE Consortium Investigators, Hoy, A.R., Hazzard, J.B., & Kelly, L.A. (2018). Influences of mental illness, current psychological state, and concussion history on baseline concussion assessment performance. The American Journal of Sports Medicine, 46(7), 1742–1751.

Fig. 1. Process of case removal. ADHD, attention-deficit/hyperactivity disorder; Hx, history; LD, learning disability.

Table 1. Sample descriptive statistics for covariates and outcome variables

Table 2. Prevalence of low scores on individual composite scores in healthy, uninjured college athletes

Table 3. Prevalence of low scores across subpopulations of healthy, uninjured college athletes

Table 4. Frequency of reliable decline on individual domains in healthy, uninjured college athletes

Table 5. Prevalence of reliable decline in healthy, uninjured college athletes tested at a 1-year interval
