Inclusion of performance validity tests (PVTs) as part of neuropsychological evaluations is now the standard of care in clinical neuropsychology (Bush et al., 2005; Heilbronner et al., 2009; Larrabee, 2012). These measures supplement standard cognitive tests by providing an indication of the extent to which the respondent gave a full effort to obtain valid scores. Failed performance validity testing generally indicates that the respondent did not give a full effort on cognitive testing. Note that failed performance validity is not synonymous with malingering as the latter requires the “presence of a substantial external incentive” (Slick, Sherman, & Iverson, 1999). While PVTs are most relevant to objective cognitive testing, persons who fail PVTs may report excessive complaints on self-report measures (Lippa, Pastorek, Romesser, et al., 2014).
Researchers have identified factors that may influence PVT performance. Such factors include malingering, somatoform disorder, psychological distress, perceived unfairness of the evaluation, boredom, and inadequate language proficiency, among others (Bashem, Rapport, Miller, et al., 2014; Greher & Wodushek, 2017; Lippa, 2018). Henry et al. (2018) found that patient health beliefs were associated with validity test failure. Patients who endorsed beliefs that cognitive effort makes symptoms worse were more likely to show poor performance validity. Patients who endorsed beliefs that symptoms are due to an illness or injury as opposed to being normal experiences were also more likely to show poor performance validity. Thus, failure on PVTs is complex and different factors may result in a range of patterns of results (Chafetz, Williams, Ben-Porath, et al., 2015).
In a treatment context, the clinician has ongoing responsibility for the patient even if the patient shows poor performance validity. Some persons with clear medical documentation of moderate or severe traumatic brain injury (TBI) fail PVTs. Some of these individuals have true neurocognitive deficits (Larrabee, 2014) that may contribute to poor performance validity. Dismissal of the need for treatment in such cases can be deleterious; yet, it can be challenging to develop an appropriate treatment plan. In many cases, persons without true cognitive impairment may be experiencing emotional distress related to an injury. Of course, in some cases, persons referred for treatment may simply be malingering for secondary gain. Thus, the interpretation of failure on PVTs can be complex, and the interaction of cognitive impairment, emotional distress, and motivational factors must be considered.
Investigations of patterns of cognitive test results indicate that persons who show invalid performance have cognitive test results that are distinct from persons who give valid effort (Frazier, Youngstrom, Naugle, et al., 2007). Also, as found by Lippa et al. (2014), persons with invalid performance on cognitive tests report more complaints on self-report measures. These findings suggest that there may be different patterns of cognitive test scores and self-report among persons with invalid performance.
In our previous work (Sherer, Sander, Nick, et al., 2015; Sherer, Nick, Sander, et al., 2017; Sherer, Ponsford, Hicks, et al., 2017), we identified 12 dimensions of signs and symptoms of TBI that can be derived from 18 tests and questionnaires. The combinations of test scores and questionnaire results that composed each dimension are described in the Method section. Cluster analysis of the 12 dimension scores for 504 persons with TBI identified 5 subgroups (Sherer, Nick, et al., 2017). Scores on the 12 dimensions were plotted to generate profiles of TBI subgroups. These five subgroups differed on cognitive performance, subjective complaints, environmental supports, and performance validity. Almost all persons with failed performance validity were clustered in the same subgroup so that it was not possible to examine patterns of dimension scores within subgroups with invalid performance.
Morin and Axelrod (2017) examined subclasses (subgroups) of a large cohort of participants who were administered both cognitive tests and measures of emotional functioning and personality. Four subgroups were identified, with members of one subgroup generally showing invalid performance, markedly impaired cognitive performance, and numerous emotional complaints. The three other subgroups had low rates of poor performance validity with less extreme scores on cognitive tests and fewer emotional complaints. As with the Sherer, Nick, et al. (2017) study, this analysis did not reveal subgroups of persons with invalid performance.
The present investigation sought to extend prior findings by identifying subgroups of persons who showed poor performance validity. We carried out this aim by studying a cohort of persons with TBI who failed performance validity testing when administered a battery of cognitive tests, self-report measures, and scales assessing environmental supports. We hypothesized that clinically meaningful subgroups with distinct patterns of test and questionnaire performance could be identified. We expected that the clinical profiles of these subgroups would have implications for clinical management.
METHOD
Participants
This study is a secondary analysis of a dataset originally developed for a study that identified dimensions of objective cognitive test scores and self-reported cognitive, emotional, and physical symptoms as well as environmental supports (Sherer, Sander, et al., 2015; Sherer, Nick, et al., 2017). Participants in the original studies were recruited at rehabilitation centers in Houston, Birmingham, and Detroit in the USA as well as Melbourne, Australia (Sherer, Ponsford, et al., 2017). This research was carried out in accord with all relevant human subjects protection guidelines including the Helsinki Declaration. The research was reviewed and approved by ethics committees at all participating sites including approval by the Baylor College of Medicine IRB at the primary study site.
Participants included in the study (a) had definitive medical documentation of TBI occurring more than 6 months prior to assessment, (b) were 18 to 64 years old, (c) had capacity to give informed consent, and (d) had ability to complete study measures in English. Diagnosis of TBI was based on history of head trauma with one or more of the following: (i) observed loss of consciousness, (ii) post-traumatic amnesia (PTA), or (iii) trauma-related findings on CT scan. Persons with a preexisting medical or psychiatric condition that would affect performance on the assessment were excluded from the study. For this investigation, only those persons who failed performance validity testing were retained.
Procedures
For the parent study for this secondary analysis, medical and research records were examined for demographic (age, education, and race/ethnicity) and injury characteristics (PTA, length of stay, first available Glasgow Coma Scale (GCS)). Participants completed 36 cognitive tests and self-report questions. Using data reduction techniques and cluster analyses, 12 dimensions of participant experience after TBI were derived from 18 measures. These dimensions and the tests/questionnaires that defined them were as follows: (1) Memory – Wechsler Adult Intelligence Scale – IV Letter-Number Sequencing (Wechsler, 2008), Rey Auditory Verbal Learning Test (RAVLT) total words for learning trials 1–5 (Rey, 1958); (2) Cognitive Processing Speed – Trail Making Test Part A (Reitan & Wolfson, 1985), Wechsler Adult Intelligence Scale – IV Coding (Wechsler, 2008); (3) Verbal Fluency – FAS (Gladsjo et al., 1999); (4) Self-reported Cognitive Symptoms – Traumatic Brain Injury-Quality of Life (TBI-QOL; Tulsky et al., 2016) Cognition-General Concerns; (5) Independence and Self-esteem – TBI-QOL Self-evaluation, TBI-QOL Independence; (6) Resilience – TBI-QOL Resilience; (7) Emotional Distress – TBI-QOL Anxiety, TBI-QOL Emotional and Behavioral Dyscontrol; (8) Post-concussive Symptoms – Neurobehavioral Symptom Inventory (Cicerone & Kalmar, 1995); (9) Physical Symptoms – TBI-QOL Headache, TBI-QOL Pain Interference; (10) Physical Functioning – TBI-QOL Upper Extremity; (11) Economic and Family Support – Economic Quality of Life (Tulsky et al., 2015), Family Assessment Device General Functioning (Epstein, Baldwin, & Bishop, 1983); and (12) Performance Validity – Word Memory Test (WMT; Green, 2007). A more detailed description of data extracted from records, cognitive tests, questionnaires, and the PVT can be obtained by referring to Sherer, Sander, Nick, et al. (2015).
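As a reading aid, the dimension structure listed above can be written out as a simple lookup table. The sketch below only transcribes the measures named in the text; the variable names and shortened labels are illustrative choices, not the authors' own, and the sketch says nothing about how scores within a dimension were combined.

```python
# Dimension-to-measure map transcribed from the Procedures text.
# Names and short labels are illustrative assumptions, not the authors' code.
DIMENSIONS = {
    "Memory": ["WAIS-IV Letter-Number Sequencing", "RAVLT Trials 1-5 Total"],
    "Cognitive Processing Speed": ["Trail Making Test Part A", "WAIS-IV Coding"],
    "Verbal Fluency": ["FAS"],
    "Self-reported Cognitive Symptoms": ["TBI-QOL Cognition-General Concerns"],
    "Independence and Self-esteem": ["TBI-QOL Self-evaluation",
                                     "TBI-QOL Independence"],
    "Resilience": ["TBI-QOL Resilience"],
    "Emotional Distress": ["TBI-QOL Anxiety",
                           "TBI-QOL Emotional and Behavioral Dyscontrol"],
    "Post-concussive Symptoms": ["Neurobehavioral Symptom Inventory"],
    "Physical Symptoms": ["TBI-QOL Headache", "TBI-QOL Pain Interference"],
    "Physical Functioning": ["TBI-QOL Upper Extremity"],
    "Economic and Family Support": ["Economic Quality of Life",
                                    "Family Assessment Device General Functioning"],
    "Performance Validity": ["Word Memory Test"],
}

# Sanity check against the counts stated in the text.
n_dimensions = len(DIMENSIONS)
n_measures = sum(len(measures) for measures in DIMENSIONS.values())
print(n_dimensions, n_measures)  # 12 18
```

Counting the entries reproduces the 12 dimensions and 18 measures stated in the text.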
For neuropsychological tests, raw scores were converted to standardized scores utilizing adjustment for age, education, sex, and race/ethnicity as available in the WAIS-IV (Wechsler, 2008), Heaton, Miller, Taylor, and Grant (2004), and Schmidt (1996) norm sets. For the TBI-QOL, look-up tables were used to convert raw scores to T-scores (Tulsky et al., 2016). The reference sample for TBI-QOL measures was composed of persons with TBI. A similar look-up table was used for the Economic Quality of Life scale. Raw scores were used for the Neurobehavioral Symptom Inventory and the Family Assessment Device. An index of performance validity was calculated by averaging the scores from the three easy subtests of the WMT as recommended by Green (2007). Persons retained for this investigation were those who “failed” the WMT based on standard cutoffs.
Statistical Analysis
Given the similarity in results with regard to demographic and injury characteristics as well as subgroup profiles shown in an earlier investigation (Sherer, Ponsford, et al., 2017), the US and Australian samples were combined. Dimension scores from the combined sample were submitted to K-means cluster analysis with Euclidean distance. The number of clusters was selected using both the variance ratio criterion (Caliński & Harabasz, 1974) and the “global max” gap statistic (Tibshirani, Walther, & Hastie, 2001). Both approaches indicated that three was the optimal number of clusters. Profiles for subgroups were created by plotting the 12 mean dimension scores for persons in each subgroup. Note that dimension scores were based on normalized z-scores for the 18 tests and questionnaires for the original cohort of 504. Summary statistics for identified subgroups were calculated for demographic and injury severity data as well as for the individual test and questionnaire scores for the 18 study measures.
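The cluster-selection step can be illustrated with a minimal sketch. It is not the authors' code: the software they used is not stated, the data below are synthetic two-dimensional toy points rather than the 12 dimension scores, and all function names are hypothetical. The sketch runs K-means with Euclidean distance for several candidate values of k and picks the k with the largest variance ratio criterion; the gap statistic, which the study also used, is omitted for brevity.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def centroid(points):
    """Coordinate-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(points) for col in zip(*points)]

def kmeans(points, k, n_init=30, max_iter=100, seed=0):
    """Lloyd's algorithm with random restarts; returns labels of the lowest-WSS run."""
    rng = random.Random(seed)
    best_wss, best_labels = None, None
    for _ in range(n_init):
        centers = [list(p) for p in rng.sample(points, k)]
        labels = None
        for _ in range(max_iter):
            new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                          for p in points]
            if new_labels == labels:
                break  # assignments stable: converged
            labels = new_labels
            for j in range(k):
                members = [p for p, lab in zip(points, labels) if lab == j]
                if members:  # an emptied cluster keeps its previous center
                    centers[j] = centroid(members)
        wss = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
        if best_wss is None or wss < best_wss:
            best_wss, best_labels = wss, labels
    return best_labels

def variance_ratio(points, labels, k):
    """Variance ratio criterion: (between SS / (k - 1)) / (within SS / (n - k))."""
    n = len(points)
    grand = centroid(points)
    between = within = 0.0
    for j in range(k):
        members = [p for p, lab in zip(points, labels) if lab == j]
        if not members:
            continue
        c = centroid(members)
        between += len(members) * sq_dist(c, grand)
        within += sum(sq_dist(p, c) for p in members)
    return (between / (k - 1)) / (within / (n - k))

# Synthetic toy data: three well-separated blobs (NOT study data).
rng = random.Random(42)
toy = [[rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)]
       for cx, cy in [(0, 0), (10, 0), (0, 10)] for _ in range(40)]

scores = {k: variance_ratio(toy, kmeans(toy, k), k) for k in range(2, 7)}
best_k = max(scores, key=scores.get)
print(best_k)
```

Because K-means can settle in a local optimum, the sketch restarts from several random initializations and keeps the run with the smallest within-cluster sum of squares before computing the criterion.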
RESULTS
Study Sample Descriptions and Comparisons
The total cohort included 170 persons with TBI from Australia and 504 from the USA for a combined sample of 674. There were 21 persons with missing WMT scores (USA n = 13, Australia n = 8), resulting in a sample size of 653. Of these, 143 (USA n = 117, Australia n = 26) showed poor performance validity as indicated by failure of the WMT as determined by application of the standard cutoffs. These 143 participants formed the study cohort for this investigation.
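The sample flow can be checked with a few lines of arithmetic. The counts below are taken from the text; the overall failure rate is derived here for illustration and is not a figure reported by the authors.

```python
# Counts reported in the text, by site.
enrolled = {"USA": 504, "Australia": 170}
missing_wmt = {"USA": 13, "Australia": 8}
failed_wmt = {"USA": 117, "Australia": 26}

# Participants with a usable WMT score at each site.
with_wmt = {site: enrolled[site] - missing_wmt[site] for site in enrolled}

total_enrolled = sum(enrolled.values())
total_with_wmt = sum(with_wmt.values())
total_failed = sum(failed_wmt.values())

# Derived quantity (not stated in the text): share of scored participants who failed.
failure_rate = total_failed / total_with_wmt

print(total_enrolled, total_with_wmt, total_failed)  # 674 653 143
print(f"{failure_rate:.1%}")  # 21.9%
```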
K-means cluster analysis of the 12 dimension scores for the 143 participants resulted in identification of 3 clusters (subgroups). Subgroups were labeled 1, 2, and 3 for convenience. Note that for each dimension score, lower scores indicated poorer performance (e.g., more impaired cognitive abilities, more complaints on self-report measures). Comparisons for demographic, injury severity, and performance validity values for the three subgroups are shown in Table 1. The three subgroups did not differ on gender, race, injury severity (GCS, duration of PTA), age, years of education, or time since injury. While all three subgroups showed invalid performance, a difference was found in the overall WMT score (p < .001). Tukey’s multiple comparison testing showed that subgroup 3 showed worse overall performance on the WMT than subgroups 1 or 2.
Abbreviations: GCS = Glasgow Coma Scale; SD = standard deviation; y = years; PTA = post-traumatic amnesia; d = days; RAVLT = Rey Auditory Verbal Learning Test; TBI-QOL = Traumatic Brain Injury Quality of Life; Emot Behav Dys = emotional and behavioral dyscontrol
For all cognitive tests, higher scores indicate more intact function. For the TBI-QOL Anxiety, Emotional and Behavioral Dyscontrol, Headache, and Pain Interference scales, higher scores indicate greater severity of symptoms. Also, for the Neurobehavioral Symptom Inventory and Family Assessment Device, higher scores indicate greater severity of symptoms. For all other symptom measures, higher scores indicate more intact function.
Pairwise comparisons were reported only when the overall ANOVA was significant, and effect sizes were reported only when the pairwise comparison was significant.
As subgroup 3 obtained the most impaired scores for all cognitive measures and the most symptomatic scores for all self-report measures, this group was used as the reference group for calculation of the effect sizes (Cohen’s d). Effect sizes ≤.20 are small, >.20 but <.80 moderate, and ≥.80 large (Cohen, 1988).
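A minimal sketch of the effect size calculation follows. It assumes the common pooled-standard-deviation form of Cohen's d, which the text does not specify, and it uses made-up scores rather than study data. The function names are hypothetical; the size labels follow the cutoffs stated in the note above.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(group, reference):
    """Cohen's d with a pooled SD (assumed form; the text does not specify one)."""
    n1, n2 = len(group), len(reference)
    pooled_var = ((n1 - 1) * variance(group)
                  + (n2 - 1) * variance(reference)) / (n1 + n2 - 2)
    return (mean(group) - mean(reference)) / sqrt(pooled_var)

def label_effect(d):
    """Apply the cutoffs from the table note: <=.20 small, <.80 moderate, else large."""
    size = abs(d)
    if size <= 0.20:
        return "small"
    if size < 0.80:
        return "moderate"
    return "large"

# Made-up scores for illustration only (NOT study data).
comparison_scores = [48, 52, 50, 46, 54]
reference_scores = [38, 42, 40, 36, 44]   # stands in for the reference subgroup

d = cohens_d(comparison_scores, reference_scores)
print(round(d, 2), label_effect(d))  # 3.16 large
```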
Profile plots for the three subgroups are presented in Figure 1. Persons in subgroup 1 and subgroup 2 scored similarly on cognitive tests. However, subgroup 2 participants reported higher levels of complaints on cognitive, emotional, physical, and environmental support dimensions. Subgroup 3 participants scored more poorly on cognitive tests and reported more complaints than either subgroup 1 or subgroup 2.
To further demonstrate between-group differences, we examined mean scores for the individual cognitive tests and self-report measures. Using greater than one standard deviation below the normative mean as indicating impairment, examination of the scores for the RAVLT Trials 1–5 revealed that all three subgroups were markedly impaired with scores greater than two standard deviations below the normative mean. Subgroup 1 scored grossly within normal limits for all remaining cognitive tests. Subgroup 2 also showed impairment on the Wechsler Adult Intelligence Scale – IV Coding subtest. Subgroup 3 obtained scores that were impaired or markedly impaired on all cognitive tests. For self-report measures, subgroup 1 scored similarly to a large cohort of persons with TBI from the TBI-QOL calibration sample (Tulsky et al., 2016). Subgroup 2 scored similarly to other persons with TBI on most self-report measures, but reported concern about cognition and elevated neurobehavioral complaints. Subgroup 3 showed high levels of concern regarding cognitive function, anxiety, and emotional and behavioral control. This group also reported disruption of activities by pain and decreased upper extremity function. They reported a moderate level of neurobehavioral complaints. These results are shown in Table 1 including ANOVAs comparing the three groups, pairwise comparisons, and key effect sizes.
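The impairment rule used above can be expressed as a short sketch. It assumes scores on the conventional T-score metric (mean 50, SD 10); the function names and example values are hypothetical and do not come from Table 1.

```python
def t_to_z(t_score):
    """Convert a T-score (mean 50, SD 10) to a z-score."""
    return (t_score - 50) / 10

def classify(z):
    """Rule from the text: more than 1 SD below the mean is impaired,
    more than 2 SD below the mean is markedly impaired."""
    if z < -2.0:
        return "markedly impaired"
    if z < -1.0:
        return "impaired"
    return "within normal limits"

# Illustrative T-scores only (NOT values from the study).
for t in (28, 35, 40, 47):
    print(t, classify(t_to_z(t)))
```

A score exactly one standard deviation below the mean (T = 40) is treated as within normal limits here, since the stated rule requires scores greater than one standard deviation below the mean.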
DISCUSSION
The primary aim of this investigation was achieved as distinct subgroups of persons with failed performance validity were identified. These three subgroups differed in clinically meaningful ways on cognitive performance and symptom complaints. One group scored within one standard deviation of the normative mean on all cognitive tests except for the test that is clearly a memory test (RAVLT Trials 1–5 Total) and within one standard deviation of the mean scores from the TBI-QOL calibration sample on all TBI-QOL measures. A second group scored in the impaired range for memory and the Wechsler Coding subtest while reporting much greater concern about cognitive function than the TBI-QOL calibration sample. A final group scored in the impaired range on all cognitive tests and reported substantial concern regarding cognitive function, self-esteem, and upper extremity motor function on the TBI-QOL.
While the cluster profiles provide information regarding various patterns of cognitive scores and complaints, they do not reveal factors that might contribute to these different profiles. Clinical examination of persons with failed performance validity should include a comprehensive history of factors such as litigation status, applications for disability compensation, pre-injury psychiatric history, family or other relationship dynamics that might provide secondary gain, apparent change in personal adjustment, and/or degree of psychological distress following injury. Test-taking behavior should be carefully observed to detect evidence of boredom or irritation with testing procedures.
Clinical intervention with persons who fail performance validity may be challenging. It is possible that factors that contribute to invalid performance may make patients more difficult to engage in treatment. Clinicians should be careful to listen to and acknowledge the patient’s point of view. Presentation of medical information that shows that patient complaints are improbable should be carried out with attention to the patient’s ability and willingness to entertain this information.
While admittedly speculative, we offer possible treatment implications of the subgroup profiles. For subgroup 1, interventions might focus on providing the sorts of recommendations that are applicable for the general population. These could include memory and organizational strategies using smart phone or other technologies as well as a focus on regular exercise, proper diet, and sleep hygiene depending on the patient’s complaints. The emphasis should be on self-management of perceived impairments and complaints.
For subgroup 2, the initial focus of intervention could be on emotional distress using education or psychotherapy. Treatments that help persons with TBI to cope with their emotional distress and work toward realistic, value-oriented goals may be helpful.
Subgroup 3 members are likely to be the most difficult to engage in treatment. Here, creation of a strong therapeutic alliance with family or close others may be helpful. The social influence that can be brought to bear by the patient’s social support network may be effective in enhancing engagement in treatment.
The treatment recommendations above are based on the clinical experience of the authors and require validation by additional investigation. Before this effort, the subgroup structure identified in this investigation should be cross-validated by additional research.
The present investigation has several limitations. The study cohort was drawn from persons with TBI who were living in the community and were not seeking treatment. The average interval from injury to study evaluation was 6.8 years. The majority of study participants sustained moderate or severe TBI rather than mild TBI. Information regarding whether participants were engaged in litigation or disability claims related to their injuries was not obtained. Persons with deficits due to prior neurologic illness or injury and those with severe psychiatric disturbance were excluded from this study. These factors may limit the applicability of study findings to other groups of persons with TBI and failed performance validity. Participants were not administered symptom validity tests and these tests may have contributed to understanding of the patterns of scores on subjective complaints. In addition, for this investigation, participants were only administered one performance validity measure. Best practice for clinical evaluations is administration of multiple measures of performance validity to provide converging evidence regarding adequacy of effort and to avoid reliance on a single measure that might be failed due to actual cognitive impairment rather than poor performance validity (Board of Directors, 2007).
The findings of this investigation are an initial step toward a more sophisticated and individualized approach to managing patients with medically proven TBI and invalid performance on neuropsychological evaluation. Effective management of these persons may lead to improved quality of life for the patient and family/close others as well as decreased burden on healthcare resources. Further investigation is needed to build on this initial step. Future studies would benefit from examining participants who were seeking treatment.
ACKNOWLEDGMENTS
This research was partially supported by a grant from the National Institute on Disability, Independent Living, and Rehabilitation Research (NIDILRR grant number 90DPTB0016). NIDILRR is a Center within the Administration for Community Living (ACL), Department of Health and Human Services (HHS). The contents of this article do not necessarily represent the policy of NIDILRR, ACL, or HHS, and you should not assume endorsement by the Federal Government.
CONFLICT OF INTEREST
All authors declared no conflict of interest regarding this manuscript.