INTRODUCTION
Neurofeedback therapies (NFTs) are a family of interventions involving the application of neurofeedback protocols within a brain–computer interface to achieve therapeutic goals. Based on shared principles of implicit learning, operant conditioning, and neuroplasticity, NFTs are employed to teach individuals how to modify their brain’s electrical activity in order to better distinguish and self-regulate their psychophysiological state. In turn, it is argued that lasting neural changes emerge, which support the recovery or enhancement of targeted neurocognitive functions (Hammond, 2007; Kamiya, 1968; Thibault & Raz, 2016). Although the potential therapeutic uses of neurofeedback have been discussed for decades (Kamiya, 1968), NFTs have only recently become a fixture in the clinical milieu. Interest in NFTs has grown rapidly in recent years, no doubt encouraged by the decreasing costs of the requisite technology, increasing public interest in technologically mediated treatment options, and growing support from third-party health care programs. As the number of specialized clinics, service providers, and academic publications has multiplied over the past decade (QY Research, 2018; Thibault & Raz, 2017), so too has the number of conditions purportedly treated by NFTs. Nevertheless, empirical support for NFTs’ numerous applications has lagged.
Clinical Applications of NFT
There is evidence to suggest that NFTs may impart general benefits, such as encouraging better regulation of physiological arousal (Fragedakis & Toriello, 2014), decreasing anxiety (Moore, 2000), increasing positive mood (Raymond, Varney, Parkinson, & Gruzelier, 2005), and improving attentional function (Norris, Lee, Burshteyn, & Cea-Aravena, 2008). Given that these domains are nonspecific and common across a number of conditions, it is not surprising that NFTs have been applied to conditions as diverse as fibromyalgia (Kayiran, Dursun, Dursun, Ermutlu, & Karamürsel, 2010), posttraumatic stress disorder (PTSD; Fragedakis & Toriello, 2014; Gapen et al., 2016), schizophrenia (Surmeli, Ertem, Eralp, & Kos, 2012), and attention-deficit/hyperactivity disorder (ADHD; Gevensleben et al., 2014; Meisel, Servera, Garcia-Banda, Cardo, & Moreno, 2014).
Despite their broad application, however, empirical support for NFTs has remained variable in both quality and quantity. The use of NFTs for ADHD has amassed the most consistent empirical support, though the findings have been far from unanimous (Thibault & Raz, 2016). A recent systematic review and meta-analysis (Van Doren et al., 2018) concluded that NFTs show promise as an effective treatment for the full range of ADHD symptoms and that attentional gains persisted over time. Nevertheless, these promising findings were tempered by concerns of reporting bias and suboptimal study design. Similar shortcomings have been documented by others (Albert, Sánchez-Carmona, Fernández-Jaén, & López-Martín, 2017; Thibault & Raz, 2016). When limited to more rigorous study designs, NFTs have demonstrated less convincing outcomes for ADHD. For instance, Pahlevanian et al.’s (2015) NFT produced objective improvements on testing that failed to manifest as functional or behavioural changes, while Bink, van Nieuwenhuizen, Popma, Bongers, and van Boxtel’s (2014) protocol showed little effect on ADHD symptoms beyond treatment as usual. Further suggesting null effects of NFTs, Thibault and Raz (2017) argue that the majority of reported NFT effects are derived from uncontrolled pre–post comparisons and may well reflect placebo effects rather than treatment efficacy.
Ironically, the mixed evidence for NFTs may reflect their potential strengths. Given their general base principles, NFTs may be uniquely capable of delivering adaptive and tailored interventions for a variety of conditions. However, this very flexibility may provide little clarity regarding mechanisms of action or guidance for how to begin establishing one. An understood mechanism of action would substantively strengthen the evidence for NFTs’ utility and guide the refinement of specific NFT protocols. This understanding is absent at present. Nevertheless, the sheer number of reported NFT benefits may suggest therapeutic plausibility.
Rationale for NFTs in Brain-Injured Populations
Acquired brain injury (ABI) refers to any injury to the brain that is sustained after birth and is not due to congenital, hereditary, or degenerative conditions. Owing to their wide range of possible severity, aetiology, and lesion location, cognitive impairments following ABI exhibit a high degree of variability. Nevertheless, typically affected cognitive domains include processing speed, attention, working memory, memory and learning, executive functioning, and self-regulation of emotions and behaviour (Cattelani, Zettin, & Zoccolotti, 2010) – each of which has been proposed to respond well to NFTs (Egner & Gruzelier, 2004; Gray, 2017; Thomas & Smith, 2015; Thompson, Thompson, & Reid-Chung, 2015; Vernon et al., 2003).
For instance, NFTs have been reported to improve attentional control among healthy adults (Egner & Gruzelier, 2004; Vernon et al., 2003). Several recent studies have demonstrated improvements in working memory and episodic memory performance in healthy adults following NFT (Guez et al., 2015; Hsueh, Chen, Chen, & Shaw, 2016). As a complementary treatment, Hosseini, Pritchard-Berman, Sosa, Ceja, and Kesler (2016) found that their NFT enhanced the effects of traditional cognitive training for episodic memory and executive functioning in healthy adults. NFTs have also shown promise as a treatment for emotional dysregulation in persons with PTSD (Gapen et al., 2016; Gerin et al., 2016; Nicholson et al., 2017) and various anxiety disorders (Moore, 2000; Zilverstand, Sorger, Sarkheil, & Goebel, 2015), and for behavioural dysregulation among those with obsessive–compulsive disorder (Kopřivová et al., 2013). Other evidence suggests NFTs may effectively treat depression and fatigue in healthy (Raymond et al., 2005) and neurologically impaired adults (e.g., multiple sclerosis: Choobforoushzadeh, Neshat-Doost, Molavi, & Abedi, 2015).
Results such as these suggest that NFTs may prove effective for treating various impairments common among those with ABI.
Renton, Tibbles, and Topolovec-Vranic (2017) recently conducted a systematic review of NFTs’ utility for stroke rehabilitation and concluded that, despite some indication of cognitive improvement following intervention, it was challenging to support NFTs as an evidence-based treatment. While a laudable first step towards a consolidated understanding of NFT–ABI’s efficacy, Renton et al. failed to implement any exclusionary criteria based on experimental design. While this maximized their sample size, they conceded that the lack of control and generally low study quality throughout the literature made it difficult to determine the validity of their results. Thus, a more rigorous review of well-controlled studies was considered necessary to evaluate the evidence supporting NFTs’ effectiveness for cognitive rehabilitation following ABI, including participants with both traumatic and nontraumatic aetiologies.
Objectives of the Current Review
Given the recent growth of mainstream awareness, institutional support, and clinical implementation of NFTs, a systematic review of the literature regarding their efficacy for rehabilitation of ABI-related cognitive impairment was considered timely. The primary objective of this review was to evaluate whether the application of NFTs leads to better objective cognitive outcomes in those with ABI compared to other interventions or adequate control conditions. Based on the conflicting literature and the application of NFTs to a wide range of cognitive deficits, this review was conducted as broadly as possible to account for a variety of potential cognitive outcomes.
METHODS
This review was conducted in line with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines (Moher et al., 2015).
Eligibility Criteria
Only English-language studies with available full text were considered eligible for inclusion. Eligible studies included adult samples (18+ years) with ABI and the use of neurofeedback intervention/training for cognitive rehabilitation. Only studies that were clearly case controlled based on the abstract were considered for further review.
Search Strategy
The PICO framework (Moher et al., 2015) was used to design and implement our search protocol across PubMed and EBSCOhost databases, including Biomedical Reference Collection: Comprehensive, CINAHL Complete, Cochrane Database of Systematic Reviews, Cochrane Central Register of Controlled Trials, MEDLINE with Full Text, and PsycINFO. These databases were searched using a combination of relevant Boolean terms agreed upon by the authors. Search delimiters were activated for language (English language only), subjects (human), and age (18+ years) across databases. The search was executed separately on the same day (15 August 2018) by each author to promote reliability. This automated search was supplemented with manual retrieval of articles known to the investigators. Following an initial screen to exclude irrelevant articles and duplicates, the suitability of remaining articles was independently adjudicated by two authors. Where there was disagreement, the third author’s blind rating determined a given study’s inclusion.
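The adjudication rule described above can be expressed compactly. The following is a minimal sketch in Python, not the authors' actual tooling; the function name `adjudicate_inclusion` and the boolean encoding of ratings (True for include, False for exclude) are assumptions introduced here for illustration.

```python
from typing import Optional


def adjudicate_inclusion(rater_a: bool, rater_b: bool,
                         rater_c: Optional[bool] = None) -> bool:
    """Return the final include/exclude decision for one article.

    Hypothetical helper: True means "include", False means "exclude".
    """
    # When the two independent raters agree, their shared rating stands.
    if rater_a == rater_b:
        return rater_a
    # On disagreement, the third (blind) rating decides the outcome.
    if rater_c is None:
        raise ValueError("raters disagree; a third blind rating is required")
    return rater_c
```

Under this reading, the third rating is consulted only when the first two conflict, so it can never overturn a decision on which the first two raters agree.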
Outcome Measures
Tabulated p-values per study and cognitive domain were reported as provided by study authors or, where between-group comparisons were not provided by a given study, as calculated by secondary analyses. An integrative summary of NFT results per cognitive domain was attempted where sufficient data were available.
Study Quality
Study quality was rated according to the PEDro tool (Maher, Sherrington, Herbert, Moseley, & Elkins, 2003). PEDro scoring requires that data be clearly and explicitly stated to receive credit. In this way, the PEDro tool may be considered an assessment of rigorous study design as well as overall reporting quality. It does not, however, assess specific aspects of statistical procedure or interpretation. Articles are assigned a rating of 0 (absent) or 1 (present) on 11 items related to blinded administration, random selection, and equivalence of samples. A total score of 11 represents an ideal randomized controlled trial (RCT). Study quality was assessed independently by two authors. Where the initial ratings for a given study conflicted, the third author’s blind rating determined the final PEDro score.
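The scoring arithmetic described above (eleven items, each rated 0 or 1, summed to a total out of 11) can be sketched as follows. This is an illustrative Python sketch only: the helper `pedro_total`, the example item ratings, and the list of study totals are placeholders invented for demonstration and are not the ratings reported in Table 1.

```python
from statistics import median

N_ITEMS = 11  # number of PEDro items, as described in the text


def pedro_total(item_ratings):
    """Sum eleven 0/1 item ratings into a total score between 0 and 11."""
    if len(item_ratings) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} item ratings")
    if any(r not in (0, 1) for r in item_ratings):
        raise ValueError("each item must be rated 0 (absent) or 1 (present)")
    return sum(item_ratings)


# Hypothetical trial that satisfies every item except three (placeholder data).
example_ratings = [1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1]
example_total = pedro_total(example_ratings)

# Placeholder totals for four studies; NOT the values reported in Table 1.
placeholder_totals = [4, 5, 6, 8]
summary = (median(placeholder_totals),
           min(placeholder_totals),
           max(placeholder_totals))
```

With an even number of studies, the median is the mean of the two middle totals, which is why a median of whole-number scores can itself be fractional.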
Data Analysis and Synthesis
For each eligible study, the cognitive domain(s) of interest, mean sample age and sex composition, and ABI sample characteristics were provided alongside PEDro ratings of methodological strength in Table 1. Available between-group comparisons were reported in Table 2. Secondary unidirectional independent samples t-tests (α = .05) were conducted to compare rate of pre- to postintervention change between control and NFT groups where these comparisons were not reported by source articles (Table 2). Meaningful calculation of effect sizes was not tenable, given the incompatible study designs (e.g., ANCOVA, t-test, Wilcoxon) and limited available data.
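For readers who wish to see the shape of such a secondary comparison, the following Python sketch computes a one-tailed independent samples t-test from summary statistics of pre- to postintervention change. Several assumptions are made that the text does not specify: a pooled-variance (Student) test is used, the alternative hypothesis is that the NFT group changes more than the control group, and the standard deviation of change scores is taken to be available. The p-value is obtained by numerically integrating the t density with Simpson's rule so that only the standard library is needed. All function names and example numbers are hypothetical.

```python
import math


def student_t_upper_tail(t, df, steps=2000):
    """P(T > t) for Student's t with df degrees of freedom (Simpson's rule)."""
    log_coef = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_coef - (df + 1) / 2 * math.log1p(x * x / df))

    # Integrate the density from 0 to |t|; steps must be even for Simpson.
    b = abs(t)
    h = b / steps
    total = pdf(0.0) + pdf(b)
    for i in range(1, steps):
        total += pdf(i * h) * (4 if i % 2 else 2)
    area = total * h / 3
    upper = 0.5 - area
    return upper if t >= 0 else 1.0 - upper


def one_tailed_t_from_summary(mean_nft, sd_nft, n_nft,
                              mean_ctl, sd_ctl, n_ctl):
    """Pooled-variance t-test on change scores; H1: NFT change > control."""
    df = n_nft + n_ctl - 2
    pooled_var = ((n_nft - 1) * sd_nft ** 2
                  + (n_ctl - 1) * sd_ctl ** 2) / df
    se = math.sqrt(pooled_var * (1 / n_nft + 1 / n_ctl))
    t = (mean_nft - mean_ctl) / se
    return t, df, student_t_upper_tail(t, df)


# Hypothetical change scores (placeholder numbers, not study data).
t_stat, dof, p_one_tailed = one_tailed_t_from_summary(5.0, 4.0, 10,
                                                      2.0, 4.0, 10)
significant = p_one_tailed < 0.05  # alpha = .05, one-tailed
```

A one-tailed test places the entire α in the hypothesized direction, so it is more lenient towards an NFT advantage than a two-tailed test at the same α.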
n.s., not significant; bold signifies statistically significant p < .05.
AVLT = Auditory Verbal Learning Test; CFT = Complex Figure Test; COWAT = Controlled Oral Word Association Test; FTT = Finger Tapping Test; MFI = Multidimensional Fatigue Inventory; MVPT = Motor-Free Visual Perception Test; PASAT = Paced Auditory Serial Addition Test; RCFT = Rey–Osterrieth Complex Figure Test; RHIFQ = Rivermead Head Injury Follow-up Questionnaire; RPQ = Rivermead Post-Concussion Symptoms Questionnaire; SCL-90-R = Symptom Checklist-90-Revised; TAP = Test of Attentional Performance; TMT = Trail Making Test; ToF = Tower of London; WAIS-R = Wechsler Adult Intelligence Scale-Revised; WCST = Wisconsin Card Sorting Test.
a Outcomes may be overestimated due to potential bias, inequivalent groups, or other issues related to statistical procedure (noted in text).
b Results calculated by primary author (JIA). Results reflect the outcomes of unidirectional independent samples t-test comparisons (α = .05) based on available data.
RESULTS
Study Selection
The search protocol initially identified n = 135 articles (MEDLINE with Full Text n = 46, CINAHL Complete n = 33, Cochrane Central Register of Controlled Trials n = 14, PubMed n = 41, and PsycINFO n = 1). Following automated and manual removal of duplicates (n = 16), studies with inadequate controls (n = 9), conference abstracts with no associated full-text article (n = 2), and entries irrelevant to the topic of interest (n = 22), the search yielded n = 86 unique articles. Several articles were found that were relevant to the topic of interest. The majority consisted of position papers (n = 19), articles related to assessment but not the treatment of ABI or other disorders (n = 18), or uncontrolled pre–post case studies (n = 10) and, thus, were excluded. Of the remaining articles, n = 12 full-text articles were selected for comprehensive screening. A final sample of n = 4 eligible studies was identified after screening. The study selection process is summarized in Figure 1. Although this final sample was considerably smaller than initially hoped for, it was not entirely unexpected given the paucity of rigorous NFT–ABI studies documented by others (May, Benson, Balon, & Boutros, 2013; Novo-Olivas, 2014; Thomas & Smith, 2015).
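The selection counts reported above can be checked with simple arithmetic. The Python sketch below uses only the figures given in the text; the dictionary labels and the helper `remaining_after` are naming choices made here for illustration.

```python
# Records identified per database, as reported in the text.
identified = {
    "MEDLINE with Full Text": 46,
    "CINAHL Complete": 33,
    "Cochrane Central Register of Controlled Trials": 14,
    "PubMed": 41,
    "PsycINFO": 1,
}

# First round of removals, as reported in the text.
first_pass_removed = {
    "duplicates": 16,
    "inadequate controls": 9,
    "conference abstracts without full text": 2,
    "irrelevant to topic": 22,
}

# Second round of exclusions, as reported in the text.
second_pass_excluded = {
    "position papers": 19,
    "assessment rather than treatment": 18,
    "uncontrolled pre-post case studies": 10,
}


def remaining_after(start, *stages):
    """Return the running record count after each exclusion stage."""
    counts = [start]
    for stage in stages:
        start -= sum(stage.values())
        if start < 0:
            raise ValueError("more records excluded than were available")
        counts.append(start)
    return counts


total_identified = sum(identified.values())
flow = remaining_after(total_identified,
                       first_pass_removed, second_pass_excluded)
```

The first two values of `flow` reproduce the reported totals of 135 and 86. The third value (39) is the number of articles left after the second round of exclusions, from which the text reports that 12 were selected for full-text screening and 4 were ultimately retained.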
Study Characteristics
Study characteristics are summarized in Table 1.
Participants
Three studies applied NFTs to traumatic brain injury (TBI) populations (Keller, 2001; Reddy, Rajeswaran, Devi, & Kandavel, 2013; Schoenberger, Shiflett, Esty, Ochs, & Matheis, 2001), while one applied NFT to a poststroke population (Cho, Kim, Lee, & Jung, 2015). TBI severity ranged from mild to severe across study samples. Keller (2001) and Schoenberger et al. (2001) reported no differences in brain injury severity between their treatment and control groups. Cho et al. (2015) reported equivalent Mini-Mental State Examination (MMSE) performance between their treatment and control groups at baseline. Reddy et al. (2013) reported group differences in brain injury severity and time since injury at baseline.
Control condition
Control paradigms differed between studies. Reddy et al. (2013) and Schoenberger et al. (2001) employed wait-list control, Keller (2001) employed alternative treatment control, and Cho et al. (2015) employed treatment as usual.
Equipment, targets, protocol
Each study utilized different software and hardware, protocols, and EEG targets. Keller (2001) did not disclose their software package and Reddy et al. (2013) did not disclose any details of their NFT suite. Details of Schoenberger et al.’s (2001) and Cho et al.’s (2015) NFT suites are provided in Table 1. The details regarding the number and length of sessions for all protocols are available in Table 1.
Perhaps most notable is that each study differed with regard to target frequency band and intervention design. Keller’s (2001) intervention focused on training attentional ability via beta activity modulation, though they concede that there is no clear mechanistic link between beta activity and specific attentional performance. Beta activity was displayed as a bar graph on a monitor and participants were asked to keep the bars above a target mark. When beta activity fell below the mark, participants were asked to perform mental arithmetic or an auditory word recognition task until the bars were above the target again. No rationale was provided for their selection of feedback modality.
Like Keller (2001), Cho et al. (2015) also targeted beta activity with the goal of increasing visual perception. These targets were selected based on the premise that beta-wave activation improves concentration and reaction time; however, Cho et al. neglected to provide supporting evidence for this claim or clarify how beta activation might relate to visual perception in particular. They used a beta–SMR method utilizing both visual and auditory rewards for video game performance. No rationale was provided for their selection of feedback modality.
Somewhat different from ‘traditional’ NFT methods (e.g., EEG biofeedback), Schoenberger et al.’s (2001) proprietary Flexyx Neurotherapy System utilized subthreshold photic stimulation to train ‘a balance of activity across the EEG spectrum’ without participants’ conscious control. EEG amplitude and variability were recorded at the alpha and delta bands to indicate the range of activity. The goal of this intervention was broadly reported as ‘improvement on measures of cognitive and emotional functioning’. In support of their approach, Schoenberger et al. argue that cognitive impairments among those with ABI, ADHD, and other pathological conditions reflect a common EEG pattern (i.e., increased activity in the 4–8 Hz range and decreased activity in the 12–18 Hz range) and that this pattern has proven sensitive to alteration by rhythmic photic stimulation; however, the authors provide little evidence to support these premises. Moreover, the few supporting articles cited by Schoenberger et al. utilize different methods in different clinical populations for different ends. The authors make no further attempt to outline a clear mechanistic relationship between their specific intervention approach and neurocognitive functioning.
Similarly to Schoenberger et al. (2001), Reddy et al.’s (2013) intervention did not rely upon participants’ deliberate efforts. Their intervention targeted alpha- and theta-wave activity, though their specific goals for cognitive improvement were unclear. No rationale was provided for these EEG target bands, though it appears this study was an expansion of a protocol used for an earlier case study which was unavailable for review. Participants were presented with a task and were provided scores on a screen. Rather than being directed to increase or decrease any specific activity, participants were instructed to relax with the assumption that neural activity would naturally adjust itself to match the reward range. No rationale was provided for their selection of feedback modality.
Study quality
The overall PEDro score across studies was median = 5.5 (out of a possible 11), min = 4, and max = 8. Keller (2001) was deducted points for unclear eligibility criteria, unclear report of random assignment, unconcealed group allocation, and lack of blinding. Schoenberger et al. (2001) were deducted points for lack of clarity regarding prognosis at baseline, unconcealed assignment, and lack of blinding. Reddy et al. (2013) were deducted points for unclear eligibility criteria, potentially dissimilar prognoses at baseline, unconcealed assignment, and lack of blinding. Finally, Cho et al. (2015) were deducted points for lack of blinding only.
Although these scores seem relatively low, the PEDro tool typically penalizes nonpharmacological studies that are difficult or impossible to double-blind. Given that the typical or maximum attainable PEDro score may differ depending on clinical setting, population, or method, the PEDro score may be more useful as a relative metric within the context of similar work rather than a standalone measure. Compared to other ABI rehabilitation studies, the median PEDro scores for this review actually approach or exceed those for other reviews (e.g., Cascaes da Silva et al., 2016; Spencer, Aldous, Williams, & Fahey, 2018; Vanderbeken & Kerckhofs, 2017).
While this is encouraging, it bears restating that the PEDro tool does not evaluate more complex issues related to academic rigour, such as provision of a clear treatment rationale/mechanism of action or clear reporting of study protocols. These aspects of study quality were more problematic among the reviewed studies. Although these issues are not exclusive to the NFT literature – indeed, inadequate reporting, poor replicability, and a lack of proposed mechanisms have been identified as shortcomings endemic to rehabilitation literature as a whole (Dijkers et al., 2002; Whyte & Hart, 2003) – the reviewed studies may represent particularly striking examples of these broader inadequacies.
NFT Outcomes per Cognitive Domain
Given the low number of studies identified and the various threats to validity found throughout, the authors of this review opted not to report the results by cognitive domain as originally planned. It was determined that compiling and comparing the available data between studies would not yield meaningful or interpretable results due to significant limitations in the designs and data available across the few studies included. Instead, we have limited discussion of the review’s results to a brief overall summary in favour of an expanded discussion regarding the methodological shortfalls among the reviewed studies. Nevertheless, the results per study and cognitive domain are presented in Table 2.
Summary of results
Our review found limited evidence to suggest any effect of the reviewed NFTs on processing speed, attention, language, executive inhibition, or psychomotor functioning. Given the mixed findings and questionable validity of study outcomes, it remains unclear whether the reviewed NFTs might improve general memory functioning or executive shifting. In contrast, convergent evidence suggested that updating/working memory, organization/planning, and problem-solving may benefit from NFTs targeting alpha activity and/or NFTs that do not require conscious cognitive effort. That said, both of the studies supporting this conclusion were severely undermined by methodological issues. Numerous improvements to self-reported symptoms and quality of life were noted across the studies; however, it is unclear to what extent these results may have been due to confounding factors, statistical inflation, placebo effects, or true treatment effects.
Issues Concerning Validity
Test selection and interpretation
Several studies based their conclusions on questionable interpretations of cognitive test performance. Most notably, Schoenberger et al. (2001) described improvements on the fastest trial of the Paced Auditory Serial Addition Test as evidence for increased processing speed but neglected to address why any such increase would fail to affect simpler conditions of the same task. Similarly, they concluded that improved performance on the Auditory Verbal Learning Test Interference Trial and Delayed Recall suggests memory benefits of NFTs, but they did not account for the lack of effect on the actual learning trials. Trail Making Test B was also considered a test of sustained attention in this study, whereas it is most commonly considered a measure of attentional switching and inhibitory control (Lezak, Howieson, Bigler, & Tranel, 2012). Finally, there was little overlap in the tests administered or cognitive outcomes assessed across studies. Together with the small n for this review, the lack of replication across studies makes it difficult to determine whether and to what extent there may be convergent evidence for NFT efficacy.
Sample heterogeneity
Though three studies focused on TBI, there remained significant variance between these samples. Notably, Schoenberger et al. (2001) included those with mild to moderate TBI, and Reddy et al. (2013) included those with mild to severe TBI. While this variance in TBI severity is sufficiently problematic in itself, the inclusion of individuals with mild TBI (mTBI) further complicates comparison between studies. Given the relatively mild cognitive sequelae of mTBI and their typically temporary nature (Arciniegas, Anderson, Topkoff, & McAllister, 2005; McCrea et al., 2009), it is argued that mTBI may comprise a qualitatively different condition than moderate or severe TBI. The unqualified inclusion of mTBI with more severe TBI calls the validity of the outcomes into question.
NFT equipment and protocols
There was substantial variability in the tools and methods employed by each study. These technical considerations make direct comparison problematic since differences in equipment and recording protocols have been shown to yield inconsistent results (Vernon et al., 2009). Furthermore, each study employed different NFT protocols targeting unique frequency bands, making it impossible to determine the utility of any specific NFT protocol for a given condition.
Lack of clarity
Some studies were particularly vague in their reports of their analyses and findings. For instance, Schoenberger et al. (2001) alluded to covariates yet neglected to comment on what these were or their relative contribution to study outcomes. Reddy et al. (2013) opted not to perform between-group comparisons at all. Although they may have forgone such comparisons due to baseline group differences, it was never explicitly stated that between-group comparisons had been removed from the analyses. To the contrary, the study’s headings and style suggest that between-group comparisons had been conducted and that the reported scores reflect said comparisons. Keller’s (2001) study suffered from similar issues. Such reporting issues make accurate review or replication challenging.
Distorted outcomes
Schoenberger et al. (2001) reported that they refrained from conducting post hoc adjustments to preserve their found effects. As a consequence, their significant results may well reflect Type I error. It should also be noted that Schoenberger et al. hold a proprietary stake in the protocol used in their study, raising concerns about unacknowledged conflicts of interest. Further, it is possible that the secondary t-tests conducted for the purposes of pre–post comparisons also distorted outcomes. For instance, Keller (2001) conducted nonparametric analyses but did not remark on their reasoning. Without greater detail regarding the characteristics of their data, it is difficult to determine whether, how, and to what extent the results of the secondary t-tests may misestimate the degree of change attributable to their NFT intervention. Likewise, Reddy et al. (2013) reported baseline group differences that we were unable to control for given the available data. These differences largely favoured the cognitive performance/recovery of the NFT group over controls. Therefore, our secondary analyses are qualified ‘with all things equal…’, although we are aware that all things were not. By assuming the best-case scenario, it is likely that the results of our secondary analyses may somewhat misrepresent the true effects of the reviewed NFTs.
DISCUSSION
The initial goal of this systematic review was to determine whether there was evidence to suggest that NFTs are an effective approach for the rehabilitation of ABI-related cognitive impairments. Despite finding a large number of ostensibly relevant studies, only four were found to include NFTs as an intervention, a sample over 18 years of age, and a meaningful control condition with clear between-group comparisons of cognitive–neuropsychological outcomes (or sufficient data to allow for calculation of said comparisons). Of the four articles reviewed, the significant results of three (Keller, 2001; Reddy et al., 2013; Schoenberger et al., 2001) were considered at risk for biased reporting and inflated significance. Far from unique, the methodological issues found during this review may well represent the state of NFT literature as a whole (Thibault & Raz, 2016). These issues notwithstanding, the outcomes of this review provided limited convergent evidence for the effects of NFTs on any single cognitive domain or outcome measure. Although disappointing, this lack of cohesion may have been anticipated by the dearth of adequately controlled studies and numerous threats to validity apparent throughout NFT literature (Janssen et al., 2016; Rossiter, 2004; Thibault & Raz, 2016). Consequently, there is insufficient evidence at this time to recommend NFTs for cognitive rehabilitation following ABI.
Given the general lack of NFT literature and the numerous methodological issues throughout the literature that is available, the field may benefit from a strategic shift to first establishing a sound conceptualization of NFTs’ mechanisms and outcomes in healthy populations and only then extending said conceptualizations to atypical clinical populations. We discuss several challenges to NFT–ABI practice and research below to extend this argument.
Clinical and Research Challenges for NFT–ABI
Individual differences
Individual differences in baseline cognitive control may stymie the effectiveness of NFTs. Previous research has estimated that only about 30% of healthy adults (Allison & Neuper, Reference Allison, Neuper, Tan and Nijholt2009) and stroke survivors (Kober et al., Reference Kober, Schweiger, Witte, Reichert, Grieshofer, Neuper and Wood2015) are capable of learning to modulate their brain activity via neurofeedback interfaces. Further, natural and acquired variation in skull physiology and neuroanatomy is likely to produce incongruous EEG signals (Thibault & Raz, Reference Thibault and Raz2017).
ABI as a heterogeneous condition
The inherent heterogeneity of ABIs themselves serves as an obstacle to effective clinical implementation of NFTs and clinical research. ABIs may arise from numerous aetiologies, may range in severity, and may affect any number of neuroanatomical regions and cognitive domains. Further, unlike longstanding developmental or psychological disorders, demographic variables and posttraumatic amnesia are known to strongly influence the degree and nature of cognitive impairment following injury (Katz & Alexander, Reference Katz and Alexander1994; Novack, Bush, Meythaler, & Canupp, Reference Novack, Bush, Meythaler and Canupp2001; Rabinowitz, Hart, Whyte, & Kim, Reference Rabinowitz, Hart, Whyte and Kim2017; Spitz et al., Reference Spitz, Ponsford, Rudzki and Maller2012). Given these multiple sources of variance, designing and applying any singular NFT protocol for ABI may be difficult from conception. It may prove more sensible to take a transdiagnostic approach and construct individualized and dynamic interventions that address specific areas of cognitive deficit that cut across diagnoses and injury types (e.g., attention deficits; Racer & Dishion, Reference Racer and Dishion2012).
Another concern is that the approach to NFT–ABI appears to parallel the NFT approaches taken to address other cognitive issues due to PTSD, anxiety, or depression. While principles of implicit learning and shaping may be equally applicable across various conditions and populations, the theorized substrates of NFT action (e.g., neural plasticity and connectivity) are predicated on neurological typicality and intactness. However, persons with ABI are neurologically atypical by definition; they are most likely to exhibit long-term disruption to neural tracts, potentially reorganized functional topography, and altered neural structure (Davis, Reference Davis2000; Sharp, Scott, & Leech, Reference Sharp, Scott and Leech2014). It is known that volumetric changes and disfiguration of brain tissue may alter the EEG signal generation and volume conduction necessary for NFTs (Van Den Broek, Reinders, Donderwinkel, & Peters, Reference Van Den Broek, Reinders, Donderwinkel and Peters1998). In other words, even if the efficacy of training brainwaves into ‘normal range’ were well established for neurologically intact individuals, it would remain unclear how this would generalize to specific persons with ABI.
Lack of standardization
Yet another challenge to conducting rigorous research on NFT–ABI lies in the inherent heterogeneity of the method itself. Practitioners employ different neurofeedback equipment, imaging methods, recording arrays, EEG targets, feedback modalities, and treatment protocols. Although each of these variables has been shown to produce different outcomes (Vernon et al., Reference Vernon, Dempster, Bazanova, Rutterford, Pasqualini and Andersen2009), little consideration appears to have been given to these effects. Even if the logic behind the selection of specific targets were clarified, imaging methods, equipment, and intervention protocols remain inconsistently and often vaguely specified. The absence of clearly delineated protocols and rationale hinders the replication of study outcomes necessary for the synthesis of practice standards. Thus, while evidence-based standards for clinical NFT practices and research are beginning to converge for select clinical populations (e.g., ADHD: Arns, Heinrich, & Strehl, Reference Arns, Heinrich and Strehl2014; Van Doren et al., Reference Van Doren, Arns, Heinrich, Vollebregt, Strehl and K Loo2018; addictions: Luigjes, Segrave, de Joode, Figee, & Denys, Reference Luigjes, Segrave, de Joode, Figee and Denys2018), comparable cohesion among NFT–ABI literature has yet to emerge.
Lack of theoretical consensus
In hindsight, the lack of research related to our question may have been augured by the lack of agreement regarding the mechanisms of NFTs more generally. Despite some points of apparent theoretical consensus (Marzbani, Marateb, & Mansourian, Reference Marzbani, Marateb and Mansourian2016), a broader survey of the NFT literature reveals little agreement regarding therapeutic targets, anticipated outcomes, or clear rationale for specific interventions. Where rationale is provided, clinical reasoning typically relies on tenuous correlations between unspecific cortical markers and specific cognitive states, symptoms, or functional capacities (e.g., Schabus et al., Reference Schabus, Griessenberger, Gnjezda, Heib, Wislowska and Hoedlmoser2017). This lack of theoretical cohesion is further illustrated – and perpetuated – by an absence of a priori hypotheses, unclear operational definitions of treatment success, and extreme diversity of outcome measures throughout the literature. Lacking this understanding of how exactly NFTs confer specific benefits, it is unsurprising that well-established therapeutic targets for specific pathologies have failed to emerge. Although these criticisms are far from novel (Rossiter, Reference Rossiter2004), a review of the broader NFT literature reveals little attempt to remedy these shortcomings.
Lack of control
Half of the articles screened for this review suffered from a lack of clear and/or meaningful experimental control. RCTs are frequently considered the evidentiary ‘gold standard’ for intervention efficacy research; however, only one eligible NFT–ABI RCT was found. Instead, we encountered numerous quasi-experimental designs, uncontrolled pre–post comparison studies, and case studies. While there is precedent for using rigorous case studies to establish an evidence base (Chambless et al., Reference Chambless, Baker, Baucom, Beutler, Calhoun, Crits-christoph, Daiuto, DeRubeis, Detweiler, Haaga, Johnson, McCurry, Mueser, Pope, Sanderson, Shoham, Stickle, Williams and Woody1998; Chambless & Hollon, Reference Chambless and Hollon1998; Tate et al., Reference Tate, Perdices, Rosenkoetter, Shadish, Vohra, Barlow, Horner, Kazdin, Kratochwill, McDonald, Sampson, Shamseer, Togher, Albin, Backman, Douglas, Evans, Gast, Manolov, Mitchell, Nickels, Nikles, Ownsworth, Rose, Schmid and Wilson2016), adequate and meaningful control (e.g., an A-B-A-B design) was distinctly lacking among the case studies identified. One possible contributor to this may be ex post facto selection of clinical cases for publication (e.g., Thornton, Reference Thornton2000). Rather than illustrating the utility of NFTs, the uncontrolled and narrative nature of these studies may do little more than add noise to an already indeterminate literature. While this lack of control precludes the determination of treatment efficacy in general, it may be especially problematic for research on NFT effects that are argued to be especially prone to nonspecific influences, such as demand characteristics and placebo effects (Thibault, Lifshitz, & Raz, Reference Thibault, Lifshitz and Raz2016; Thibault & Raz, Reference Thibault and Raz2017).
Ethical considerations
Despite a glut of available publications focused on neurofeedback, the vast majority of the literature employed studies that either implemented neurofeedback as an assessment modality or an outcome measure unto itself (e.g., Ibric, Dragomirescu, & Hudspeth, Reference Ibric, Dragomirescu and Hudspeth2009). These uses are somewhat at loggerheads with NFTs’ widespread community use as therapeutic interventions for ABI-related functional and cognitive impairment. Despite the limited evidence that NFT–ABI provides superior outcomes, NFT practitioners continue to advertise their approach as empirically supported. Whether NFTs are being administered by certified neurofeedback technicians, counsellors, psychologists, or other professional clinicians, providing costly services without verifiable benefits constitutes unethical, and potentially harmful, practice (Canadian Psychological Association, 2000). Such ethical concerns are further elevated considering the vulnerability of the ABI population – persons who may be highly motivated to find a ‘cure’ for their cognitive and emotional difficulties yet, by virtue of their cognitive impairment, may have difficulty ascertaining the evidence in support of therapies such as NFTs.
Absent codified ethical guidelines for research interpretation and publication, the validity of reported NFTs’ outcomes is also suspect. In light of the many methodological concerns outlined previously, the significant effects reported by at least one of the reviewed studies raise concerns regarding the potentially partisan motivations of much NFT research (Thibault & Raz, Reference Thibault and Raz2017). This concern is not restricted to researchers but also extends to the selective publication of research in support of NFTs’ efficacy on a broader scale. Academic journals with a vested interest in presenting impactful work may demonstrate a greater willingness to publish studies with significant results, even if these are misrepresentative of the current corpus of knowledge (Rosenthal, Reference Rosenthal1979). This concern is particularly relevant to the current review as half of the reviewed studies were published by an NFT-specific journal. Although this issue is clearly not confined to NFT literature, it may be particularly harmful when applied to clinical topics, and even more so when addressing treatment for a vulnerable population with a time-limited postacute window for optimal recovery (Christensen et al., Reference Christensen, Colella, Inness, Hebert, Monette, Bayley and Green2008; Jaffe, Polissar, Fay, & Liao, Reference Jaffe, Polissar, Fay and Liao1995). Thus, to the extent that patients engage in unsupported therapies to the exclusion of other interventions, NFTs may actually impart an iatrogenic effect by interfering with better-supported rehabilitation during the most sensitive period of post-ABI recovery (e.g., ‘opportunity cost’: Lilienfield, Lynn, & Lohr, Reference Lilienfield, Lynn and Lohr2003).
Recommendations for Future Research
The principle of ‘training one’s brain’ through EEG is not an unworthy premise. Unfortunately, the current state of the evidence precludes any firm conclusions about whether and how NFTs may actually benefit cognitive functioning. As public interest in NFTs is unlikely to diminish any time soon, it behooves researchers and clinicians to conduct more rigorous research to determine for whom NFTs are effective, under what conditions, and via which mechanisms of action. In support of this, we provide some recommendations below.
Mechanism of action
Despite the agglomeration of basic and clinical studies over the past several decades, there remains little clarity about how, why, or whether NFTs are effective for ABI rehabilitation. Given that these foundational questions persist unabated, an alternative strategy for establishing a cohesive literature may be advised. Rather than continue to generate exploratory clinical trials, a more fruitful aim for future studies may be to present a clear and unified rationale as to what the mechanisms of action for given NFTs are within a given population, and how applying a particular NFT approach would address the relevant clinical issue. For example, a sizeable body of literature indicates problems with attentional bias in persons with PTSD (Mozzambani et al., Reference Mozzambani, Fuso, Malta, Ribeiro, Pupo, Flaks and Mello2017; Russman Block et al., Reference Russman Block, King, Sripada, Weissman, Welsh and Liberzon2017); as such, NFTs may be applied to promote attentional flexibility by training more frontally oriented bands of EEG activity.
Paralleling Whyte et al.’s (Reference Whyte, Dijkers, Hart, Zanca, Packel, Ferraro and Tsaousides2014) recommendations for developing theoretically informed rehabilitation practices, NFT–ABI researchers are encouraged to engage in a theory-driven deductive empirical process as opposed to further advancement along the more inductive exploratory path that predominates the literature currently. At its core, this would require the generation of provisional a priori frameworks founded on an understanding of clinically relevant diagnostic criteria, psychopathology, neuropsychological functioning, and neuroanatomical/electrophysiological sequelae. In practice, this would likely require the measurement of brain-derived metrics (e.g., EEG, fMRI) in addition to cognitive, behavioural, or functional outcomes to establish a clearer association between given NFT protocols, their potential impact on neuroanatomical functioning, and clinical outcomes. Such an approach would not only allow for more rigorously evaluable hypotheses, but also more ecologically valid outcomes and successive refinement of specific NFT protocols.
Study design
Adequately controlled investigations are necessary to establish any evidence-based practice. While RCTs are encouraged, the RCT model may not be the most appropriate for patients and treatments that are highly individual by definition. Instead, controlled case studies may provide more nuanced and ecologically relevant information on the effectiveness of NFT–ABIs. Single-case designs may prove particularly expedient for refining inchoate theoretical frameworks due to their relatively low resource demands. Unfortunately, single-case studies are rarely conducted in a manner that is amenable to drawing causal conclusions. Rather than discourage single-case studies, it is recommended that future attempts be approached with the same degree of intention and a priori reasoning as more standard experimental approaches. In support of this recommendation, future researchers are encouraged to consult the Single-Case Reporting guideline In BEhavioural interventions (SCRIBE; Tate et al., Reference Tate, Perdices, Rosenkoetter, Shadish, Vohra, Barlow, Horner, Kazdin, Kratochwill, McDonald, Sampson, Shamseer, Togher, Albin, Backman, Douglas, Evans, Gast, Manolov, Mitchell, Nickels, Nikles, Ownsworth, Rose, Schmid and Wilson2016) in order to maximize their methodological rigour and potential clinical impact.
Control conditions
In our initial pool of candidate studies, even where experimental control was ostensibly present (e.g., Kober et al., Reference Kober, Schweiger, Witte, Reichert, Grieshofer, Neuper and Wood2015; Thornton & Carmody, Reference Thornton and Carmody2013), closer analysis frequently revealed only partially reported group comparisons or poorly defined comparison groups that represented inadequate control of demographic variables. Further research employing rigorous methodology and meaningful comparison groups is needed. Naturally, this will be aided by clearer clinical targets and, perhaps, a focus on specific symptom remediation rather than overall ‘recovery’. Beyond the call for control in general, sham treatment controls may be of particular use in distinguishing the specific impact of NFTs from placebo effects (e.g., Chow, Javan, Ros, & Frewen, Reference Chow, Javan, Ros and Frewen2017).
Given that significant spontaneous recovery is expected to occur within the first year post brain injury (Rabinowitz, Hart, Whyte, & Kim, Reference Rabinowitz, Hart, Whyte and Kim2017; Schretlen & Shapiro, Reference Schretlen and Shapiro2003), researchers need to account for the fact that any observed NFT effects may be simply due to natural recovery processes. Therefore, any NFT–ABI studies conducted within the first year of recovery may be best served by including an active control group to deal with spontaneous recovery effects, while also avoiding the ethical issue of treatment denial during the most sensitive window of recovery. Finally, given the confounding and significant heterogeneity in NFT protocols, it is difficult to determine whether the relatively low observed success rate of NFTs is due to inadequate treatment or individual neuroanatomical differences, psychological factors, or cognitive strategies (Kober et al., Reference Kober, Schweiger, Witte, Reichert, Grieshofer, Neuper and Wood2015; Ninaus et al., Reference Ninaus, Kober, Witte, Koschutnig, Stangl, Neuper and Wood2013; Wood, Kober, Witte, & Neuper, Reference Wood, Kober, Witte and Neuper2014). Future studies should measure such parameters as they may serve as important moderators of treatment response.
Potential bias
There is a need for clearer acknowledgement and counteraction of clinician–researcher and journal publication bias. While this tendency is apparent in other fields as well (e.g., mindfulness: Van Dam et al., Reference Van Dam, van Vugt, Vago, Schmalzl, Saron, Olendzki, Meissner, Lazar, Kerr, Gorchov, Fox, Field, Britton, Brefczynski-Lewis and Meyer2018), the high monetary cost and remarkable level of institutional support provided for NFT–ABIs present a greater opportunity for exploitation based on misinformation than is the case for other interventions. Future studies may consider instituting ‘adversarial collaborations’ as a matter of course, whereby skeptical co-investigators are appointed to balance the potentially optimistic interpretations provided by NFT-endorsing researchers (see Matzke et al., Reference Matzke, Nieuwenhuis, van Rijn, Slagter, van der Molen and Wagenmakers2015).
Community-based research considerations
Independent clinical researchers in the community may contend with distinct obstacles. One major challenge facing community-based researchers is that of representative sampling. In addition to having limited access to nonclinical control samples, community-based researchers often rely on convenience sampling methods, which may introduce systematic bias based on premorbid participant characteristics that are difficult to account for. Moreover, clients presenting for treatment are already intrinsically motivated to participate in NFTs, a form of selection bias that would typically be addressed in an RCT design. Where purposeful sampling is impractical, community-based researchers are encouraged to consider conducting rigorous single-case studies instead. These require less research infrastructure and provide unique insight into treatment effectiveness (vs. efficacy; Chambless & Hollon, Reference Chambless and Hollon1998).
Another challenge is that community-based researchers may have limited expertise in the analysis of EEG data beyond their typical NFT practice, limited understanding of neuroanatomy or neuropsychological functions, or limited experience with the design and conduct of rigorous empirical research. Given these challenges, it is strongly advised that clinicians seeking to conduct NFT–ABI research consult closely with colleagues who are formally trained and experienced in clinical research methods and statistical/EEG analysis (Thibault & Raz, Reference Thibault and Raz2017). Close partnership with larger academic organizations may provide the infrastructure necessary for the coordination of large-scale multisite studies as well as access to larger clinical participant pools. Indeed, cooperation between community-based researchers and research institutions may be the most expedient and economical means to establish the ethical and practice guidelines necessary for NFT–ABI.
CONCLUSION
The goal of this review was to evaluate the efficacy of NFTs for cognitive rehabilitation following ABI by means of systematic review; however, it became clear over the course of the attempt that a systematic review for this application of NFTs is premature. Consistent with documented issues in other NFT literature, the authors of the current review noted concerns regarding potential reporting bias, inadequate or absent control, and other methodological issues among NFT–ABI research. These problems not only raise questions regarding the quality of NFT–ABI research but, by extension, raise considerable concerns regarding the ethical merits and advisability of providing NFT–ABI to the greater public at this early juncture. In light of these obstacles, it is recommended that the provision of NFT–ABI be suspended until a larger evidence base for such treatment is provided.
ACKNOWLEDGEMENTS
JIA receives financial support from the Alzheimer Society of Canada (Dr. and Mrs. Albert Spatz Award, 18-29). JV receives financial support from the Natural Sciences and Engineering Research Council of Canada (NSERC; CGSD3-518123-2018).
CONFLICT OF INTEREST
The authors have nothing to disclose.
Appendix A Search Terms
(‘neurofeedback’ or ‘neurotherapy’ or ‘EEG therapy’ or ‘QEEG’ or ‘quantitative EEG’)
and (‘*brain injury’ or ‘TBI’ or ‘ABI’ or ‘stroke’)
and (‘*rehabilitation’ or ‘*training’ or ‘intervention’ or ‘therap*’)
and (‘neuropsych*’ or ‘memory’ or ‘attention*’ or ‘executive’ or ‘processing’ or ‘cogni*’ or ‘arousal’)
Note: Search terms related to brain–computer interfacing or ‘BCI’ were excluded to reduce noise following an initial pilot search. Articles identified by these keywords were found to pertain exclusively to physical functioning and/or conditions unrelated to acquired brain injury.
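For readers wishing to reproduce or adapt the search, the four term groups above can be assembled programmatically. The sketch below is illustrative only: the function and variable names are hypothetical, straight quotation marks stand in for the typographic ones printed above, and the exact operator and wildcard syntax accepted by any given database is an assumption that should be checked against that database's documentation.

```python
# Term groups copied from Appendix A; terms within a group are joined
# with "or" and the groups themselves are joined with "and".
TERM_GROUPS = [
    ["neurofeedback", "neurotherapy", "EEG therapy", "QEEG", "quantitative EEG"],
    ["*brain injury", "TBI", "ABI", "stroke"],
    ["*rehabilitation", "*training", "intervention", "therap*"],
    ["neuropsych*", "memory", "attention*", "executive", "processing",
     "cogni*", "arousal"],
]


def build_query(groups):
    """Return a single boolean search string from a list of term groups."""
    clauses = []
    for terms in groups:
        quoted = " or ".join(f"'{term}'" for term in terms)
        clauses.append(f"({quoted})")
    return " and ".join(clauses)


if __name__ == "__main__":
    print(build_query(TERM_GROUPS))
```

Keeping the groups as data rather than as a single hand-typed string makes it easier to document later changes to the strategy, such as the exclusion of BCI-related terms described in the note above.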