Introduction
It used to be rather simple. A patient presents with depression – prescribe an antidepressant based on evidence from clinical trials against placebo going back to the 1960s. Now, however, the benefits of antidepressants are increasingly questioned, for reasons ranging from distrust of ‘big Pharma’ and concerns about medicalizing distress (Moncrieff, 2008) through to doubts about how well antidepressants actually work in the treatment of depression (e.g. Kirsch et al. 2008; Pigott et al. 2010). The last is a deceptively simple question that is so hard to answer it raises the suspicion that there is something not quite right with its formulation. Depression is defined at a syndromal level but it is difficult to get away from the fact that patients who meet the syndromal diagnosis are heterogeneous. Randomized controlled clinical trials to test antidepressant efficacy (antidepressant efficacy trials; AETs), in contrast, aim for a homogeneous and low-risk patient group, usually in order to obtain results that can be used for licensing purposes. One of the problems in asking how well antidepressants work is defining the population we mean. Is it the population/s studied in AETs or the one/s seen in the clinic or community – do they differ, by how much, and does it matter in terms of outcome? If there are major differences, and it does matter, then the almost exclusive reliance on AETs to inform clinical treatment guidelines could pose a major problem.
Does AET eligibility affect outcome?
In their paper, van der Lem et al. (2010) set out to address this issue by studying a large cohort of depressed patients treated in secondary care in the Netherlands. They compared the outcome in those who would have been eligible for a ‘typical’ AET with those who would have been excluded from such a trial. In agreement with other studies (e.g. Zimmerman et al. 2002; Wisniewski et al. 2009) they found that only a minority of patients (17–25%, depending on the stringency of criteria) would have been eligible for an AET. The two most important factors determining eligibility in their cohort were severity of depression and co-morbid (mainly anxiety) disorders. In brief, AET eligibility did not influence outcome 5 months later, and factors involved in AET inclusion or exclusion (and the type of treatment) accounted for less than 5% of the variance in outcome. The authors conclude that the influence of eligibility for AETs on treatment outcome in clinical practice seems to be small.
This result appears at first sight to be at odds with those from the other main study to examine this, the large naturalistic Sequenced Treatment Alternatives to Relieve Depression (STAR*D) study set in the USA (Wisniewski et al. 2009). In this study the AET-eligible group had considerably higher rates of response (51.6% v. 39.1%) and remission (34.4% v. 24.7%) than the AET-ineligible group when they received open-label treatment with citalopram. There are, however, considerable differences between the two studies that are likely to explain the apparent discrepancy. I shall concentrate on the two most important. First, patients in the STAR*D study were all treated with a single antidepressant agent over an average of 8 weeks in a standardized manner, whereas those in the van der Lem et al. study had a variety of treatments (adequacy unknown) over 5 months and only just over half (54%) of those followed up received antidepressants. Second, there was a high drop-out rate in the van der Lem et al. study, with only 46% having a follow-up assessment compared with over 92% in the STAR*D study. Although there did not appear to be systematic baseline differences between those followed up and those lost to follow-up, the truth is that, with such a large percentage lost, we just cannot be sure what happened to them and how representative the results are. Overall, these limitations make it difficult to interpret what the van der Lem et al. study tells us about the influence of AET eligibility on outcome, and certainly on response to any specific treatment.
Can clinical trials ever be generalizable?
One of the striking features from the van der Lem et al. study is an even worse outcome than found in the STAR*D study, with only 28% responding and 21% remitting in the former study compared with 47% responding and 28% remitting in the latter (Trivedi et al. 2006). This probably reflects another, almost certainly important, difference between the studies – their patient populations. The Netherlands has a primary-care gatekeeper system so that referral to secondary care will be ‘filtered’ by factors such as treatment non-response, risk or treatment availability, making the patients not representative of depressed patients as a whole, or even those presenting to primary care. The USA does not have primary-care gatekeepers but the study required explicit consent from patients to participate in a clinical trial and there were exclusion criteria based on co-morbidity and previous treatment response (although not as strict as for an AET) so that only 65% of those agreeing to be screened entered into the study (Trivedi et al. 2006). Therefore the STAR*D patients were also only a subgroup of those seen in clinical practice, but a different subgroup to the Netherlands sample.
Where does this leave us regarding the question as to whether we can generalize from AETs to clinical practice? We can be certain that patients in AETs are not representative overall of depressed patients seen in clinical practice (the problem of what is a representative sample is of course another matter as it will vary by setting and healthcare system). The STAR*D study is the best evidence that we have so far that AET eligibility does influence treatment response but this should be no surprise given what we already know about predictors of outcome (O'Leary et al. 2000; Van Henricus et al. 2008; Kim et al. 2011). What we cannot tell from studies such as this is the degree to which we can use the results from AETs to guide practice. The efficacy rationale for antidepressants is based on separation from placebo, not overall outcome. From the limited evidence available the factors that influence response to antidepressants may also influence response to placebo in AETs (Stewart et al. 1989; Angst et al. 1993). The truth is that we just do not know whether the pharmacological effect (i.e. effect size against placebo) that we see in AETs reflects what occurs in other patient populations. The disappointingly poor response in these more general populations does, however, emphasize that antidepressants on their own have limited benefits and that other approaches are required.
Matching patient to treatment
We will never be able to ethically randomize a fully representative patient sample to placebo-controlled, or probably even comparator-controlled, trials (assuming we know what we mean by representative). Even if we could, patients will never be treated in clinical practice in the same way they are in a randomized clinical trial. On the other hand, the more naturalistic and representative a study is, the more outcome will be affected by a myriad of factors, making the contribution of an individual treatment hard to detect (in the absence of large sustained effects and ‘hard’ objective outcomes such as death or hospitalization). The global question as to whether we can extrapolate AETs to clinical practice therefore becomes somewhat metaphysical in that we can never answer it. AETs can provide evidence for a pharmacological antidepressant effect of a drug in groups of patients with syndromally defined depression, but over-interpreting the precision of the size of effect and over-generalizing it to an individual depressed patient seen in a clinical setting makes little sense. AETs do provide useful, but limited, information about when it may be helpful to treat with an antidepressant, what to use and the likely duration of benefit (Anderson et al. 2008). However, applying group effects to individual patients is always going to be inherently uncertain, and the less eligible a patient is for an AET the less certain we are able to be. This uncertainty and wide individual variation in response to specific treatments have led to current interest in whether or not it may be possible to personalize treatment for depression based on developing predictors for response such as sociodemographic and clinical characteristics, and biological markers (e.g. neuroimaging or genetic variation) (Simon & Perlis, 2010).
Currently this variability is unpredictable and, if largely stochastic like the weather, could remain so, but given the poor outcomes for depression we owe it to our patients to try to find useful predictors. At present, however, good clinical treatment of depressed patients has to remain a skilful art which uses best evidence to inform individualized treatment trials guided by careful evaluation of outcome. This needs to take place in the context of full clinical assessment, patient education and negotiation, and integration of psychosocial and drug treatment approaches.
Declaration of Interest
I.M.A. has received grant support and honoraria for speaking from pharmaceutical companies marketing antidepressants and chaired the Guideline Development Group for the National Institute for Health and Clinical Excellence Depression Guideline Update.