Introduction
Depression frequently accompanies physical illness (Vos et al. 2012). It is clinically important because it is associated with worse physical symptoms, poorer quality of life and greater functional disability (Katon, 1996; Egede, 2007). Patients with depression also spend more time in hospital, are less likely to adhere to medical treatments, and consequently incur higher healthcare costs (DiMatteo et al. 2000). Despite its importance, the management of depression comorbid with physical illness is often suboptimal with low rates of detection and treatment (Katon & Sullivan, 1990; Hirschfeld et al. 1997; Kessler et al. 1999; Balestrieri et al. 2002; Cepoiu et al. 2008; Walker et al. 2014).
Admission to a general hospital provides an important opportunity to improve the management of comorbid depression. Detection can be improved by incorporating systematic screening into hospital admission procedures and treatment can be initiated where appropriate (Beach et al. 2015). However, in order to plan such a management strategy we need to know how common depression is in general hospital inpatients. Surprisingly, we currently lack a clear answer to this basic question. This is largely because previous systematic reviews of the prevalence of depression have focussed on study populations with specific physical diseases such as cancer, rather than on specific clinical settings such as general hospital wards (Thombs et al. 2006a, b; Delville & McDougall, 2008; Craig et al. 2009; Davydow et al. 2009; Poynter et al. 2009; Kouwenhoven et al. 2011; Mitchell et al. 2011; Shanmugasegaram et al. 2012; Castro-de-Araujo et al. 2013; Singer et al. 2013; Walker et al. 2013; Wilder Schaaf et al. 2013; Wiseman et al. 2013).
We therefore aimed to conduct a systematic review of studies of the prevalence of depression in general hospital inpatients. We focussed on interview-based studies in order to estimate the proportion of patients with a definite diagnosis of depressive illness.
Methods
Search strategy and selection criteria
We performed a systematic review of studies of the prevalence of depression in general hospital inpatients, using procedures that accorded with the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines (Moher et al. 2009). We identified studies by searching EMBASE, Medline and PsycINFO (from 1974, 1946 and 1967 respectively) to December 2015. Searches were run for the combination of ‘prevalence’, ‘general hospital inpatient’ and ‘depression’ using both standardised subject terms and free text terms, including synonyms and alternative spellings. We provide full details of the searches used in the online appendix. We also manually searched the reference lists of review articles obtained through the electronic searches.
We judged studies to be relevant to the review if they met all the following criteria: (1) the study clearly aimed to estimate the prevalence of depression (i.e. studies that were designed to address a different research question but happened to include a prevalence estimate, such as clinical trials or questionnaire validation studies, were not included); (2) all study participants were adults (aged 16 or older); (3) all study participants (or a clearly defined subgroup for which there was an estimate of depression prevalence) were general hospital inpatients at the time of depression assessment; (4) the presence of depressive illness was determined using a diagnostic interview. We only included primary studies (i.e. not reviews) for which we could obtain the full paper to allow data extraction. No language restrictions were applied.
When selecting publications to include in the review, we also applied quality criteria to the reported study methods. In order to ensure a consistent and transparent approach to this quality assessment, we used a checklist based on the work of Loney et al. (Loney et al. 1998; Walker et al. 2013). We used a checklist rather than a continuous scale to ensure that all the key aspects of the study methods met basic quality criteria (Juni et al. 1999). The basic methodological standards required for inclusion were: (1) the study sample was obtained using a random or consecutive sampling method; (2) data were available for analysis on at least 70% of the eligible patients (either as reported by the authors or derived from presented data); (3) depressive illness was defined using standard diagnostic criteria: major depression from the Diagnostic and Statistical Manual of Mental Disorders (DSM), depressive episode from the International Classification of Diseases (ICD) or similar (World Health Organization, 1992; American Psychiatric Association, 1994, 2013). The first two of these criteria aimed to minimise selection bias, and the third aimed to ensure that estimates could be compared across studies.
Data collection
We screened the titles and abstracts of all articles identified by the searches to determine whether each might meet the selection criteria. We then reviewed the full text of the article, with the help of a translator where necessary, if there was any possibility that it might be relevant and would meet our quality criteria. This process (including a screening of titles and abstracts) was conducted independently by two researchers with reference to a third researcher to resolve disagreements.
Two researchers independently extracted the following data from all the articles included in the review, using a specially designed, standardised data extraction form: country in which the study took place; age, sex and clinical characteristics of participants; sample size; type of depression interview used and profession of interviewer; diagnostic criteria used to determine the presence of depressive illness; prevalence of depression in the sample (for cohort studies, we extracted the prevalence of depression at the first time point only).
Clinical analysis
Two researchers reviewed the data extracted on participants’ clinical characteristics in order to assess their similarity across studies. We found that there was high clinical heterogeneity, indicating that a meta-analysis of all the studies would not yield meaningful results. Whilst some studies had recruited general medical and surgical inpatients, others had recruited inpatients with very specific clinical characteristics, so their samples were unrepresentative of the general hospital inpatient population. In order to deal with this clinical heterogeneity, we restricted our statistical analysis to studies of general medical and surgical inpatients. The studies of more specific patient groups are described in our results and online appendix in order to provide the reader with a comprehensive overview of the relevant literature.
Statistical analysis
We used forest plots to display the proportion (with exact binomial 95% confidence intervals) of participants diagnosed with depression in each study (Newcombe, 1998).
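As an illustration of what an exact binomial interval involves, the following is a minimal sketch of the Clopper-Pearson construction, which is the usual meaning of "exact" in this context. It is not the code used in the review: the function names and the example counts are hypothetical, and the limits are found by simple bisection on the binomial distribution function rather than through a statistics library.

```python
from math import comb


def binom_cdf(k, n, p):
    """Return P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def exact_ci(cases, n, level=0.95):
    """Clopper-Pearson (exact binomial) interval for the proportion cases/n."""
    alpha = 1 - level

    def solve(f, target):
        # f is decreasing in p on [0, 1]; bisect until f(p) is close to target.
        lo, hi = 0.0, 1.0
        for _ in range(100):
            mid = (lo + hi) / 2
            if f(mid) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower limit: the p at which P(X >= cases) equals alpha / 2.
    if cases == 0:
        lower = 0.0
    else:
        lower = solve(lambda p: binom_cdf(cases - 1, n, p), 1 - alpha / 2)

    # Upper limit: the p at which P(X <= cases) equals alpha / 2.
    if cases == n:
        upper = 1.0
    else:
        upper = solve(lambda p: binom_cdf(cases, n, p), alpha / 2)

    return lower, upper


# Hypothetical study: 12 of 100 inpatients diagnosed with depression.
low, high = exact_ci(12, 100)
print(f"prevalence 12.0%, 95% CI {low:.1%} to {high:.1%}")
```

The interval is not symmetric around the observed proportion, which is one reason exact intervals are often preferred to the normal approximation for small samples or low prevalences.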
We used random-effects models to describe the prevalence of depression in general medical and surgical inpatients. This is because it is implausible that the underlying study-specific prevalence of depression (i.e. the prevalence that would be observed were a study of infinite size) is exactly the same for each study. Prevalence is likely to vary from study to study according to factors, both measured and unmeasured, that differ between them (Stroup et al. 2000). Random-effects models assume that the populations investigated in each study are themselves drawn from a wider population of populations and that the underlying study-specific prevalences in these populations therefore follow a statistical distribution, rather than taking a single value.
As is common for proportions, we used the logit transformation, expressing each of the prevalences as a log-odds. Accordingly, our random-effects models assume that the logit-transformed prevalences follow a normal distribution with a mean and standard deviation. This mean can be thought of as a ‘typical’ prevalence, while the standard deviation quantifies the underlying between-study variability in prevalences. This variability is summarised by a 95% prediction interval, which is the interval within which 95% of underlying study-specific prevalences are predicted to lie (for a thorough discussion of this topic see Guddat et al. 2012). As such it differs from a 95% confidence interval, which quantifies the precision of the mean of the study-specific prevalences (with the mean defined after logit transformation).
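The model described above can be written compactly as follows. The notation is ours rather than the paper's: $p_i$ is the underlying prevalence in study $i$, $\mu$ and $\tau$ are the mean and standard deviation on the log-odds scale, and $k$ is the number of studies. The prediction-interval formula shown is one commonly used form and is an assumption here, since the text does not state which formula was applied.

```latex
\begin{align*}
\theta_i &= \operatorname{logit}(p_i) = \log\frac{p_i}{1 - p_i},
  \qquad \theta_i \sim \mathcal{N}(\mu, \tau^2) \\
\text{95\% confidence interval for } \mu &:
  \quad \hat\mu \pm 1.96\,\mathrm{SE}(\hat\mu) \\
\text{95\% prediction interval} &:
  \quad \hat\mu \pm t_{k-2,\,0.975}\sqrt{\hat\tau^2 + \mathrm{SE}(\hat\mu)^2} \\
\text{back-transformation} &:
  \quad p = \operatorname{logit}^{-1}(\theta) = \frac{1}{1 + e^{-\theta}}
\end{align*}
```

The confidence interval narrows as more studies are added, whereas the prediction interval cannot shrink below the spread implied by $\tau$, which is why the two can differ substantially when heterogeneity is high.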
We used the inverse variance method of DerSimonian and Laird to estimate between-study heterogeneity in underlying depression prevalence, and quantified it with the I-squared measure, which represents the proportion of total variance attributable to this heterogeneity (Higgins et al. 2003). The assumption that underlying prevalences are normally distributed after logit transformation was not contradicted by our data.
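The following is a minimal sketch of the DerSimonian and Laird calculation applied to logit-transformed proportions, together with I-squared. It is an illustration under stated assumptions and not the review's own code (the analysis itself was run with the R ‘meta’ package): the function name and the study counts are hypothetical, the within-study variance uses the standard approximation 1/cases + 1/non-cases, and every study is assumed to have at least one case and one non-case.

```python
from math import exp, log, sqrt


def logit(p):
    return log(p / (1 - p))


def expit(x):
    return 1 / (1 + exp(-x))


def dersimonian_laird(cases, totals):
    """Pool proportions on the logit scale with the DerSimonian-Laird estimator."""
    # Logit-transformed prevalence and its approximate within-study variance.
    y = [logit(c / n) for c, n in zip(cases, totals)]
    v = [1 / c + 1 / (n - c) for c, n in zip(cases, totals)]
    k = len(y)

    # Inverse-variance (fixed-effect) weights and Cochran's Q.
    w = [1 / vi for vi in v]
    y_fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, y))

    # Method-of-moments estimate of the between-study variance tau^2.
    scale = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / scale)

    # I-squared: proportion of total variability attributed to heterogeneity.
    i2 = max(0.0, (q - (k - 1)) / q) if q > 0 else 0.0

    # Random-effects weights, pooled mean (log-odds) and its standard error.
    w_re = [1 / (vi + tau2) for vi in v]
    mu = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se = sqrt(1 / sum(w_re))

    return {
        "pooled": expit(mu),
        "ci": (expit(mu - 1.96 * se), expit(mu + 1.96 * se)),
        "tau2": tau2,
        "i2": i2,
    }


# Hypothetical data: depressed patients and sample sizes in four studies.
result = dersimonian_laird(cases=[10, 30, 8, 45], totals=[100, 120, 200, 150])
print(f"pooled prevalence {result['pooled']:.1%}, I-squared {result['i2']:.0%}")
```

All calculations are done on the log-odds scale and only the final estimates are transformed back to proportions, which matches the description of the mean being defined after logit transformation.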
We investigated potential sources of the heterogeneity that we observed between the studies’ prevalence estimates (that is, the large amount of between-study variability compared with the total variability) by considering some of the known differences between the studies. To this end, we inspected scatter plots of depression prevalence against year of study publication, sample size, average (or another available measure of central tendency) participant age, and percentage of female participants. We used forest plots to compare depression prevalence in studies grouped by use of DSM major depression v. other diagnostic criteria for depression, and national income of the country where the study took place (we used income groupings because the studies had been done in too many different countries to group by country) (The World Bank, 2015). Where evidence of an association with depression prevalence was apparent, we performed a mixed-effects meta-regression and present its I-squared statistic, odds ratio and p value for the association (Thompson & Higgins, 2002). We did not present funnel plots for bias assessment because in the presence of heterogeneity, there is no reason to expect a funnel shape (Terrin et al. 2005). Statistical analysis was performed in R v3.2.2 using the ‘meta’ package v3.8-0 (Schwarzer et al. 2014; R Core Team, 2015). Graphs were produced in R and Stata v14 (StataCorp, College Station, TX, USA).
Results
Our initial screening of 23 775 titles and abstracts yielded 4161 articles for full paper review. We considered 158 of these to be relevant to the review. Of these 158 articles, 65 (41%), describing 60 separate studies, met our quality criteria and were included (see Figs 1, 2 and online appendix) (Heeren & Rooymans, 1985; Feldman et al. 1987; Yellowlees et al. 1987; Starkstein et al. 1988; Hardman et al. 1989; O'Riordan et al. 1989; Seltzer, 1989; Abiodun & Ogunremi, 1990; Kigamwa, 1991; Koenig et al. 1991, 1993, 1997; Lazaro et al. 1991, 1995; Kathol & Wenzel, 1992; Kok et al. 1992, 1995; Snyder et al. 1992; Thalassinos et al. 1992; Alexander et al. 1993; Fenton et al. 1994; Jenkins et al. 1994; Kishi et al. 1994; Ng et al. 1995; Arnold & Privitera, 1996; Arolt & Driessen, 1996; Blomstedt et al. 1996; Lykouras et al. 1996; Silverstone, 1996; Arolt et al. 1997; Nair & Pillay, 1997; Hosaka et al. 1999; Linka et al. 1999, 2000; Kugaya et al. 2000; Uwakwe, 2000; Wancata et al. 2000; Madianos et al. 2001; Aghanwa & Ndububa, 2002; Prieto et al. 2002; Sharma et al. 2002; Fritzsche et al. 2003; Petrak et al. 2003; Atesci et al. 2004; Marchesi et al. 2004; Blumel et al. 2005; Dogar et al. 2008; Dyster-Aas et al. 2008; Soeiro et al. 2008; Pakriev et al. 2009; Koroglu & Tural, 2010; Palmu et al. 2010, 2011; Zhong et al. 2010; Baubet et al. 2011; Kumar et al. 2012; Regvat et al. 2011; Turner et al. 2011; Annagür et al. 2013; Kayhan et al. 2013; Moayedoddin et al. 2013; Singer et al. 2013; Yan et al. 2013; Zhao et al. 2014; Topitz et al. 2015).
These studies had been conducted in 29 countries (see online appendix for map) and had included a total of 12 540 participants (median sample size 109, range 27–993).
A variety of interviews and associated diagnostic criteria were used. The most commonly used interview (16 studies) was the Structured Clinical Interview for DSM-IV (SCID) and the most commonly used diagnostic criteria (used in 47 studies) were those for DSM major depression (American Psychiatric Association, 1994; First et al. 1996). The majority of studies (40) had employed a psychiatrist or clinical psychologist to conduct the diagnostic interviews.
The study sample was of general medical or surgical inpatients (or both) in 31 of the studies (median sample size 215, range 65–993, with a total of 9305 participants; see Table 1) (Feldman et al. 1987; Seltzer, 1989; Abiodun & Ogunremi, 1990; Kigamwa, 1991; Koenig et al. 1991, 1993, 1997; Lazaro et al. 1991, 1995; Kathol & Wenzel, 1992; Kok et al. 1992, 1995; Thalassinos et al. 1992; Fenton et al. 1994; Jenkins et al. 1994; Arolt & Driessen, 1996; Silverstone, 1996; Arolt et al. 1997; Nair & Pillay, 1997; Hosaka et al. 1999; Linka et al. 1999, 2000; Uwakwe, 2000; Wancata et al. 2000; Sharma et al. 2002; Marchesi et al. 2004; Soeiro et al. 2008; Pakriev et al. 2009; Koroglu & Tural, 2010; Zhong et al. 2010; Kumar et al. 2012; Kayhan et al. 2013; Moayedoddin et al. 2013; Yan et al. 2013; Topitz et al. 2015). These studies reported prevalence estimates for depression that ranged from 5% to 34% (see Fig. 3). The high heterogeneity observed between study findings (I-squared 90%) indicated that no single estimate was sufficient to describe the prevalence of depression in general medical and/or surgical inpatients. Our random-effects model assumed that the underlying study-specific prevalences followed a normal distribution (on the log-odds scale). The mean of this distribution corresponded to a prevalence of 12% (95% CI 10–15%) with 95% of all populations predicted to have an underlying depression prevalence between 4% and 32% (the prediction interval).
Notes to Table 1: DSM, Diagnostic and Statistical Manual of Mental Disorders; ICD, International Classification of Diseases; MMSE, Mini Mental State Examination; VA, Veterans Affairs.
*Calculated using data from paper.
a Lower-middle income country.
b Demographic data only available on a larger sample of which subgroup was interviewed.
c High income country (World Bank classification).
d Upper-middle income country.
e Endocrinology, nephrology, haematology, gastroenterology, rheumatology, oncology, cardiology, chest disease, infectious diseases, dermatology, physical medicine and rehabilitation, neurology, general surgery, chest surgery, cardiovascular surgery, plastic and reconstructive surgery, urology, orthopaedics, otorhinolaryngology, neurosurgery and gynaecology and obstetrics.
f Depressive episode alone or as part of recurrent depressive disorder or bipolar disorder.
g Haematology, neurology, nephropathy, gastroenterology, dermatology, integrated Chinese and western medicine, cardiology, endocrinology, respiratory, medical oncology, hand surgery, thoracic surgery, orthopaedics, urinary surgery, neurosurgery, sports medicine, maxillofacial surgery, general surgery, tumour surgery, pancreatic surgery and obstetrics.
h ‘Class III (the highest class) hospitals are large comprehensive hospitals integrating the best comprehensive medical services with teaching, research and preventive medicine. They have over 501 hospital beds and are equipped with the most advanced medical equipment and technologies. Class II hospitals provide general medical services for several communities and undertake some training and scientific research tasks. They have 101–500 hospital beds.’
In our investigations of potential sources of this observed heterogeneity, visual inspection of the scatter and forest plots suggested that percentage of female participants, study sample size, the income band of the country in which the study was done, and the diagnostic criteria used (but not year of study publication or average participant age) may all be associated with the observed prevalence of depression. We therefore tested the association of these variables with depression prevalence, expressing each association as an odds ratio (OR). Studies with a higher percentage of female participants reported a lower prevalence of depression (OR 0.82 per 10 percentage point increase in female participants, 95% CI 0.71–0.95, p = 0.007). Studies with larger sample sizes reported lower prevalences (OR 0.82 per doubling in size, 95% CI 0.68–0.99, p = 0.043). The associations with national income in the country in which the study was done (p = 0.292) and with the diagnostic criteria used for depression (p = 0.154) were not statistically significant. Notably, in all our investigations the residual heterogeneity remained high (all I-squared >88%, see online appendix for scatter plots and forest plots), meaning that a very high proportion of the heterogeneity remained unaccounted for by the variables we considered.
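To give a sense of what an odds ratio of 0.82 means on the prevalence scale, the sketch below applies it to the odds of an assumed baseline prevalence. The baseline of 12% is borrowed from the pooled estimate purely for illustration; it is not a quantity reported by the meta-regression, and the function name is hypothetical.

```python
def apply_odds_ratio(prevalence, odds_ratio, steps=1):
    """Return the prevalence after applying an odds ratio `steps` times."""
    odds = prevalence / (1 - prevalence)
    new_odds = odds * odds_ratio**steps
    return new_odds / (1 + new_odds)


# Assumed baseline prevalence of 12%; one step is a 10 percentage point
# increase in female participants (or one doubling of sample size).
baseline = 0.12
shifted = apply_odds_ratio(baseline, 0.82)
print(f"{baseline:.1%} -> {shifted:.1%}")
```

Under this assumed baseline, a single step moves the prevalence from 12% to about 10%, a shift that is small relative to the 4% to 32% prediction interval.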
In addition to the 31 studies of general medical and/or surgical patients, we identified 29 studies (median sample size 72, range 27–502, with a total of 3235 participants) of inpatients who were in a variety of specialist units (such as endocrinology or haematology) or had very specific clinical characteristics (such as a diagnosis of systemic sclerosis) (Heeren & Rooymans, 1985; Yellowlees et al. 1987; Starkstein et al. 1988; Hardman et al. 1989; O'Riordan et al. 1989; Snyder et al. 1992; Alexander et al. 1993; Kishi et al. 1994; Ng et al. 1995; Arnold & Privitera, 1996; Blomstedt et al. 1996; Lykouras et al. 1996; Kugaya et al. 2000; Madianos et al. 2001; Aghanwa & Ndububa, 2002; Prieto et al. 2002; Fritzsche et al. 2003; Petrak et al. 2003; Atesci et al. 2004; Blumel et al. 2005; Dogar et al. 2008; Dyster-Aas et al. 2008; Palmu et al. 2010, 2011; Baubet et al. 2011; Regvat et al. 2011; Turner et al. 2011; Annagür et al. 2013; Singer et al. 2013; Zhao et al. 2014). These studies reported prevalence estimates ranging from 2% to 56%. They are described in detail in the online appendix.
Discussion
This is the first systematic review of studies of the prevalence of depression in general hospital inpatients. The 60 studies that we found had been conducted in 29 countries and included a total of 12 540 participants. They reported a wide range of prevalence estimates. We reduced the clinical heterogeneity by focussing on the 31 studies of general medical and/or surgical inpatients. However, even in these studies, the estimated prevalence ranged from 5% to 34%. There was also a high degree of heterogeneity, indicating that even ‘general medical and/or surgical inpatients’ cannot be considered as a single population, but rather as a number of different populations, each with a different prevalence of depression. These populations had a typical prevalence of depression of 12% and we can predict that 95% of them have a prevalence between 4% and 32%. This typical prevalence of 12% is more than twice that in the general population, for which international studies suggest an average 12-month prevalence of approximately 5% (Kessler & Bromet, 2013).
Our analyses were unable to adequately explain the observed heterogeneity in prevalence estimates. The only variables that we found to be significantly associated with the prevalence of depression were sample size and the proportion of female patients in the study samples. As these explained only a trivial amount of the heterogeneity and the latter association was not in the expected direction (a higher proportion of female patients was associated with a lower prevalence of depression), we judge this finding to be of questionable importance. Our inability to explain the observed heterogeneity suggests that it resulted from variables we were unable to investigate because they had not been consistently reported in the publications we reviewed. These unreported variables might be at the population, healthcare system, patient or methodological level. At the population level, it is likely that national and local prevalence of depression in the general population varies. At the healthcare system level, hospital type (e.g. university, community), funding systems, admission pathways and medical staffing vary substantially. At the patient level, it is likely that the characteristics of patients admitted to general hospitals, and specifically to general medical and surgical (as opposed to sub-speciality) wards, vary. Methodologically, there are likely to be unreported variations in how the studies were done. These include how the patients were sampled (e.g. who was excluded), how the diagnosis of depression was made (e.g. the details of how the diagnostic interviews were conducted, whether physical symptoms were counted toward the diagnosis of depression or not and exactly how the diagnostic criteria were applied) and when the assessment was done during the period of hospitalisation (e.g. soon after admission or later in the stay).
This review has strengths which include: (a) the use of clearly defined inclusion criteria for papers to minimise selection bias; (b) the focus on studies where the diagnosis of depression was made by interview; (c) the exclusion of studies with major design flaws (Stroup et al. 2000; Moher et al. 2009). It also has limitations which include: (a) a reliance on the published reports to assess studies’ relevance and quality, which may have led us to exclude studies that were in fact well conducted, but poorly reported; (b) our inability to investigate all potential sources of heterogeneity because the publications we reviewed reported only limited relevant data.
Given the importance of the question we addressed in this review, we found a remarkably small literature, much of which was published some time ago. Furthermore, our quality assessments indicated that much of that literature was of poor quality. Common shortcomings were poor sampling strategies and the use of unclear case definitions for depressive disorder. Even the methodologically better studies selected for inclusion in this review were mostly small in size (median sample size 109) by epidemiological standards. There is consequently a clear and pressing need for better quality studies of the prevalence of depression in medical inpatients. If these are to inform service planning, they should aim both to determine the prevalence of depression in specific settings (such as National Health Service hospitals in the UK) and to clarify the determinants of the substantial apparent variations in prevalence noted in this review. Suggestions for the design of future studies are given in Table 2.
Despite the limitations of the available evidence, we can reasonably conclude that depression is sufficiently common in medical inpatients to make planning for its systematic management worthwhile. This management should include systematic identification of depression during hospital admissions, monitoring of depression once identified (during the hospital admission and thereafter, to determine whether it resolves post-discharge) and the initiation of treatment when it is persistent (Mayou et al. 1988; Kathol & Wenzel, 1992). Few hospitals currently have such systems. The approach we have tested for depression management in cancer patients provides a potential model for how we might improve depression care in all medical settings (Walker & Sharpe, 2014).
Supplementary material
The supplementary material for this article can be found at https://doi.org/10.1017/S0033291718000624.
Acknowledgement
This work was supported by the Oxford Academic Health Science Network; Sir Michael Sobell House Hospice, Oxford; and the National Institute for Health Research (NIHR) Collaboration for Leadership in Applied Health Research and Care Oxford at Oxford Health NHS Foundation Trust. The views expressed are those of the author(s) and not necessarily those of the NHS, the NIHR or the Department of Health.
Declaration of interest
None.