Introduction
Over recent decades, mental health conditions have become the leading cause of long-term disability in most middle- and high-income countries (Harvey, Henderson, Lelliott, & Hotopf, 2009; Murray et al., 2012), with depression now the leading cause of disease burden worldwide (World Health Organization, 2017). To date, most of the effort to reduce the burden of these disorders has been targeted at ensuring treatment is given to those with manifest disorders. Although effective treatments are available, cost-effectiveness models suggest that even if optimal treatment were delivered to all of those in need, only 35–50% of the overall burden of these conditions would be alleviated (Andrews, Issakidis, Sanderson, Corry, & Lapsley, 2004). Consequently, effective intervention prior to the onset of full-diagnostic disorder is fundamental in responding to this problem (Cuijpers, Beekman, & Reynolds, 2012).
There is increasing evidence that prevention of mental disorders is possible (Cuijpers, Van Straten, & Smit, 2005). However, the costs associated with delivering most traditional face-to-face prevention programmes inhibit scalability and make population-wide roll-out of these programmes unfeasible (Solomon, Proudfoot, Clarke, & Christensen, 2015). A recent meta-analysis found emerging evidence that common psychotherapeutic techniques, delivered through eHealth and mHealth (healthcare practices supported by Internet or mobile phone technologies, respectively), may be effective in preventing common mental disorders in general populations (Deady et al., 2017a). To date, though, large-scale, incidence-focused studies are lacking (Richards et al., 2016). Equally pressing is the need for high-quality studies incorporating attention-control conditions (Firth et al., 2017).
As the workplace is a dominant part of many adults' lives, it is a prime location for mental health prevention interventions (Tan et al., 2014). Furthermore, workplace issues are consistently reported as a major source of stress, which, in turn, has implications for a number of adverse psychological outcomes, including depression (Tennant, 2001). Whilst workplace stress and risk are not localized to any particular industry, those working in male-dominated industries (MDIs) have been found to have elevated rates of anxiety and mood disorders compared to other workers (Battams et al., 2014). The nature of work within these industries also highlights the limitations of conventional prevention programmes, which are rarely accessible, flexible or tailored. Rapid growth in the areas of eHealth and mHealth represents a new frontier for delivering and targeting health interventions (Firth et al., 2017; Linardon, Cuijpers, Carlbring, Messer, & Fuller-Tyszkiewicz, 2019). The present intervention was developed with specific tailoring to these particular industries (Peters, Deady, Glozier, Harvey, & Calvo, 2018). Although other apps for depression exist (Firth et al., 2017), none have been specifically designed for use in a workplace context. Furthermore, non-traditional CBT-based apps are rare and previous work in this area focuses entirely on symptom reduction in an unwell population. This trial aimed to evaluate the effectiveness of a new smartphone app designed to reduce depression symptoms and subsequent incident depression amongst a large group of Australian workers.
To our knowledge, this is the largest ever depression prevention trial and the largest ever trial of a mental health app.
Methods
Study design
A randomized controlled trial (RCT) was conducted in Australia with two parallel arms, comparing an app-based prevention intervention (HeadGear) and an attention-control app. The study was registered with the Australian New Zealand Clinical Trials Registry (ACTRN12617000548336), with ethical approval from the University of New South Wales (UNSW) Human Research Ethics Committee (HC17021). The study was conducted in accordance with relevant guidelines as evidenced by the published study protocol (Deady et al., 2018). The authors assert that all procedures contributing to this work comply with the ethical standards of the relevant national and institutional committees on human experimentation and with the Helsinki Declaration of 1975, as revised in 2008.
Participants
Participants were working Australians, with MDIs selectively targeted. An MDI is defined as one in which ⩾70% of workers are male (Australian Bureau of Statistics, 2008). Two recruitment methods were employed: promotion of the study via a range of Australian industry partners; and social media (Facebook, Twitter) advertising and postings targeted towards MDIs. All onboarding was completed electronically, including online consent, whereby participants explicitly agreed to each of the consent declaration items and were able to print/email the information sheet.
Eligible participants were required to have a valid telephone number, own an Apple/Android-operating smartphone, be currently employed and reside in Australia. Participants were excluded if they did not have reliable access to the Internet, could not read English or failed to provide their phone number. Participants were also not included if they had substantial levels of depression symptoms at baseline, as indicated by a score above 14 on the Patient Health Questionnaire-9 (PHQ-9) or meeting provisional major depressive disorder (MDD) diagnosis using the PHQ-9 algorithm (Kroenke, Spitzer, & Williams, 2001).
Randomization and masking
Randomization occurred immediately following baseline assessment using automated procedures integrated into the trial management software. The randomization algorithm followed a block design with a block size of 10, to ensure an equal number of participants were assigned to each condition. As both the intervention and the attention-control condition had the same download and installation process, participants were blinded to their allocation.
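As an illustration of the block design described above, the following is a minimal sketch of permuted-block randomization with a block size of 10 and two arms. It is not the trial management software used in the study; the function name `block_randomize`, the arm labels and the `seed` parameter are assumptions introduced for this example.

```python
import random


def block_randomize(n_participants, block_size=10,
                    arms=("intervention", "control"), seed=None):
    """Return one arm label per participant using permuted blocks.

    Hypothetical helper: each block holds an equal number of slots per arm,
    shuffled independently, so the arms stay balanced after every full block.
    """
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)
    slots_per_arm = block_size // len(arms)
    allocations = []
    while len(allocations) < n_participants:
        # Build one balanced block, then shuffle the order within it.
        block = [arm for arm in arms for _ in range(slots_per_arm)]
        rng.shuffle(block)
        allocations.extend(block)
    # The final block may be only partly used if enrolment stops mid-block.
    return allocations[:n_participants]


if __name__ == "__main__":
    sequence = block_randomize(20, seed=2017)
    print(sequence.count("intervention"), sequence.count("control"))
```

Balance is only guaranteed at the end of each complete block, so a partly filled final block, or the later removal of records, can leave the arms slightly unequal.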
Procedures
Interested individuals were directed to their respective app store via online advertisements or the study website. Upon downloading the app, participants provided informed consent and were screened. Participants in the intervention arm used HeadGear, a smartphone application-based intervention centred on behavioural activation (BA) and mindfulness (for full content, see study protocol) (Deady et al., 2018). The main component of the app is a 30-day intervention involving one 5–10 min ‘challenge’ per day. These ‘challenges’ feature evidence-based therapeutic techniques delivered using a variety of formats. At the commencement of the intervention, users complete a risk calculator that assesses risk for future common mental disorders and provides personalized feedback regarding this risk (Fernandez et al., 2017). The HeadGear app also includes a mood tracker, a toolbox of skills (which is filled as the intervention is completed) and support service helplines.
Participants in the attention-control arm used a smartphone application with a design identical to that of HeadGear, including the risk calculator and mood tracker; however, there was no 30-day intervention. To control for the attentional component of the intervention, the control condition encouraged users to track their mood daily over a 30-day period.
Assessments were completed at baseline, post-intervention (5 weeks after baseline), 3- and 12-month follow-up. The intervention and control applications monitored usage data including time spent in-app, number of logins and activity completion rates.
While baseline assessment occurred in the app, follow-up assessments occurred via an online survey accessed through an SMS web link. Participants received up to three phone-based reminders to complete each follow-up. On completion of each assessment, participants were entered into a draw for one of four $200 Visa gift cards.
The PHQ-9 was used to measure depression symptoms and was the primary outcome measure for this study (Spitzer, Kroenke, & Williams, 1999). The PHQ-9 is a reliable and valid nine-item measure of depression severity over the past 2 weeks, with each item scored on a four-point scale (Spitzer et al., 1999). Summing the nine items results in a score ranging from 0 (no depressive symptoms) to 27 (all symptoms occurring nearly daily). The criterion and construct validity of the PHQ-9 have previously been demonstrated (Spitzer et al., 1999). The tool also incorporates an algorithm for probable MDD diagnosis.
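The scoring and screening logic can be sketched as follows. The total score is the sum of nine items scored 0–3, as described above. The diagnostic rule shown is the commonly described PHQ-9 algorithm (five or more symptoms present at least 'more than half the days', one of which is depressed mood or anhedonia, with the self-harm item counting if present at all); the text does not spell out the exact rule as implemented in this trial, so that part is an assumption, as are the function names.

```python
def phq9_total(items):
    """Sum nine PHQ-9 items, each scored 0-3, giving a total of 0-27."""
    if len(items) != 9 or any(score not in (0, 1, 2, 3) for score in items):
        raise ValueError("expected nine item scores, each between 0 and 3")
    return sum(items)


def phq9_provisional_mdd(items):
    """Provisional MDD under the commonly described PHQ-9 algorithm.

    Assumed rule (not taken from the trial's software): items 1-8 count as
    present when scored >= 2, item 9 counts when scored >= 1, and at least
    five symptoms must be present including item 1 or item 2.
    """
    phq9_total(items)  # reuse the validation above
    present = [score >= 2 for score in items[:8]] + [items[8] >= 1]
    has_core_symptom = present[0] or present[1]
    return has_core_symptom and sum(present) >= 5


def eligible_on_depression_criteria(items):
    """Apply the baseline exclusion: total above 14, or provisional MDD."""
    return phq9_total(items) <= 14 and not phq9_provisional_mdd(items)


if __name__ == "__main__":
    example = [2, 2, 2, 2, 0, 0, 0, 0, 1]
    print(phq9_total(example), phq9_provisional_mdd(example))
```

The example shows why both exclusion criteria matter: a participant can meet the algorithm's criteria with a total score well below the cut-off.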
Anxiety was measured using the two-item Generalized Anxiety Disorder scale (GAD-2) (Kroenke, Spitzer, Williams, Monahan, & Löwe, 2007). Each item of the GAD-2 is scored on a four-point scale (total ranging from 0 to 6). Scale scores of ⩾3 are suggested as cut-off points between the normal range and probable cases of anxiety (Kroenke et al., 2007).
Resilience (the ability to adapt well in the face of adversity, trauma, tragedy and threats) was measured by the 10-item Connor–Davidson Resilience Scale (CD-RISC10), a self-report scale shown to have high internal consistency (Cronbach's α = 0.89), construct validity and test–retest reliability in general population and in clinical settings (Connor & Davidson, 2003). Total scores range from 0 to 40 with higher scores corresponding to greater resilience. The CD-RISC10 has been shown to differentiate between individuals who function well after adversity and those who do not.
Well-being was assessed using the five-item WHO Wellbeing Index (WHO-5) (Bech, 2004). Raw scores range from 0 to 25, where 0 indicates the worst possible quality of life and 25 the best possible quality of life. A score ⩽13, or an answer of 0 or 1 on any of the five items, indicates poor well-being. The WHO-5 is a psychometrically sound measure of well-being with high internal consistency (Cronbach's α = 0.84) and convergent associations with other measures of well-being (Krieger et al., 2014).
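The poor well-being rule stated above can be expressed as a short check. This is a sketch only: it assumes each of the five items is scored 0–5 (which yields the 0–25 raw range given in the text), and the function names are illustrative.

```python
def who5_raw_score(items):
    """Sum five WHO-5 items, each assumed to be scored 0-5 (raw total 0-25)."""
    if len(items) != 5 or any(score not in range(6) for score in items):
        raise ValueError("expected five item scores, each between 0 and 5")
    return sum(items)


def who5_poor_wellbeing(items):
    """Flag poor well-being: raw score <= 13, or any single item scored 0 or 1."""
    return who5_raw_score(items) <= 13 or any(score <= 1 for score in items)


if __name__ == "__main__":
    # A high total can still be flagged when one item is answered 0 or 1.
    print(who5_raw_score([5, 5, 5, 5, 1]), who5_poor_wellbeing([5, 5, 5, 5, 1]))
```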
Work performance was measured using three items from the Health and Work Performance Questionnaire (HPQ) (Kessler et al., 2003) and an additional item pertaining to days absent in the last month. For the purposes of analysis, a composite measure for effective work days was constructed, by multiplying days present at work (absenteeism) by absolute work productivity score (presenteeism) as calculated by the HPQ, replicating previous work in the area (Wang et al., 2007).
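A hedged sketch of how such a composite might be computed is given below. The text specifies only that days present are multiplied by the absolute productivity score; the 0–100 scaling of the presenteeism score, its conversion to a fraction, and the parameter names (`scheduled_days`, `days_absent`, `presenteeism_0_100`) are assumptions made for illustration and may not match the study's exact calculation.

```python
def effective_work_days(scheduled_days, days_absent, presenteeism_0_100):
    """Composite of attendance and self-rated productivity.

    Assumption: presenteeism is an absolute score on a 0-100 scale, treated
    here as the fraction of full productivity achieved on days at work.
    """
    if not 0 <= days_absent <= scheduled_days:
        raise ValueError("days_absent must lie between 0 and scheduled_days")
    if not 0 <= presenteeism_0_100 <= 100:
        raise ValueError("presenteeism score must lie between 0 and 100")
    days_present = scheduled_days - days_absent
    return days_present * presenteeism_0_100 / 100


if __name__ == "__main__":
    # 20 scheduled days, 2 days absent, productivity rated 80 out of 100.
    print(effective_work_days(20, 2, 80))
```

Combining the two components in one figure means that a gain in either attendance or on-the-job productivity shows up as additional effective days.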
Outcomes
The primary outcome measure of the study was the level of depressive symptomatology (as measured by the PHQ-9) across the 3-month follow-up period. Secondary outcomes included change in incident cases of depression over the follow-up period (using the PHQ-9 diagnostic algorithm), anxiety symptomatology, well-being, resilience and work performance, and 12-month outcomes.
Statistical analysis
As a universal preventative intervention with an attention-control, the size of the effect of the intervention was anticipated to be relatively small. Meta-analysis of previous trials of workplace prevention of depression has shown a small effect size of d = 0.17 (Tan et al., 2014). Power calculations were carried out using the R package simR (Green & MacLeod, 2016). A sample size of 1134 was needed to detect a difference of two points on the PHQ-9 (equivalent to a similar small effect size), at a significance level of 5% with a power of 80%. Using an attrition rate of 46%, an initial sample of 2100 was required (1134/0.54 = 2100).
Primary analyses were undertaken on an intent-to-treat basis, including all participants as randomized, regardless of treatment received or withdrawal from the study. Likelihood-based methods (mixed-model repeated measures) using IBM Statistical Package for the Social Sciences (SPSS) for Windows (release 23.0.0) (IBM Corp, 2015) were undertaken. A priori planned comparisons of change from baseline across the 3-month follow-up period were used to test the study hypotheses. Variables found to be substantially imbalanced between groups post-randomization were tentatively included in these models and retained if statistically significant and influential on outcomes. Mathematical transformation or categorization of raw scores was undertaken to meet distributional assumptions and address any violation of assumptions. For dichotomous outcomes such as depression caseness, a comparable generalized mixed modelling approach was used. Relative risk of depression was estimated based on the PHQ-9 diagnostic algorithm at both follow-up and the trial endpoint. Number needed-to-treat (NNT) was derived from the relative risk of new-onset depression at either follow-up time point (aggregated).
Little's test of the missing completely at random mechanism was conducted to determine the nature of missingness. Where this was significant, further analysis of patterns of missingness was undertaken to identify predictors of attrition, which would indicate whether the data might meet the missing at random criteria.
All tests of treatment effects were conducted using a two-sided α level of 0.05 and 95% confidence intervals.
Results
A total of 5,778 participants were assessed for inclusion in the study. Of those screened, 3,500 failed to meet eligibility criteria, primarily because they had too many depressive symptoms to qualify for this trial of prevention, and three did not complete baseline assessment. The remaining 2,275 eligible participants were randomized. Four participants later withdrew from the study. Thus, for the purpose of intent-to-treat analysis, there were 1,128 participants in the intervention group and 1,143 in the control group. The flow of participants through the study phases is shown in Fig. 1.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20220224113805274-0762:S0033291720002081:S0033291720002081_fig1.png?pub-status=live)
Fig. 1. Trial profile. *44 multiple registrations; 2,202 did not meet inclusion criteria (749 not employed, 29 not aged 18 or over, 1,424 depressed); 1,254 failed to provide contact information. Minor imbalance in groups occurred due to the removal of test users (erroneously included within the blocks).
Table 1 presents the baseline characteristics and app usage of the study sample. The mean age was 40 years and participants were predominantly male (74.2%). A slightly greater proportion of females were randomized to intervention than to control [odds ratio (OR) = 1.21, corresponding to a small effect size]. Provisional adjustments were made to outcome analyses by incorporating gender into these models. No effects of this adjustment were found and consequently, these results are not reported. Primary outcome data were available post-intervention for 48% of the sample and for 46% at 3-month follow-up. Missingness was associated with intervention group assignment (OR = 1.65; p < 0.001; 95% CI 1.40–1.96) and younger age [OR (per year) = 0.98; p < 0.001; 95% CI 0.97–0.99]. Baseline depression level was unrelated to missingness.
Table 1. Selected demographics and app usage by intervention arm
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20220224113805274-0762:S0033291720002081:S0033291720002081_tab1.png?pub-status=live)
MDI, male-dominated industry; PHQ-9, Patient Health Questionnaire-9.
Data are n (%) or mean (s.d.).
a p < 0.05.
b Agriculture/forestry/fishing, manufacturing, wholesale trade, mining, construction, other manual trade, transport/postal/warehousing, first responder/defence/security.
c Category based on tertile split.
d p < 0.001.
Primary outcomes
The primary outcome was the level of depressive symptomatology. PHQ-9 scores (and model residuals) showed substantial non-normality; as such, transformations (square root and log) were applied to the data. Regardless of transformation type, very similar results were obtained. Log transformations are reported here, with raw data used for effect size.
There was a statistically significant group-by-time interaction for depression symptoms [F(2, 1123.5) = 3.97, p = 0.019 at the primary 3-month assessment; F(3, 734.7) = 2.98, p = 0.031 at 12 months; see Fig. 2]. HeadGear was associated with a decline in mean PHQ-9 total score of 0.45 units more than the controls at post-intervention (Cohen's d = 0.15; 95% CI 0.03–0.27). At 3-month follow-up, the advantage of HeadGear was slightly lower, but remained statistically significant, at 0.41 units on the PHQ-9 (d = 0.13; 95% CI 0.01–0.27). Trial attrition was considerable at 12-month follow-up, with only 16.3% and 21.0% completing the assessment for intervention and control, respectively. At 12-month follow-up, the between-group difference in change from baseline in depression symptoms was not statistically significant [t(473.8) = 1.50, p = 0.135]. Comparisons of differential change between intervention arms by gender showed different patterns of statistical significance on individual occasions. However, these differences were not statistically significant, nor was the overall pattern of differential change between intervention arms (Supplementary file).
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20220224113805274-0762:S0033291720002081:S0033291720002081_fig2.png?pub-status=live)
Fig. 2. Mean log2 PHQ-9 total scores by intervention arm at each occasion.
On average, users completed approximately a third of the core intervention component of the app. Use generally tapered off over time but spiked at the completion of the entire programme (10.1%). When comparing outcome by ‘challenge’ completion, individuals in the intervention arm who completed few challenges (⩽7) were comparable to the control at follow-up, while those completing more challenges showed a more pronounced change, particularly those who completed 20 days or more [control v. 20–29 days, t(1158.8) = −2.73, p = 0.006; control v. 30 days, t(1158.8) = −3.09, p = 0.002].
Secondary outcomes
Cumulative incidence of depression
As a result of the inclusion criteria, no participants exceeded the cut-off for the established PHQ-9 algorithm for a provisional diagnosis of MDD at baseline. Post-intervention, 4.2% of participants in the control condition met the criterion for likely MDD, compared to 1.9% of those assigned to the intervention, leading to an OR of depression of 0.44 (p = 0.038; 95% CI 0.21–0.96). At 3-month follow-up, case rates for the control arm were stable at 5.5%, while rates in the intervention arm increased to 3.2% yielding an OR of 0.57 (p = 0.079; 95% CI 0.31–1.07; see Fig. 3). By the 12-month follow-up, depression prevalence for controls remained stable at 5.1%, while amongst participants in the HeadGear arm, the prevalence of depression remained significantly lower at 1.1%. This difference was associated with an OR of 0.21 (p = 0.04, 95% CI 0.05–0.96). Aggregated over all follow-up occasions, the prevalence of depression at any time was 8.0% and 3.5% for controls and HeadGear recipients, respectively. Odds of caseness for incident depression in the intervention group compared to controls were significant with an OR of 0.43 (p = 0.001, 95% CI 0.26–0.70) (Fig. 3). This equates to an NNT of 23 (17–44), suggesting one case of depression was avoided for every 23 users of HeadGear, compared to mood monitoring.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20220224113805274-0762:S0033291720002081:S0033291720002081_fig3.png?pub-status=live)
Fig. 3. PHQ-9 caseness by intervention arm at each occasion of measurement.
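As a rough cross-check of the figures above, unadjusted odds ratios and a number needed-to-treat can be recomputed from the reported percentages. This sketch uses the rounded proportions given in the text and the absolute risk difference, so it only approximates the model-based estimates; the function names are illustrative and not part of the study's analysis code.

```python
import math


def odds_ratio(p_intervention, p_control):
    """Unadjusted odds ratio from two proportions (intervention v. control)."""
    odds_i = p_intervention / (1 - p_intervention)
    odds_c = p_control / (1 - p_control)
    return odds_i / odds_c


def number_needed_to_treat(p_intervention, p_control):
    """NNT as the reciprocal of the absolute risk reduction, rounded up."""
    risk_reduction = p_control - p_intervention
    if risk_reduction <= 0:
        raise ValueError("no risk reduction, so NNT is undefined")
    return math.ceil(1 / risk_reduction)


if __name__ == "__main__":
    # Proportions meeting the PHQ-9 criterion: (intervention, control).
    occasions = {
        "post-intervention": (0.019, 0.042),
        "3-month": (0.032, 0.055),
        "12-month": (0.011, 0.051),
        "any follow-up": (0.035, 0.080),
    }
    for label, (p_i, p_c) in occasions.items():
        print(label, round(odds_ratio(p_i, p_c), 2))
    print("NNT:", number_needed_to_treat(0.035, 0.080))
```

The crude odds ratios come out at about 0.44, 0.57 and 0.21 for the three occasions, matching the reported values, and the aggregated risks of 8.0% and 3.5% give 1/0.045 ≈ 22.2, which rounds up to the reported NNT of 23. The crude aggregated odds ratio (about 0.42) differs slightly from the reported 0.43, as would be expected when comparing rounded percentages with a model-based estimate.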
Work performance
A significant difference in mean change was observed for work performance (HPQ) at follow-up [t (935.4) = 3.03, p = 0.003; see Table 2]. This was equivalent to an extra day of effective work each month amongst those using HeadGear compared to the control (1.19; 95% CI 1.10–1.29).
Table 2. Contrasts of differences in change from baseline to post-intervention and follow-up between conditions (EMM)
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20220224113805274-0762:S0033291720002081:S0033291720002081_tab2.png?pub-status=live)
WHO-5, WHO Wellbeing Index; HPQ, Health and Work Performance Questionnaire; CD-RISC10, Connor–Davidson Resilience Scale; GAD-2, Generalized Anxiety Disorder scale.
a p < 0.05
Resilience, well-being, anxiety
Other secondary outcomes are presented in Table 2. Significant differences in mean change were observed for resilience (CD-RISC10) at follow-up. For well-being (WHO-5), the difference in mean change was significant at post-intervention but not follow-up. While there was a decline from baseline to post-intervention for anxiety symptoms (GAD-2), means in each arm were very similar. Exploratory analyses using a cut-off score of 3 to dichotomize outcomes suggested that the proportion of participants with clinically significant GAD was lower in the intervention group, but this result was not statistically significant. Significant differences in mean change were observed for well-being (WHO-5) and work performance (HPQ) at 12-month follow-up (Table 2).
Discussion
To our knowledge, this study is the first RCT to examine the effectiveness of a smartphone app in preventing depression and the largest mental health prevention trial undertaken in a working population. Overall, this study demonstrates that a smartphone app can reduce depression symptoms, may in fact prevent incident depression caseness, and improve work performance. Given the rising cost of mental illness across the developed world (Harvey et al., 2009), and the increasing awareness of the role workplaces can play in the development of mental illness, these findings make an important contribution to how new technology can be used to reduce the burden of common mental disorders.
Smartphones have become a pervasive feature of modern life and have transformed many aspects of our society, including healthcare. Published surveys have shown that over three-quarters of adults would be interested in using an app to monitor and manage their mood or mental health (Proudfoot et al., 2010). Developers have responded rapidly to this interest, with over 10 000 mental health-related applications now available to download, a number which increases daily (Torous & Roberts, 2017). However, science has failed to keep pace with these changes and almost all these apps are untested. This study provides evidence, for the first time, that a smartphone app may indeed have preventative potential regarding depression caseness. Using a diagnostic algorithm for depression, HeadGear was shown to be protective compared to simple mood monitoring, with rates of depression onset in the control group more than twice that of workers randomized to use HeadGear. This indicates that one incident case of depression could be prevented for every 23 workers using HeadGear, which is comparable to the rates reported in a recent meta-analysis of depression prevention programmes (van Zoonen et al., 2014) and superior to many studies of CBT prevention interventions. This suggests the preventative impact of HeadGear may be substantial compared to the current best practice interventions. However, it is worth noting the effect size in the present study was small and may have implications for broader utility.
Many eHealth studies aim to intervene at a subclinical level, and consequently, there is evidence of greater effectiveness within these groups (Firth et al., 2017; Heber et al., 2017). While the effect sizes for universal preventative and low-level symptom reduction interventions such as this one are small in a relative sense (Tan et al., 2014), the effect at a population level is considerable (Ahern, Jones, Bakshis, & Galea, 2008). The potential to move whole distributions away from the threshold of diagnosis is substantial, especially where dissemination can be so extensive, feasible and economical (Rose, 1994). The change observed in the control condition in the current trial may additionally be due to the use of the risk calculator and mood tracker. The latter may have influenced participants' mood states and the former may have led some control participants to seek assistance. Although the effect size reported compares favourably with recent meta-analyses in the area of mHealth CBT-related apps (Rathbone, Clarry, & Prescott, 2017) and workplace depression prevention (Tan et al., 2014), how clinically meaningful such small changes in symptom scores are remains to be seen, and this tempers conclusions around impact. Furthermore, these findings should be interpreted with some caution due to the rates of attrition; more work is required both to improve programme adherence and to reduce trial dropout.
The economic case for depression prevention has often been promoted and is an essential component of decisions regarding how different public health interventions should be prioritized and funded (Cuijpers et al., 2012). For this reason, it is essential to test whether interventions can not only prevent symptom development, but also improve functioning. HeadGear was associated with significantly improved workplace performance and resilience at 3-month follow-up. Interestingly, this effect appeared to accumulate after the intervention had been delivered and was most apparent at follow-up. These findings indicate that the benefits associated with the app were broader than depression symptom or incidence reduction and that clear functional gains were also evident. The improvement in work performance amongst those randomized to receive HeadGear was equivalent to an extra day of effective work each month, which is a considerable functional and economic benefit. The app was not intended to specifically improve anxiety outcomes; nevertheless, the use of the GAD-2 (rather than a more rigorous measure) may have concealed any effects on anxiety.
Our findings provide compelling evidence of the utility of mHealth tools, even in a clinically well population. A crucial element of this study [lacking in many eHealth trials (Firth et al., 2017)] was the use of an attention-control. Although controls spent less time engaging with this ‘lite’ version of the app, this is not an uncommon occurrence. Control conditions in this field tend to be significantly less intensive than interventions which often require users to engage with significant learning and rehearsal (Firth et al., 2017). In fact, there is a tendency for purposely inert attention-controls, while the content delivered here has been shown to have therapeutic benefit (Kauer et al., 2012). This study also provides valuable evidence to support the use of specific therapeutic components in depression prevention. To date, much of the research examining prevention and treatment interventions for depression has focused on CBT-based techniques (Deady et al., 2017a). Recent findings suggest BA (Deady et al., 2017b) and mindfulness-based (Stratton et al., 2017) techniques may be preferred by some types of workers. This trial adds to growing evidence of the effectiveness of such intervention techniques broadly (Richards et al., 2016), and the mobile delivery of such programmes.
Despite the unique strengths of the trial, some limitations are present. Firstly, there was substantial trial attrition at all timepoints, particularly at 12 months, which limits interpretation of findings, especially long-term impact. A key factor which is unique to this trial, and may be implicated in this dropout rate, was the onboarding process. In order to streamline the user experience, completion of the full screening, consent and baseline assessment occurred within the app. This may have led to the study recruiting a less research-engaged (though perhaps more real-world) sample. Therefore, some caution is required in the conclusions reached, as those completing follow-up may differ from those who did not in ways not measured. Similarly, on average, users completed only around a third of the intervention days. Although this is not uncommon (Richards & Richardson, 2012), better outcomes were found with greater challenge completion, and further work should explore means to enhance user engagement to this end. Furthermore, the potential for regression to the mean cannot be discounted without a waitlist control. In attempting to remove barriers to recruitment, no diagnostic interview occurred at baseline; as such, the lack of an initial diagnosis of depression could not be verified. Dual criteria of exclusion [>14 on the PHQ-9 (sensitivity = 68%, specificity = 95%) (Kroenke et al., 2001) and/or meeting the algorithm's criteria] were used in an attempt to attain a baseline sample free of MDD (Van Hooff et al., 2014). However, this approach may have resulted in some mild cases being misclassified as sub-syndromal, and thus a potential conflating of the concepts of prevention and very early treatment.
Additionally, it was unfeasible to conduct full assessments on participants at follow-up, and this precludes definitive statements regarding true disorder prevention. Finally, although the gender imbalance was intended given the design of the intervention, depression is more prevalent in females, and the imbalance is therefore a study limitation, particularly with respect to the generalizability of our findings. However, concerns regarding this limitation are tempered by supplementary analyses that revealed gender had neither substantial confounding nor moderating effects. There is more that needs to be done to explore the utility of mHealth for mental health prevention.
Overall, the new mental health smartphone application, HeadGear, was associated with a significant reduction in depressive symptoms compared to an attention-control condition. The findings also suggest HeadGear may have preventative capacity for incident caseness of depression. However, the small effect size, combined with high rates of attrition, particularly at long-term follow-up, necessitates some caution in interpretation. Nevertheless, the study indicates that smartphone apps may have the capacity to empower individuals and enhance the way working adults manage their mental health. By utilizing new technologies, many of the feasibility obstacles encountered with more traditional interventions in this area can be offset. The fusion of evidence-based techniques and new technology means we are now able to contemplate depression prevention initiatives operating at scale, meaning the prevention gains seen with chronic physical illnesses may finally be realized in mental health.
Supplementary material
The supplementary material for this article can be found at https://doi.org/10.1017/S0033291720002081
Acknowledgements
This study was developed in partnership with beyondblue with donations from the Movember Foundation. RC is funded by an Australian Research Council Future Fellowship FT140100824. SH and MD are supported by funding from the iCare Foundation and NSW Health. In addition to our funders, we would like to acknowledge the support of industry partners, including, but not limited to: NSW Fire and Rescue, Aurizon, Victoria SES, UNSW Facilities Management, EML, Transport NSW, Sydney Trains, Australia Post, Dairy NSW and Johns Lyng Group. We would also like to acknowledge Daniel Collins for helping to prepare this manuscript for publication. Finally, we would like to acknowledge and thank all the study participants who have helped shape the content of the intervention.
Authorship
MD, DJ and SBH had full access to all the data in the study and take responsibility for the integrity of the data. AM had full access to the cleaned dataset and takes responsibility for the accuracy of the data analysis. MD had a primary role in the conceptualization, write up and editing of this manuscript. DJ, SBH and IC had a role in the conceptualization, write up and editing of this manuscript. NG had a role in the conceptualization and editing of this manuscript. DM had a role in the technical development of the intervention and editing of this manuscript. AM had a primary role in the statistical conceptualization and editing of this manuscript. RC, AG, RB and HC had a role in the write up and editing of this manuscript. All authors have read and approved the final manuscript.
Financial support
HC, AM, DJ, MD, SH, NG, RC, DM, DP, AG and IC report grants from beyondblue (with donations from the Movember Foundation) during the conduct of the study. MD and SH report grants from iCare Foundation and NSW Health during the conduct of the study. In addition, MD, IC, SH, DM, RC, DP and NG have a patent IP from app design pending. RB has no interests to disclose. All researchers have remained independent from the funders in the completion and submission of this work. MD, DJ, IC, NG, DM, RAC, DP and SH were involved in the development of the HeadGear application. The IP is jointly owned by MD, IC, NG, DM, RC, DP and SH; however, the authors do not currently receive any financial gain from this IP. There are no other conflicts of interest or competing interests to declare. The funders of the study had no role in study design, data collection, data analysis, data interpretation or writing of the report. The corresponding author had full access to all the data in the study and had final responsibility for the decision to submit for publication.