Introduction
About 800 000 people die by suicide every year, making suicide the 15th leading cause of death worldwide according to the World Health Organization (2014), and the second leading cause among 15–29 year-olds (WHO, 2018). In the United States, suicide rates increased from 1999 through 2017, and the age-adjusted suicide rate was 33% higher in 2017 than in 1999 (Hedegaard, Curtin, & Warner, 2018). Despite these findings, there is still little awareness in medical practice of objective suicide risk stratification, which has led to suicide being referred to as ‘the quiet epidemic’ (Turecki, 2014).
A growing body of knowledge has put forward several sociodemographic and clinical risk factors associated with individuals who attempt suicide (Borges et al., 2007, 2010; Nock et al., 2008). For instance, gender, age, race, marital status, education, income, prior suicide attempt, stressful life events, and body mass index (BMI) are all variables associated with suicide attempts (Borges et al., 2007, 2010; Heikkinen, Aro, & Lönnqvist, 1992; Johnston, Pirkis, & Burgess, 2009; Nock et al., 2008; Oquendo et al., 2014; Perera et al., 2016; Zhang, Yan, Li, & McKeown, 2013). Additionally, retrospective studies with psychological autopsies have shown that 90% of the subjects who died by suicide had a psychiatric disorder, including major depressive disorder, substance-related disorders, and/or personality disorders (Arsenault-Lapierre, Kim, & Turecki, 2004). These efforts have largely reported average group-level differences between suicide attempters and non-attempters. However, what was not known until recently is how to integrate these variables into models that estimate the probability of an individual attempting suicide.
Importantly, this problem should be approached with caution, focusing on generating models that generalize well to future instances and that yield sparse representations, reducing data collection efforts. This is an important question because suicide is a highly preventable event (Zalsman et al., 2016). It is known that interventions such as cognitive behavioral therapy (CBT) (Morey, Lowmaster, & Hopwood, 2010) and lithium (Cipriani, Hawton, Stockton, & Geddes, 2013) can significantly reduce suicide attempts.
Over the past 5 years, our group and others started to build machine-learning models to predict suicide attempts (Belsher et al., 2019; Kessler et al., 2015; Passos et al., 2016; Walsh, Ribeiro, & Franklin, 2017). However, these studies had three limitations. First, most studies had only a few months of follow-up or relied on a retrospective (Choi, Lee, Yoon, Won, & Kim, 2018) or cross-sectional design (Borges et al., 2010). Second, some of the studies aimed to build suicide prediction models within the general population (Borges et al., 2007, 2010), but they did not comprise nationally representative samples, which may have biased their findings. Third, some of the studies had a small sample size (Galfalvy, Oquendo, & Mann, 2008; Passos et al., 2016). It has also been stated recently that future studies should address specific populations with higher rates of suicide attempts, such as individuals with depressive episodes (Passos & Ballester, 2019).
The current study, therefore, aims to develop models to predict suicide attempts in the general population (aim 1) and in participants with lifetime major depressive episodes (aim 2) by using machine-learning techniques coupled with sociodemographic and clinical data. To address the limitations of previous studies, we used a nationally representative cohort, publicly available on request, with 43 093 participants and a follow-up period of 3 years (Hasin & Grant, 2015). Of note, we used easily accessible clinical variables to achieve our aims.
Methods
Data collection, study design, and participants
We used sociodemographic, clinical, and stressful life events data from a large 3-year follow-up study called the National Epidemiologic Survey on Alcohol and Related Conditions (NESARC) (National Institute on Alcohol Abuse and Alcoholism, 2006). NESARC was collected in two waves. Wave 1 was conducted in 2001–2002 and surveyed a representative sample of the adult population of the United States, oversampling black people, Hispanic individuals, and young adults aged 18–24 years. The target population was the civilian non-institutionalized population, 18 years and older, residing in households and group quarters. Face-to-face interviews were conducted with 43 093 respondents, yielding an overall response rate of 81%. Weighted data were adjusted to be representative of the civilian population of the United States on socioeconomic variables based on the 2000 Decennial Census. Wave 2 of the NESARC was conducted in 2004–2005 and involved face-to-face reinterviews with all participants in the wave 1 interview. Excluding respondents ineligible for the wave 2 interview because they were deceased (n = 1403), deported, mentally or physically impaired (n = 781), or on active duty in the armed forces throughout the follow-up period (n = 950), the wave 2 response rate was 86.7%, reflecting 34 653 completed interviews. The cumulative response rate at wave 2 was the product of the wave 2 and wave 1 response rates, or 70.2%. The mean interval between wave 1 and wave 2 interviews was 36.6 (s.e. = 2.62) months. Wave 2 NESARC data were weighted to reflect design characteristics of the NESARC and account for oversampling. More information about NESARC can be found elsewhere (Hasin & Grant, 2015; National Institute on Alcohol Abuse and Alcoholism, 2006).
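The cumulative response rate quoted above is a simple product of the two wave-specific rates. The following minimal Python sketch reproduces that arithmetic; the variable names are ours and are used for illustration only.

```python
# Response rates as reported in the text, converted from percentages to proportions.
wave1_response_rate = 0.81
wave2_response_rate = 0.867

# The cumulative rate at wave 2 is the product of the two wave-specific rates.
cumulative_response_rate = wave1_response_rate * wave2_response_rate

print(f"Cumulative response rate: {cumulative_response_rate:.1%}")
# prints: Cumulative response rate: 70.2%
```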
All potential NESARC respondents were informed in writing about the nature of the survey, the statistical uses of the data to be collected, the voluntary nature of their participation, and the federal laws that rigorously provide for the confidentiality of identifiable survey information. Only respondents consenting to participate after securing this information were interviewed. The research protocol for the initial NESARC survey and the follow-up survey (wave 2), including informed consent procedures, received full-ethical review and approval from the US Census Bureau and the Office of Management and Budget.
Assessments
The Alcohol Use Disorder and Associated Disabilities Schedule – Diagnostic and Statistical Manual of Mental Disorders-Fourth Edition (AUDADIS-IV) was used (Hasin & Grant, 2015). AUDADIS-IV is a fully structured diagnostic interview designed to assess alcohol, drug, and mental disorders according to DSM-IV diagnostic criteria in both clinical and general populations, with good to excellent reliability for most variables shown in test–retest studies (Hasin & Grant, 2015).
Specific aims
Aim 1 was to build a tool for predicting future suicide attempt in the general population that would be able to objectively stratify the risk at an individual level. To achieve this, we built machine-learning models by using easily accessible predictor variables from wave 1. The outcome was attempted suicide in the follow-up period and this was assessed in wave 2, approximately 3 years later.
Aim 2 was to investigate whether a specific predictive clinical signature derived from a sample of this population, with lifetime major depressive episodes, could be created using a similar approach.
Selection of predictor variables
Selection of predictor variables to be utilized in ‘training’ an algorithm is a challenge in machine learning. However, a recommended method of selecting relevant predictor variables is to use expert domain knowledge, largely from previously published literature (Passos et al., 2019). We selected predictor variables using a priori knowledge, through hypothesis-driven approaches. It is worth mentioning that these variables were decided a priori and approved by the US Census Bureau before the analysis.
Predictor variables comprised psychiatric diagnoses [alcohol and drug use disorders, panic disorder, generalized anxiety disorder, specific phobia, social phobia, post-traumatic stress disorder (PTSD), major depressive disorder, dysthymic disorder, bipolar disorder, schizophrenia, and personality disorders]; stressful life events in the past 12 months (e.g. death of a family member or a close friend, being fired or laid off from a job, getting separated or divorced, being a victim of any type of crime); sociodemographic variables (age, gender, race, marital status, education, income, being raised by biological parents or not); and BMI. Additional details on variables used are provided in the online Supplementary methods. Notably, the majority of variables selected were related to psychiatric comorbidities, given that most individuals who attempt suicide are affected by a psychiatric disorder (Hoertel et al., 2015; Nock, Hwang, Sampson, & Kessler, 2010). Recent findings have demonstrated that the effects of mental disorders on suicide risk can be exerted almost exclusively through a general psychopathology factor representing the shared effect across all mental disorders (Hoertel et al., 2015).
In addition, all selected sociodemographic variables were associated with suicide attempts in previous studies (Borges et al., 2007, 2010; Heikkinen et al., 1992; Johnston et al., 2009; Nock et al., 2008; Oquendo et al., 2014; Perera et al., 2016; Zhang et al., 2013), as were being raised by biological parents (Borczyskowski, Hjern, Lindblad, & Vinnerljung, 2006; Keyes, Malone, Sharma, Iacono, & McGue, 2013; Slap, Goodman, & Huang, 2001) and BMI variables (Perera et al., 2016; Zhang et al., 2013). Suicidal crises are typically triggered by recent life events (Turecki & Brent, 2015), but how stressful events interact with individual susceptibility to suicidal behavior or trait-like diathesis is as yet unclear (Van Heeringen & Mann, 2014). Moreover, the specific nature of stressful life events can impact an individual in different ways (Oquendo et al., 2014) and a greater understanding of this phenomenon is required.
For aim 2, besides the predictor variables used in the first aim, we included another four predictor variables assessed only in participants with lifetime major depressive episodes: prior hospitalization because of depressive symptoms, past suicide attempts, age at onset of first episode of major depression, and suicidal ideation (Holma et al., 2010; Isometsä, 2014; Oquendo et al., 2004; Schaffer et al., 2014; Tondo, Lepri, & Baldessarini, 2007).
Statistical analysis
Descriptive analyses were reported as means (with standard deviations) or absolute and relative frequencies. We divided participants into two groups based on the outcome (participants who attempted suicide v. participants who did not between wave 1 and wave 2) for each aim, and we used chi-squared (χ2) or Student's t tests to analyze sociodemographic and clinical variables among these groups.
The statistical summaries reported in this document have been cleared by the US Census Bureau's Disclosure Review Board release authorization number CBDRB-FY20-094.
Machine-learning analysis
We used R software (Version R 3.3.1), RStudio (Version 0.99.902), and the following packages: caret, glmnet, randomForest, and nnet for this step (Kuhn, 2008). Machine-learning approaches are usually superior to traditional multiple regression analyses, especially in contexts where coefficients would be unstable due to high correlations among predictors (Zou & Hastie, 2005). The elastic net is a machine-learning method that uses regularization with an embedded feature selection procedure. Through a cost function composed of both L1 (least absolute shrinkage and selection operator, i.e. Lasso regression) and L2 (ridge regression) weight magnitude penalties, the method can remove predictors with low impact on the outcome while regularizing for improved generalization. The coefficients of features that are less predictive of the outcome are shrunk toward zero, which simplifies the model and reduces overfitting. As our dataset is composed of several attributes, identifying the most important of these enables wider applicability and more practical use of our predictive models.
As a supplementary analysis, we also built models with two other machine-learning algorithms, random forest and artificial neural networks (ANNs), because they can analyze complex relationships between variables, including nonlinear patterns (Passos et al., 2019). Random forest (or decision tree forests) is an ensemble-based method that builds multiple decision trees (Breiman, 2001). The method combines the base principles of ‘bagging’ with random feature selection to add additional diversity to the decision tree models. ANNs model the relationship between a set of input and output signals using a model derived from our understanding of how a biological brain responds to stimuli from sensory inputs (Cross, Harrison, & Kennedy, 1995). We only used ANNs with a single hidden layer.
To build the model, we randomly split the dataset into two parts: (1) a training dataset with 75% of the whole sample and (2) a test dataset with 25% of the sample. We removed all instances with missing data. After this, we used a standard machine-learning protocol with 10-fold cross-validation, hyperparameter tuning, and class imbalance correction in the training dataset (Fig. 1).
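To make the combined L1 and L2 penalty concrete, the minimal Python sketch below writes it in the parameterization documented for glmnet, lambda × [alpha × L1 + (1 − alpha)/2 × squared L2], and shows the soft-thresholding step that sets weak coefficients exactly to zero. It is an illustration only, not the authors' R code: the function names are ours, and the single-coefficient update assumes standardized predictors and squared-error loss, which is simpler than the classification setting of this study.

```python
def elastic_net_penalty(coefficients, alpha, lam):
    """Penalty term: lam * (alpha * L1 norm + (1 - alpha) / 2 * squared L2 norm)."""
    l1 = sum(abs(b) for b in coefficients)
    l2_squared = sum(b * b for b in coefficients)
    return lam * (alpha * l1 + (1.0 - alpha) / 2.0 * l2_squared)


def soft_threshold(z, threshold):
    """Shrink z toward zero; values within +/- threshold become exactly zero."""
    if z > threshold:
        return z - threshold
    if z < -threshold:
        return z + threshold
    return 0.0


def elastic_net_update(z, alpha, lam):
    """Single-coefficient update (standardized predictor, squared-error loss).

    The L1 part removes weak coefficients through soft-thresholding; the L2
    part shrinks the surviving coefficients through the denominator.
    """
    return soft_threshold(z, lam * alpha) / (1.0 + lam * (1.0 - alpha))


# Hypothetical unpenalized coefficients, for illustration only.
raw = [0.9, -0.05, 0.3]
shrunk = [elastic_net_update(z, alpha=0.5, lam=0.2) for z in raw]
print(shrunk)  # the weak middle coefficient is set to exactly 0.0
```

With alpha = 1 the penalty reduces to the Lasso and with alpha = 0 to ridge regression, which is why alpha is tuned alongside lambda.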
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20221123115408919-0671:S0033291720004997:S0033291720004997_fig1.png?pub-status=live)
Fig. 1. Machine-learning protocol. First, we split the dataset into two parts: (1) a training dataset with 75% of the whole sample and (2) a test dataset with 25% of the sample. After this, we used a standard machine-learning protocol with 10-fold cross-validation, hyperparameter tuning, and class imbalance correction in the training dataset, and we repeated the whole process over 50 iterations.
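The split and cross-validation steps of this protocol can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' caret code: the function names, the seed, and the toy sample size of 1000 are ours, and the sketch uses a simple random split because the text does not state whether the split was stratified by outcome.

```python
import random


def train_test_split(indices, train_fraction=0.75, seed=0):
    """Randomly split row indices into training and test sets."""
    rng = random.Random(seed)
    shuffled = list(indices)
    rng.shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def k_fold_indices(indices, k=10, seed=0):
    """Yield (training, validation) index lists for k-fold cross-validation."""
    rng = random.Random(seed)
    shuffled = list(indices)
    rng.shuffle(shuffled)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield training, validation


# Toy example with 1000 hypothetical participants.
train_idx, test_idx = train_test_split(range(1000))
splits = list(k_fold_indices(train_idx, k=10))
print(len(train_idx), len(test_idx), len(splits))
```

Hyperparameter tuning and class imbalance correction would be applied within the training indices only, so that the test indices remain untouched until the final evaluation.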
Class imbalance
Class imbalance introduces a bias toward classifying all the data as the majority class (i.e. did not attempt suicide in the current study), which usually leads to poor detection of the infrequent class. For the elastic net model, we implemented a class weighting technique instead of under-sampling. Each instance of the dataset was reweighted according to the inverse of the frequency of their class, as follows:
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20221123115408919-0671:S0033291720004997:S0033291720004997_eqnU1.png?pub-status=live)
where w_i is the weight for instance i, c_i ∈ {0,1} is the class of instance i, and p(y) and p(n) are the marginal probabilities of the positive and negative classes, respectively. Class imbalance for random forest and ANN was addressed through a resampling step, which entailed randomly under-sampling the majority class so that both classes had matching prevalence in the sample, without further stratification by other confounding factors, followed by model training in each analysis. The whole process was repeated over 50 iterations. The algorithm-predicted probabilities were averaged over the resampling iterations.
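The two imbalance corrections can be sketched in Python as follows. This is a hedged illustration rather than the authors' implementation: we assume the weight is 1/p(y) for positive instances and 1/p(n) for negative instances (any normalization in the original equation is not reproduced), we assume under-sampling draws as many majority-class instances as there are minority-class instances, and all function names are ours.

```python
import random


def inverse_frequency_weights(labels):
    """Weight each instance by the inverse marginal frequency of its class."""
    n = len(labels)
    p_pos = sum(labels) / n
    p_neg = 1.0 - p_pos
    if p_pos == 0.0 or p_neg == 0.0:
        raise ValueError("both classes must be present")
    return [1.0 / p_pos if c == 1 else 1.0 / p_neg for c in labels]


def undersample_majority(labels, rng):
    """Return row indices with the majority class randomly reduced to minority size."""
    pos = [i for i, c in enumerate(labels) if c == 1]
    neg = [i for i, c in enumerate(labels) if c == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    return sorted(minority + rng.sample(majority, len(minority)))


def average_over_iterations(predictions_per_iteration):
    """Average predicted probabilities per subject across resampling iterations."""
    n_iter = len(predictions_per_iteration)
    return [sum(column) / n_iter for column in zip(*predictions_per_iteration)]


# Toy data: 5 positive and 95 negative instances (hypothetical).
labels = [1] * 5 + [0] * 95
weights = inverse_frequency_weights(labels)
balanced_rows = undersample_majority(labels, random.Random(0))
print(weights[0], len(balanced_rows))
```

Under this weighting the total weight of each class is equal, so the rare class contributes as much to the loss as the common one without discarding any data.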
Model performance measures
The validity of the models in predicting ‘unseen’ subjects in the test dataset was evaluated using sensitivity, specificity, balanced accuracy, positive predictive value (PPV), negative predictive value (NPV), and area under the ROC curve (AUC). We used a cutoff of 0.5 as the boundary for the class decision, that is, the algorithm classified probabilities above 50% as belonging to the positive outcome level (i.e. subject attempted suicide) and those below 50% to the negative outcome level (i.e. subject did not attempt suicide).
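The measures listed above can be computed from predicted probabilities as in the following minimal Python sketch. The function names and the toy data are ours; a probability of exactly 0.5 is treated as negative here, which is an assumption because the text does not specify that boundary case, and the AUC is computed with the pairwise (Mann–Whitney) definition.

```python
def confusion_counts(y_true, y_prob, cutoff=0.5):
    """Count true/false positives and negatives at a probability cutoff."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        predicted = 1 if p > cutoff else 0  # exactly 0.5 counts as negative (assumption)
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 1 and y == 0:
            fp += 1
        elif predicted == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def ratio(numerator, denominator):
    """Safe division that returns NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def performance(y_true, y_prob, cutoff=0.5):
    """Sensitivity, specificity, balanced accuracy, PPV, and NPV."""
    tp, fp, tn, fn = confusion_counts(y_true, y_prob, cutoff)
    sensitivity = ratio(tp, tp + fn)
    specificity = ratio(tn, tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": (sensitivity + specificity) / 2.0,
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


def auc(y_true, y_prob):
    """Probability that a random positive scores above a random negative (ties count 0.5)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return ratio(wins, len(pos) * len(neg))


# Toy example (hypothetical labels and probabilities).
y_true = [1, 1, 0, 0, 0, 0]
y_prob = [0.9, 0.4, 0.6, 0.3, 0.2, 0.1]
metrics = performance(y_true, y_prob)
print(metrics, auc(y_true, y_prob))
```

Balanced accuracy averages sensitivity and specificity, so it is not inflated by the large majority of non-attempters in the way raw accuracy would be.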
Variable importance
Variable importance was estimated using the standard procedures from the caret package. For elastic net, the values of the coefficients were used. For random forest, the model sensitivity to removing a predictor from its trees was used as a proxy for variable importance. For neural networks, the method described in Gevrey, Dimopoulos, and Lek (2003) was used.
Hyperparameter tuning
The standard grid search of the caret package was used. We changed the default search strategy of each algorithm as follows: elastic net searched for alpha from 0.1 to 1.0 with 0.1 intervals and lambda from 0.001 to 0.51 with 0.05 intervals; random forest searched for mtry from 1 to the total number of variables; neural networks searched for size from 1 to 100 with intervals of 5 and decay from 0.1 to 0.5 with intervals of 0.1. The best model was selected independently for each approach based on the AUC.
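The search grids described above can be enumerated as in the minimal Python sketch below. It is illustrative only: we assume each sequence starts at the lower bound and advances by the stated interval without exceeding the upper bound (so lambda stops at 0.501 and size at 96), the number of predictors passed to the hypothetical `mtry_grid` helper is a placeholder, and the cross-validated AUC values used to demonstrate model selection are invented.

```python
from itertools import product

# Elastic net: alpha 0.1 to 1.0 in steps of 0.1; lambda from 0.001 in steps of 0.05.
alphas = [round(0.1 * i, 1) for i in range(1, 11)]
lambdas = [round(0.001 + 0.05 * i, 3) for i in range(11)]
elastic_net_grid = list(product(alphas, lambdas))

# Neural network: hidden units 1 to 100 in steps of 5; weight decay 0.1 to 0.5.
sizes = list(range(1, 101, 5))
decays = [round(0.1 * i, 1) for i in range(1, 6)]
nnet_grid = list(product(sizes, decays))


def mtry_grid(n_predictors):
    """Random forest: mtry from 1 to the total number of predictor variables."""
    return list(range(1, n_predictors + 1))


def select_best(results):
    """Pick the candidate with the highest cross-validated AUC."""
    return max(results, key=lambda r: r["auc"])


# Hypothetical cross-validation results, for illustration only.
example_results = [
    {"alpha": 0.1, "lambda": 0.001, "auc": 0.80},
    {"alpha": 0.5, "lambda": 0.051, "auc": 0.85},
]
print(len(elastic_net_grid), len(nnet_grid), select_best(example_results))
```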
Results
A total of 32 700 subjects were included in aim 1 of this study and 6350 in aim 2. Tables 1 and 2 summarize the clinical and sociodemographic characteristics among participants who attempted suicide v. participants who did not between wave 1 and wave 2 for the general population and for participants with lifetime major depressive episodes, respectively. All variables showed differences between groups, except for BMI in the general population and gender, BMI, and specific phobia in the sample with lifetime major depressive episodes.
Table 1. Sociodemographic and clinical characteristics in all participants
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20221123115408919-0671:S0033291720004997:S0033291720004997_tab1.png?pub-status=live)
ADHD, attention deficit hyperactivity disorder; PTSD, post-traumatic stress disorder.
D: Statistic is based upon fewer than 15 observations.
The sum of some variables may vary because estimates on released outputs were rounded to minimize disclosure risk within and between projects.
χ2 tests with more than 1 degree of freedom (df) used Fisher's exact corrections, and the χ2 tests with 1 df used the Yates exact correction to p values.
Authorization number: CBDRB-FY20-094.
a Married or living with another as if married.
b Widowed, separated, divorced, or never married.
p values in the table are not adjusted for multiple comparisons.
Table 2. Sociodemographic and clinical characteristics in participants with lifetime major depressive episodes
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20221123115408919-0671:S0033291720004997:S0033291720004997_tab2.png?pub-status=live)
ADHD, attention deficit hyperactivity disorder; PTSD, post-traumatic stress disorder.
D: Statistic is based upon fewer than 15 observations.
The sum of some variables may vary because estimates on released outputs were rounded to minimize disclosure risk within and between projects.
χ2 tests with more than 1 df used Fisher's exact corrections, and the χ2 tests with 1 df used the Yates exact correction to p values.
Authorization number: CBDRB-FY20-094.
a Married or living with another as if married.
b Widowed, separated, divorced, or never married.
p values in the table are not adjusted for multiple comparisons.
Figure 2 shows the ROC of all machine-learning algorithms used in the analyses performed on both samples.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20221123115408919-0671:S0033291720004997:S0033291720004997_fig2.png?pub-status=live)
Fig. 2. ROC of the different algorithms. (a) ROC in all participants. (b) ROC in participants with lifetime major depressive episodes.
Elastic net regularization
The model built with elastic net regularization distinguished individuals who attempted suicide from those who did not with an AUC of 0.89 for aim 1 and 0.89 for aim 2. Balanced accuracy was 81.86% for aim 1 and 81.64% for aim 2. Other performance measures can be found in Table 3. The most important variables were borderline personality disorder, PTSD, and being of Asian descent for the model in all participants and previous suicide attempt, borderline personality disorder, and overnight stay in hospital because of depressive symptoms for the model in participants with lifetime major depressive episodes (online Supplementary Fig. S1).
Table 3. Model performance measures
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20221123115408919-0671:S0033291720004997:S0033291720004997_tab3.png?pub-status=live)
AUC, area under the ROC curve.
(a) Model performance measures in all participants. (b) Model performance measures in participants with lifetime major depressive episodes.
Authorization number: CBDRB-FY20-094.
Performance measures for random forest and ANN can be found in Table 3, while variable importance for these models is provided in online Supplementary Fig. S2.
Discussion
This is the first study to evaluate the prediction of suicide attempt in a nationally representative sample of the US population. Our models achieved good performance and all algorithms achieved greater than chance (>50%) accuracy in distinguishing attempters from non-attempters, with balanced accuracy for suicide attempt exceeding 80% in all models. In our primary analysis, elastic net identified the most relevant predictor variables distinguishing those who attempted suicide from those who did not in the general population as, in descending order, borderline personality disorder, PTSD, and being of Asian descent. Similarly, in the sample with lifetime major depressive episodes, the most relevant predictor variables were, in descending order, previous suicide attempt, borderline personality disorder, and overnight stay in hospital because of depressive symptoms.
Psychopathology is strongly associated with suicidal behavior (Arsenault-Lapierre et al., 2004; Borges et al., 2010), and personality disorders, including borderline personality disorder, are also associated with premature mortality (Temes, Frankenburg, Fitzmaurice, & Zanarini, 2019; Tyrer, Reed, & Crawford, 2015). For borderline personality disorder, the presence of suicide attempt or self-injurious behavior is one of the diagnostic criteria (APA, 2013) and a defining feature of the disorder, with over 60% reporting multiple suicide attempts (Zanarini et al., 2008). An 8-year longitudinal follow-up study of 123 subjects with borderline personality disorder showed an increased risk of suicide attempt associated with illness severity and socioeconomic status, including minority race and frequent changes in employment (Soloff & Chiappetta, 2017). PTSD is considered an independent predictor of attempted suicide (Sareen et al., 2007; Wilcox, Storr, & Breslau, 2009). A cohort study of 1698 young adults showed an adjusted relative risk between PTSD and suicide attempt of 2.7, even after adjustment for a prior major depressive episode and alcohol and drug abuse or dependence, whereas exposure to traumatic events without PTSD was not associated with an increased risk of attempted suicide (Wilcox et al., 2009).
A traumatic experience is required for a diagnosis of PTSD and it is highly prevalent in the childhood of those who develop a borderline personality disorder (Leichsenring, Leibing, Kruse, New, & Leweke, 2011). Our results, combined with those of previous studies, may indicate that trauma is a significant predictor of a suicide attempt, but only for those who develop a trauma-related disorder. A meta-analysis reinforced the evidence that a PTSD diagnosis is associated with increased suicidality and supported an important role of comorbid major depression in the etiology of suicidality in PTSD (Panagioti, Gooding, & Tarrier, 2012).
A literature overview about suicide risk among immigrants and ethnic minorities showed a positive correlation between suicidal behavior and specific countries of origin. Non-European immigrant women demonstrated the highest risk for suicide attempt, a group that included young women of South Asian and black African origin (Forte et al., 2018).
Suicide attempt and hospitalization are risk factors for subsequent suicide attempts and suicide in participants with mood disorders (Tondo et al., 2007). A meta-analysis showed that the risk of suicide in people who presented to health care services after an incident of self-harm was 1.6% after 1 year and 3.9% after 5 years, and the estimated rate of repetition of non-fatal self-harm was 16.3% at 1 year, 16.8% at 2 years, and 22.4% at 5 years (Carroll, Metcalfe, & Gunnell, 2014). In a 5-year prospective study, 249 patients with major depressive disorder were assessed and history of suicide attempts showed a hazard ratio of 4.39 to predict suicide during the follow-up (Holma et al., 2010).
There is conflicting evidence regarding the association between BMI and attempted suicide (Perera et al., 2016). A critical review demonstrated that among men, a high BMI was associated with a low risk of attempted or completed suicide, while there was a paradox among women, namely, a high BMI was associated with an elevated risk of attempted suicide but a low risk of completed suicide (Zhang et al., 2013). BMI was among the most important predictive variables only in the random forest model (a nonlinear algorithm), which may highlight the complexity of the relationship between BMI and suicide attempt.
A recent systematic review has discussed the finding that prediction models of suicide death and suicide attempt achieved good accuracy, but the PPVs were low, with high false-positive rates (Belsher et al., 2019). Unfortunately, prevalence imposes a ceiling on PPV, so low PPV is expected because these models work with rare outcomes. Due to the higher prevalence of suicide attempt in the depressed sample, PPV was also higher (10.48%) compared to the general population (4.55%). These results are higher than those of most prior studies (Belsher et al., 2019). We recommend that the model for the general population (aim 1) be used as a screening tool to identify people at higher risk of attempting suicide. Health authorities should contact these people (or their relatives) to suggest more specific mental health assessments in the upcoming years. For people who already have a major depressive episode (the model built in aim 2) and were identified as positives for future suicide attempts, preventive strategies, such as the use of lithium or cognitive behavioral therapy (CBT), should be implemented.
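The dependence of PPV on prevalence follows directly from Bayes' rule, as the minimal Python sketch below illustrates. The sensitivity, specificity, and prevalence values are hypothetical round numbers chosen for illustration and are not the values reported in this study.

```python
def ppv(sensitivity, specificity, prevalence):
    """Positive predictive value from Bayes' rule."""
    true_positive_rate = sensitivity * prevalence
    false_positive_rate = (1.0 - specificity) * (1.0 - prevalence)
    return true_positive_rate / (true_positive_rate + false_positive_rate)


# Same hypothetical classifier applied to a rarer and a more common outcome.
for prevalence in (0.01, 0.05):
    value = ppv(sensitivity=0.8, specificity=0.8, prevalence=prevalence)
    print(f"prevalence {prevalence:.0%}: PPV {value:.1%}")
```

With sensitivity and specificity held fixed, the only way PPV changes is through prevalence, which is why the same modeling approach yields a higher PPV in a higher-risk subgroup.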
The current study has some potential limitations. First, although our study has a longer follow-up period compared to prior literature [in Belsher's systematic review (Belsher et al., 2019) only one of the included studies had a follow-up of more than 2 years], death by suicide or suicide attempt could still be ahead for people considered as false positives. Second, we are missing suicide attempts that resulted in deaths and all the individuals who died between wave 1 and wave 2. It is also noteworthy that a history of attempted suicide is associated with an increased rate of all-cause death and that life expectancy is reduced in these individuals (Al-Sayegh et al., 2015; Jokinen, Talbäck, Feychting, Ahlbom, & Ljung, 2018). Third, we are only reporting the self-reported suicide attempts, so we are missing the ones that could be found in administrative data. Fourth, we did not include exposure to early-life adversity, another well-characterized risk factor associated with suicidal behavior (Almeida et al., 2012; Turecki & Brent, 2015), because these data were not collected in wave 1. Past suicide attempts are also strongly associated with suicidal behavior (Carroll et al., 2014), but this was not included in the analysis with the general population because it was only assessed in wave 1 in individuals with lifetime major depressive episodes. Fifth, the models built in the current study may be useful for the US population; however, their accuracy should be assessed in other countries before implementation, as suicide attempts may vary according to culture and other population variables, such as religion (WHO, 2018).
Sixth, data analyzed in the current study are more than 10 years old; however, the association between the variables assessed in the current study and the outcome does not change over time. Finally, regarding the machine-learning analysis, we did not conduct calibration experiments to ensure that predicted probabilities are representative of actual suicide attempt probabilities. Future research along the same lines needs to ensure calibration is in place before predictive models can be employed at large scale at the population level.
In summary, we report a highly accurate algorithm that is able to identify suicide attempts in the general population and in individuals with lifetime major depressive episodes using clinical, sociodemographic, and stressful life events data in a nationally representative sample. These results suggest that it is possible to utilize clinical measures to identify individuals at greater risk of attempting suicide. Future studies integrating data from different biological levels, such as genetics, metabolomics, and digital health data (Torous & Walker, 2019), could potentially help to build more accurate models. Additionally, future studies should have even longer follow-up periods to increase PPV.
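As an illustration of what a basic calibration check could look like, the Python sketch below bins predicted probabilities and compares the mean prediction in each bin with the observed event rate, alongside the Brier score. This analysis was not performed in the present study; the function names, the number of bins, and the toy data are assumptions made purely for illustration.

```python
def calibration_table(y_true, y_prob, n_bins=10):
    """Per-bin (count, mean predicted probability, observed event rate)."""
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, y_prob):
        index = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[index].append((y, p))
    rows = []
    for members in bins:
        if members:
            mean_predicted = sum(p for _, p in members) / len(members)
            observed_rate = sum(y for y, _ in members) / len(members)
            rows.append((len(members), mean_predicted, observed_rate))
    return rows


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Toy example (hypothetical outcomes and predicted probabilities).
outcomes = [0, 0, 1, 1]
predicted = [0.1, 0.1, 0.9, 0.9]
print(calibration_table(outcomes, predicted, n_bins=2))
print(brier_score(outcomes, predicted))
```

In a well-calibrated model the mean predicted probability and the observed rate agree closely in every bin; class weighting or under-sampling during training can pull these apart, which is one reason such a check matters before deployment.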
Supplementary material
The supplementary material for this article can be found at https://doi.org/10.1017/S0033291720004997.
Acknowledgements
The statistical summaries reported in this document have been cleared by the US Census Bureau's Disclosure Review Board release authorization number CBDRB-FY20-094. Any opinions and conclusions expressed herein are those of the authors and do not necessarily reflect the views of the US Census Bureau. In addition, all results have been reviewed to ensure that no confidential information is disclosed. We also acknowledge the statistician from the US Census Bureau, Dr Jahn K. Hakes.
Author contributions
All authors contributed to the study design. The statistician from the US Census Bureau, Dr Jahn K. Hakes, did the analyses. CSM, PB, and ICP participated in the data analysis. CSM, PB, BC, BW, MAC, FK, and ICP were responsible for the interpretation of findings. CSM and PB were responsible for the figures. CSM, PB, BC, BW, MAC, FK, and ICP did the scientific literature search. CSM, PB, BC, BW, MAC, FK, and ICP participated in the writing of the report, and all authors approved the final version of the manuscript.
Financial support
This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior – Brasil (CAPES) – Finance Code 001. IC Passos receives research support from CAPES, FIPE, and CNPq. CS Machado received a master's scholarship from CAPES during the conduct of the study.
Conflict of interest
P Ballester, B Can, B Mwangi, M Caldieraro, and F Kapczinski have nothing to disclose. IC Passos receives research support from CAPES (Finance Code 001), FIPE, and CNPq. CS Machado received a master's scholarship from CAPES (Finance Code 001) during the conduct of the study.