Predicting self-harm within six months after initial presentation to youth mental health services: A machine learning study

  • Frank Iorfino,

    Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Project administration, Validation, Visualization, Writing – original draft, Writing – review & editing

    frank.iorfino@sydney.edu.au

    Affiliation Brain and Mind Centre, University of Sydney, Sydney, NSW, Australia

  • Nicholas Ho,

    Roles Formal analysis, Methodology, Visualization, Writing – review & editing

    Affiliation Brain and Mind Centre, University of Sydney, Sydney, NSW, Australia

  • Joanne S. Carpenter,

    Roles Data curation, Methodology, Validation, Writing – review & editing

    Affiliation Brain and Mind Centre, University of Sydney, Sydney, NSW, Australia

  • Shane P. Cross,

    Roles Conceptualization, Writing – review & editing

    Affiliation Brain and Mind Centre, University of Sydney, Sydney, NSW, Australia

  • Tracey A. Davenport,

    Roles Conceptualization, Supervision, Writing – review & editing

    Affiliation Brain and Mind Centre, University of Sydney, Sydney, NSW, Australia

  • Daniel F. Hermens,

    Roles Data curation, Methodology, Project administration, Resources, Supervision, Writing – review & editing

    Affiliations Brain and Mind Centre, University of Sydney, Sydney, NSW, Australia, Sunshine Coast Mind and Neuroscience Thompson Institute, University of the Sunshine Coast, Birtinya, Queensland, Australia

  • Hannah Yee,

    Roles Investigation, Project administration, Writing – review & editing

    Affiliation Brain and Mind Centre, University of Sydney, Sydney, NSW, Australia

  • Alissa Nichles,

    Roles Data curation, Methodology, Project administration, Writing – review & editing

    Affiliation Brain and Mind Centre, University of Sydney, Sydney, NSW, Australia

  • Natalia Zmicerevska,

    Roles Data curation, Methodology, Project administration, Writing – review & editing

    Affiliation Brain and Mind Centre, University of Sydney, Sydney, NSW, Australia

  • Adam Guastella,

    Roles Data curation, Writing – review & editing

    Affiliation Brain and Mind Centre, University of Sydney, Sydney, NSW, Australia

  • Elizabeth Scott,

    Roles Conceptualization, Data curation, Writing – review & editing

    Affiliations Brain and Mind Centre, University of Sydney, Sydney, NSW, Australia, St Vincent’s and Mater Clinical School, The University of Notre Dame, Sydney, NSW, Australia

  • Ian B. Hickie

    Roles Conceptualization, Data curation, Funding acquisition, Investigation, Resources, Supervision, Writing – review & editing

    Affiliation Brain and Mind Centre, University of Sydney, Sydney, NSW, Australia

Abstract

Background

A priority for health services is to reduce self-harm in young people. Predicting self-harm is challenging because of its rarity and complexity; however, this does not preclude the utility of prediction models to improve decision-making regarding a service response in terms of more detailed assessments and/or intervention. The aim of this study was to predict self-harm within six months after initial presentation.

Method

The study included 1962 young people (12–30 years) presenting to youth mental health services in Australia. Six machine learning algorithms were trained and tested with ten repeats of ten-fold cross-validation. The net benefit of these models was evaluated using decision curve analysis.

Results

Out of 1962 young people, 320 (16%) engaged in self-harm in the six months after first assessment and 1642 (84%) did not. The top 25% of young people as ranked by mean predicted probability accounted for 51.6%–56.2% of all who engaged in self-harm. By the top 50%, this increased to 82.1%–84.4%. Models demonstrated fair overall prediction (AUROCs: 0.744–0.755) and calibration, which indicates that predicted probabilities were close to the true probabilities (Brier scores: 0.185–0.196). The net benefit of these models was positive and superior to the ‘treat everyone’ strategy. The strongest predictors were (in rank order): a history of self-harm, age, social and occupational functioning, sex, bipolar disorder, psychosis-like experiences, treatment with antipsychotics, and a history of suicide ideation.

Conclusion

Prediction models for self-harm may have utility to identify a large subpopulation who would benefit from further assessment and targeted (low intensity) interventions. Such models could enhance health service approaches to identify and reduce self-harm, a considerable source of distress, morbidity, ongoing health care utilisation and mortality.

Introduction

The ability to predict future death by suicide is still not much better than chance [1–3]. Yet self-harm, which includes any intentional act of self-injury irrespective of the motivational intent behind it (i.e. suicide attempts and non-suicidal self-injury) [4–6], is a source of ongoing distress, morbidity and health care utilisation [7, 8]. Young people presenting to health services represent a group particularly at risk given the early age of onset of self-harm and its strong association with mental disorders [5, 9–11]. Consequently, a focus on predicting those at greatest risk of self-harm (rather than simply death by suicide) is an important goal for services.

Although factors such as suicidal thoughts, depression and alcohol misuse are consistently associated with future self-harm [12, 13], there is still significant doubt about the actual clinical utility of these factors for individual risk predictions [3, 14]. The problem extends to other biological [15] or clinical [16, 17] risk factors which have similarly weak predictive value. Self-harm is likely to be driven by the complex interplay between a broad range of social, biological, psychological, and contextual factors rather than any one or simple set of factors [18]. Further, the influence these factors have on self-harm is probably dynamic over time. The use of modern data science methods may help us overcome some of these challenges by considering the high-dimensional interactions between a large set of variables [19]. These methods attempt to embrace the complexity of the problem, which may be better suited to yield findings that reflect the real-world experiences of clinicians who are asked to solve these complex classification problems every day. Such approaches have been used to predict suicide and self-harm in a range of hospital or outpatient settings [20–24].

Inevitably, the low prevalence of self-harm in many of these populations means there are major statistical limitations [25]. While some have argued that this limits the clinical utility of such models [26], others have suggested that positive predictive value (PPV) alone is not the criterion for evaluating the utility of these models [27]. The intended use of a model is important when evaluating its clinical utility on the balance of its benefits and harms [28]. Specifically, is the model being used to determine who should be admitted to hospital (i.e. a highly invasive and costly intervention), or who should be recommended a more detailed assessment (i.e. a low cost, non-invasive intervention)? For this, decision curve analysis (a measure of net benefit) can be used to evaluate prediction models in a way that takes into account the relative benefit of an intervention for a true-positive case versus the cost of an intervention for a false-positive case [29]. Rejecting these tools based on their low PPVs implicitly assumes a high decision threshold, yet in other areas of medicine low thresholds for intervention are common when the cost of intervention is low (e.g. the prescription of statins) [28, 30]. Consideration of the costs and benefits of intervention thresholds is required to advance the development of useful clinical decision-making tools.

In this study we utilise a large clinical cohort of young people presenting for youth mental health care to determine whether demographic and clinical characteristics at first assessment could be used to predict self-harm within the next six months. The goal here is to apply machine learning methods to evaluate the net benefit of these prediction models, and to identify factors that are consistently associated with self-harm in this clinical population. This study focuses specifically on clinical relevance and service allocation for those entering clinical services. A shorter time frame was therefore deemed a suitable follow-up period to maximise the implications for immediate clinical decision-making, and the broader definition of self-harm (i.e. suicide attempts and non-suicidal self-injury) was used to capture all harmful behaviours that would initiate a service response in terms of more detailed assessment and/or intervention.

Material and methods

The study was approved by the University of Sydney Human Research Ethics Committee (2008/5453, 2012/1626) and participants gave written informed consent.

Participants

Participants were drawn from a cohort of 6743 individuals aged 12–30 who presented to the Brain and Mind Centre’s youth mental health clinics in Sydney and were recruited to a research register between June 2008 and July 2018 [31]. These clinics include primary care services (i.e. headspace [32, 33]) as well as more specialised mental health services. Young people may have been self-referred, or referred via a family member or friend or from the community (e.g. general practitioner) [33]. All participants received clinician-based case management and psychological, social, and/or medical interventions as part of standard care.

Eligibility criteria

As of December 2019, longitudinal data were available for N = 2901 participants. The inclusion criteria for this specific study were: (i) aged 12 to 30 years at the time of initial visit; and (ii) a follow-up visit within six months of initial visit. Application of these criteria reduced the sample to 1962 individuals.

Data collection

Data were extracted from clinical files and coded according to a proforma (i.e. standardised form) [31, 34]. The proforma records information at predetermined time points. The first available clinical assessment at the service is taken as the baseline time point for each participant and the date of this assessment is used to determine each of the follow-up time points. If there is no clinical information available for any time point (i.e. the participant did not attend the service during that time) then that entry is left missing. All clinical notes from the preceding time points, up to and including the current time point, are used to inform and complete the current proforma entry.

The proforma was used to record specific illness course characteristics. More detailed descriptions of the proforma, including the interrater reliability, are reported in the supplement and cohort paper [31]. The measures used here include (see S1 Data): demographics, social and occupational functioning (including the Social and Occupational Functioning Assessment Scale (SOFAS; [35]), and Not in Education, Employment or Training (NEET) as a measure of participation and engagement with education or work), mental disorder diagnoses, clinical stage, at-risk mental states, self-harm, suicidal thoughts and behaviours, physical health comorbidities, personal mental illness history, and treatment utilisation.

The presence of suicidal ideation, suicide attempts, and non-suicidal self-injury is recorded. A suicide attempt is recorded when a young person has undergone steps to take their own life. If an individual harmed themselves via cutting, hitting themselves, burning themselves, or scratching with the intention to self-harm only and not to take their life, then this is included as non-suicidal self-injury and not a suicide attempt. If a suicide attempt occurs, it is also recorded whether the attempt resulted in hospitalisation or presentation to a hospital emergency department. For the present study, the ‘suicide attempt’ and ‘non-suicidal self-injury’ variables were combined under the broad definition of self-harm and used as the primary outcome measure. This is consistent with current conceptualisations of non-suicidal self-injury and suicide attempts which recognise that the separation of non-suicidal self-injury and suicide attempts on the basis of apparent motivations (i.e. suicide intent) may be unwarranted. The dimensional nature of suicidal intent phenomena means that accurate characterisation of these behaviours is challenging, and so national guidelines tend to broadly focus on self-harm [36, 37]. For this reason, we use this broader definition of self-harm so that we capture harmful behaviours that are likely to be the drivers of service response in terms of assessment and/or intervention.

Statistical analysis

The assembled dataset consisted of 37 basic demographic and clinical variables to predict whether or not the patient will report self-harm in any follow-up visit within six months of baseline. Categorical predictor variables with a small number of observations were removed [38]. Here, we set the threshold for variables with uncommon observations at 25. All variables for all patients were complete except for “physical health problems–other” where any missing observations (N = 107) were imputed as absent.

We followed the analysis approach described in previously published work [39]. Briefly, models were trained and tested with ten repeats of ten-fold cross-validation. At each fold of the cross-validation, the training set was balanced with three approaches. Firstly, the number of patients in the minority class (cases of self-harm) was doubled using SMOTE [40] to synthetically generate cases. Secondly, borderline samples identified as ‘Tomek links’, which are pairs of similar samples from different classes [41], were removed. Finally, we randomly under-sampled the majority class to balance with the minority class. The test set remained unaltered to assess the models’ performance on the real-world distribution of self-harm.
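The three balancing steps can be illustrated with a minimal Python sketch. This is not the pipeline used in the study, which was run in R following [39]. The function names `nearest` and `balance_training_fold` are hypothetical, a single nearest neighbour stands in for the k-nearest-neighbour search of SMOTE [40], both members of each Tomek link are dropped, and all predictors are assumed to be numeric.

```python
import math
import random


def nearest(idx, points, candidates):
    """Index in `candidates` (excluding idx) closest to points[idx] (Euclidean)."""
    return min((j for j in candidates if j != idx),
               key=lambda j: math.dist(points[idx], points[j]))


def balance_training_fold(X, y, seed=0):
    """Hypothetical helper: balance one training fold (label 1 = minority class)."""
    rng = random.Random(seed)
    X, y = [list(row) for row in X], list(y)
    minority = [i for i, label in enumerate(y) if label == 1]

    # Step 1 (SMOTE-style, simplified): double the minority class by adding one
    # synthetic case per real case, interpolated towards its nearest minority
    # neighbour. Real SMOTE samples among k nearest neighbours.
    for i in minority:
        j = nearest(i, X, minority)
        gap = rng.random()
        X.append([a + gap * (b - a) for a, b in zip(X[i], X[j])])
        y.append(1)

    # Step 2: remove Tomek links, i.e. pairs from different classes that are
    # each other's nearest neighbour. Assumption: both members are dropped.
    everyone = range(len(X))
    nn = [nearest(i, X, everyone) for i in everyone]
    linked = {i for i in everyone if nn[nn[i]] == i and y[i] != y[nn[i]]}
    keep = [i for i in everyone if i not in linked]

    # Step 3: randomly under-sample the majority class to the minority size.
    pos = [i for i in keep if y[i] == 1]
    neg = [i for i in keep if y[i] == 0]
    neg = rng.sample(neg, min(len(neg), len(pos)))
    chosen = pos + neg
    return [X[i] for i in chosen], [y[i] for i in chosen]
```

In a cross-validation loop, a helper of this kind would be applied to the training portion of each fold only, leaving the test portion at its original class distribution, as described above.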

Following the “No Free Lunch” theorem [42], a number of algorithms were implemented to build predictive models. These algorithms were chosen for several reasons: their popularity in the literature [43]; their ability to perform both predictive modelling and variable selection [44]; their use in other work on self-harm and suicidality [24, 45]; their ability to handle both continuous and categorical variables; and, for some of the chosen algorithms, their ability to model non-linear relationships between predictor variables. The algorithms were: (i) Area Under the Curve Random Forests (AUCRF) [46]; (ii) Boruta [47]; (iii) Lasso regression [48]; (iv) Elastic-net regression [49]; (v) Bayesian Additive Regression Trees (BART) [50]; and (vi) logistic regression. These algorithms, aside from logistic regression, are described in more detail in [39].

The variable selection approach differed for each algorithm. For AUCRF, selected variables are those in the Random Forest model with the highest AUROC. For Boruta, the selected variables must have significantly better importance scores than their permuted form. For Lasso and Elastic-net, the variable must have a non-zero coefficient. For BART, the variable’s inclusion proportion must be greater than a local threshold calculated from the permutation null distribution. For logistic regression, variables were deemed to be selected if the p-value for the variable’s coefficient was ≤0.05.

Model performance

A range of metrics was used to assess different aspects of model performance: AUROC, the Area under the Precision-Recall Curve (AUPRC) [51], Brier scores [52], sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV) and net benefit. An AUROC close to 1 indicates near-perfect discrimination, whereas an AUROC close to 0.5 indicates discrimination no better than chance. The AUPRC is a measure suited to imbalanced outcome variables that summarises the trade-off between precision (PPV) and recall (sensitivity) [51]. The Brier score is a proper scoring function which captures the mean squared error between probabilistic predictions and true outcomes. As such, Brier scores range from 0 to 1, with scores closer to 0 indicating more correct and calibrated predictions. Sensitivity, specificity, PPV and NPV measurements were obtained by dichotomising probabilistic predictions with a cut-off at 0.5. Decision curve analysis [29] uses the net benefit metric as a means to assess the costs and benefits of treatment compared with different treatment strategies. The net benefit is calculated using a model’s sensitivity and specificity across different probability thresholds and the outcome’s prevalence in the population. In decision curve analysis, the strategy with the highest net benefit at a given threshold is the optimal strategy [53].
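As an illustration of two of these metrics, the following Python sketch computes a Brier score and the four cut-off-based measures from a list of predicted probabilities and observed outcomes. The function names and the toy data are hypothetical and are not taken from the study; treating a probability of exactly 0.5 as a positive prediction is also an assumption.

```python
def brier_score(probs, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def threshold_metrics(probs, outcomes, cutoff=0.5):
    """Sensitivity, specificity, PPV and NPV after dichotomising at `cutoff`.

    Assumes at least one case in every cell margin; otherwise a division
    by zero would occur.
    """
    preds = [1 if p >= cutoff else 0 for p in probs]
    pairs = list(zip(preds, outcomes))
    tp = sum(1 for pred, obs in pairs if pred == 1 and obs == 1)
    fp = sum(1 for pred, obs in pairs if pred == 1 and obs == 0)
    fn = sum(1 for pred, obs in pairs if pred == 0 and obs == 1)
    tn = sum(1 for pred, obs in pairs if pred == 0 and obs == 0)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Toy data for illustration only (eight hypothetical patients).
toy_probs = [0.9, 0.6, 0.4, 0.7, 0.2, 0.1, 0.3, 0.55]
toy_outcomes = [1, 1, 1, 0, 0, 0, 0, 0]
toy_brier = brier_score(toy_probs, toy_outcomes)
toy_metrics = threshold_metrics(toy_probs, toy_outcomes)
```

Because the Brier score is computed on the probabilities themselves, it does not depend on the 0.5 cut-off, whereas the other four measures change whenever the cut-off is moved.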

Analyses were performed in R [54] version 3.6.2 with the packages randomForest [55], Boruta [47], AUCRF [46], bartMachine [56], glmnet [57], caret [58], cluster [59], dplyr [60], ggplot2 [61] and rmda [62].

Results

A total of 1962 young people were eligible for these analyses with a mean age of 18.36 years (SD = 39.7) and 60% were female. Out of 1962 young people, 320 (16%) engaged in self-harm in the six months after first assessment and 1642 (84%) did not. Of those who did not engage in self-harm, 274 (17%) had suicidal thoughts in the six months after the first assessment. The baseline characteristics of these young people are described in Table 1.

Table 1. Baseline demographic and clinical characteristics of variables used for prediction.

https://doi.org/10.1371/journal.pone.0243467.t001

Model performance

All six algorithms produced predictive models with similar performance across all metrics in the test datasets (Table 2). One hundred models were produced for each algorithm (600 in total). On average, Boruta random forest models had the highest AUPRC (Mean: 0.346, SD: 0.056), PPV (Mean: 0.321, SD: 0.035) and specificity (Mean: 0.722, SD: 0.037), and also the lowest Brier scores (Mean: 0.185, SD: 0.014). Lasso regression models had the highest mean NPV (Mean: 0.934, SD: 0.018) and sensitivity (Mean: 0.752, SD: 0.078), and BART had the highest mean AUROC (Mean: 0.755, SD: 0.039).

The mean predicted probabilities for each patient, averaged across the ten repeats of cross-validation, skewed towards zero for the majority of patients who did not engage in self-harm and towards one for those who did (Fig 1A). The tails of these distributions suggest that there are still a number of patients whom the models incorrectly classified. The top 25% of patients as ranked by this mean predicted probability accounted for 51.6%–56.2% of all patients who exhibited self-harm. By the top 50% of ranked patients, this increased to 82.1%–84.4%. The variance in predicted probabilities for each person across different algorithms and repeats of cross-validation can inform on the uncertainty of the model predictions (Fig 1B).
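The share of self-harm cases captured among the highest-ranked patients can be computed as in the following Python sketch. The function name `capture_rate` and the toy data are hypothetical; the sketch shows only the arithmetic behind a statement such as "the top 25% accounted for X% of cases", not the study's own code.

```python
def capture_rate(probs, outcomes, top_fraction):
    """Share of all observed cases found in the top-ranked fraction of patients."""
    ranked = sorted(zip(probs, outcomes), key=lambda pair: pair[0], reverse=True)
    n_top = int(top_fraction * len(ranked))  # size of the top-ranked group
    captured = sum(obs for _, obs in ranked[:n_top])
    return captured / sum(outcomes)


# Toy data for illustration only (eight hypothetical patients, three cases).
toy_probs = [0.9, 0.6, 0.4, 0.7, 0.2, 0.1, 0.3, 0.55]
toy_outcomes = [1, 1, 1, 0, 0, 0, 0, 0]
top_quarter = capture_rate(toy_probs, toy_outcomes, 0.25)
top_half = capture_rate(toy_probs, toy_outcomes, 0.50)
```

For comparison, a ranking that carried no information would be expected to capture about 25% of cases in the top 25% and about 50% in the top 50%.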

Fig 1. Predicted probabilities of self-harm across the six machine learning algorithms.

Panel A presents the distribution of mean predicted probabilities. The mean predicted probabilities for each patient are averaged across ten repeats of ten-fold cross-validation. Panel B presents the uncertainty in predicted probabilities for a selection of patients. Boxplots show the predicted probabilities across the six algorithms and repeated cross-validation. The predictions for person ID 70 would suggest that the person is highly unlikely to engage in self-harm, in contrast to person ID 2726. In some instances, such as person ID 7577, models could not distinguish whether a person would or would not engage in self-harm as the predicted probabilities are close to 0.5. There are instances where the models conflict in their predictions (e.g. person ID 1927).

https://doi.org/10.1371/journal.pone.0243467.g001

The decision curve was also produced using the mean predicted probabilities and contrasted against the strategies of ‘treat everyone’ and ‘treat no one’ (Fig 2). All six models were superior to the ‘treat everyone’ strategy and the net benefit of these models was positive for thresholds between 0.09 and 0.26. The curves for all six algorithms remained very close, with BART being the marginally superior model.
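The two reference strategies can be reproduced from the prevalence reported above (320 of 1962). The Python sketch below uses the standard net benefit expression from decision curve analysis [29], written in terms of sensitivity, specificity and prevalence; the function names are hypothetical and the sketch is not the rmda code used in the study.

```python
def net_benefit(sensitivity, specificity, prevalence, threshold):
    """Net benefit per person at a given probability threshold."""
    odds = threshold / (1 - threshold)  # weight given to each false positive
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos - false_pos * odds


PREVALENCE = 320 / 1962  # self-harm within six months in this cohort


def treat_everyone(threshold):
    """Everyone is flagged: sensitivity 1, specificity 0."""
    return net_benefit(1.0, 0.0, PREVALENCE, threshold)


def treat_no_one(threshold):
    """No one is flagged: sensitivity 0, specificity 1, so net benefit is 0."""
    return net_benefit(0.0, 1.0, PREVALENCE, threshold)
```

Under this expression the ‘treat everyone’ net benefit is about 0.080 at a threshold of 0.09, crosses zero where the threshold equals the prevalence (about 0.163), and is about -0.131 at a threshold of 0.26. A model's curve is obtained by passing its sensitivity and specificity at each threshold to the same function.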

Fig 2. Decision curve analysis of machine learning models predicting self-harm.

Net benefit curves are plotted across a range of probability thresholds for self-harm. The grey line plots the assumption that all people will engage in self-harm (i.e. ‘treat everybody’), whereas the black line assumes that no one will engage in self-harm (i.e. ‘treat no one’). The six coloured lines plot the net benefit of using machine learning models to identify who will engage in self-harm.

https://doi.org/10.1371/journal.pone.0243467.g002

Key predictors

Variable importance for predicting self-harm is presented in Figs 2 and 3 of the online supplement. There were seven predictors that were selected in at least 80% of the models (480/600). In rank order, these predictors are: (1) a history of self-harm; (2) age; (3) SOFAS score; (4) sex; (5) bipolar disorder; (6) psychosis-like experiences; (7) treatment with antipsychotics.
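The 80% criterion amounts to counting how often each variable is selected across the fitted models. A minimal Python sketch follows, assuming each model's selected variables are available as a set of names. The function name, the variable names and the five toy models are hypothetical, and ordering by selection frequency is an assumption that may differ from the ranking used in the supplement.

```python
from collections import Counter


def consistently_selected(selections, min_share=0.8):
    """Variables chosen in at least `min_share` of models, most frequent first."""
    counts = Counter(var for chosen in selections for var in set(chosen))
    cutoff = min_share * len(selections)  # e.g. 0.8 * 600 = 480 models
    return [var for var, n in counts.most_common() if n >= cutoff]


# Toy example for illustration only: five hypothetical models.
toy_selections = [
    {"history_self_harm", "age"},
    {"history_self_harm", "age", "sex"},
    {"history_self_harm"},
    {"history_self_harm", "age"},
    {"history_self_harm", "age", "neet"},
]
toy_selected = consistently_selected(toy_selections)
```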

Fig 3. Hypothetical graph that weighs up the net benefit of different interventions in youth mental health settings.

This hypothetical graph weighs up the costs of intervention (i.e. individual and clinician burden) against the number of people we are willing to treat in order to prevent future self-harm. As the costs of intervention increase, the acceptable number of false positives naturally falls because intervention is likely to result in greater costs than benefits (resulting in a negative net benefit). However, when the costs of intervention are low, a higher number of false positives is acceptable to successfully prevent one case of self-harm (resulting in a positive net benefit).

https://doi.org/10.1371/journal.pone.0243467.g003

Discussion

A major priority for mental health services is to prevent self-harm, which is a considerable source of distress, morbidity, ongoing health care utilisation and mortality, particularly in youth. This study evaluates the potential utility of machine learning as a tool that can improve clinical decision-making. First, in a cohort of young people presenting to youth mental health services, the machine learning models here demonstrated fair overall prediction (AUROCs between 0.744 and 0.755) and were well calibrated, which indicates that predicted probabilities were close to the true probabilities (Brier scores between 0.185 and 0.196). Second, the decision curve analysis indicates that there was a net benefit of these models over a ‘treat everybody’ approach, suggesting the potential to allocate targeted assessments and interventions in addition to those broad health service strategies. Finally, we identified seven basic factors that were among the strongest predictors and that demonstrate the relative importance of these characteristics for identifying those who may be at risk of self-harm among young people with emerging mental disorders.

Most prediction studies for self-harm have focussed on adults presenting to hospital or emergency departments, other high-risk populations, or focussed exclusively on suicide attempts. The application of this approach to a younger cohort with greater clinical heterogeneity and who may not have a prior history of self-harm or serious mental disorders is novel. Developing such prediction models for different populations and settings is critical given the complexity of self-harm [3]. We report classification performance metrics that are comparable to many previous prediction models [20–24], and comparable to or better than most clinical instruments used in high-risk populations [63]. This level of performance was achieved using only basic demographic and clinical factors, common to many intake assessments and not as rich as more comprehensive digital assessments now available [64]. The clinical context is an important consideration given that these young people are typically early in the course of illness or have never sought help before [65], and almost one in five (17%) cases were new onset self-harm.

Clinicians are asked to make decisions every day about who requires further assessment and what type of treatment is most likely to be appropriate and effective for an individual. The lack of useful markers of illness means that these decisions are generally based on broad clinical guidelines, risk assessment tools and clinical intuition, each of which has major limitations. Most international guidelines recommend a needs-based assessment in high-risk settings [37], yet carrying out such assessments can be time and resource intensive, lead to the use of informal triage rules [66, 67], or rely on unvalidated locally-developed proformas [67]. Furthermore, there are major limitations to relying on clinical judgements for a range of outcomes, including future self-harm [68, 69]. Together, this reiterates the challenges health services and clinicians face when trying to prevent self-harm. There is a need for innovative health service approaches that can improve the consistency, effectiveness and safety of clinical decision-making [70].

The decision curve analysis can be used to stimulate discussion within services about the cost-benefits of different interventions across a range of risk thresholds. For thresholds between 0.09 and 0.26, all models presented here have a net benefit that is higher than a ‘treat everyone’ approach. At low thresholds, therefore, such models may have utility for allocating low intensity interventions in a way that optimises the cost-benefits. In practice, everyone presenting for youth mental health care could receive a needs-based digital assessment that includes the assessment of suicidal thoughts and self-harm (‘treat everyone’ approach) [64, 71]. This health service strategy reflects international guidelines and approaches to reduce risk (e.g. zero suicide) [70, 72, 73]. For those above a low threshold (~0.20), a further assessment and low intensity interventions could be recommended [74, 75]. This next level of assessment or intervention may be viewed as inappropriate or unfeasible to provide to everyone, but recommending it to a large subpopulation is acceptable to prevent one case of self-harm (Fig 3). These additional resources may be even more cost effective when considering that those identified at risk for self-harm tend to also be at risk for a range of negative mental health outcomes [7, 76].

A better understanding of key model predictors may be helpful to inform clinical decision making for reducing self-harm [37]. While caution should be taken when interpreting variable rankings [77], many of the variables were highly intuitive and clinically informative. Consistent with previous findings, a history of self-harm predicted future self-harm in youth, even when considered among a range of other clinical factors [3, 14]. Interestingly though, social and occupational functioning was the third highest predictor, ranked higher than all other clinical factors (e.g. diagnoses, previous hospitalisation). Maladaptive social and occupational factors have been associated with self-harm in youth and tend to include adverse or absent social relationships [78] and poor educational or employment participation [34]. This work suggests that poor social and occupational functioning may be a critical target for intervention to reduce self-harm for some.

The association between self-harm and mental or substance use disorders has been widely reported in youth [5, 79, 80]. Mental disorder diagnoses were selected in our models to predict future self-harm; however, the frequency with which these variables were selected was lower than expected relative to other variables in our study. Bipolar disorder and psychosis-like experiences were among the strongest predictors. Depression was less important, yet it was still selected in over 70% of models.

Limitations and future directions

These findings should be considered in the context of some limitations. First, the sample size used here is relatively small and there was a major class imbalance for the main outcome. Second, we only considered a limited set of categorical variables. There are a range of additional social and contextual factors not considered here which may have influenced the results. Third, we only used baseline variables to predict future self-harm; however, these models could benefit from time-varying predictions [81]. That said, the use of baseline variables only does serve to replicate real-world clinical decision making after an initial presentation to a service. Fourth, we used a limited set of machine learning algorithms that provided the opportunity for variable selection. Future studies should consider the utility of these models compared to clinician ratings, or a combination of these to make more informed decisions.

Females made up nearly three quarters of those who exhibited self-harm. Evidence suggests that low-to-moderate self-harm (e.g. superficial cutting) tends to be more common among females, while males tend to engage in methods of self-harm that are more severe and likely to result in suicide (e.g. hanging, firearms) [82]. In light of this, the self-harm identified by this study is most likely low-to-moderate in severity. A more comprehensive understanding of self-harm methods is a matter for future studies.

The costs and benefits implicitly modelled in this work are assumed to be uniform across the entire population. While this may be true, a more detailed evaluation of costs and benefits for subgroups within this population may be required to accurately model these to inform decision making. Similarly, further research may also benefit from predicting these outcomes among subpopulations within the service in which self-harm is particularly common (i.e. borderline or complex cases) [83]. These results may provide the opportunity for increased personalisation of interventions, improvements in prediction performance and greater cost-benefit ratios.

Conclusion

The present work supports the view that data-driven machine learning methods have the potential to advance clinical decision making for self-harm [30]. This study demonstrates the potential clinical utility of prediction models to identify a large subpopulation who may benefit from targeted (low intensity) interventions in addition to the broad health service prevention strategies. Enhancing how health services identify and respond to self-harm is a critical priority, not simply because of the risk it confers for future suicide, but because of the significant distress, morbidity and ongoing health care utilisation associated with self-harm.

Acknowledgments

We would like to thank all the young people who have participated in this study, and all the staff in the Youth Mental Health Team at the Brain and Mind Centre, past and present, who have contributed to this work.

References

  1. Ribeiro J, Franklin J, Fox KR, Bentley K, Kleiman EM, Chang B, et al. Self-injurious thoughts and behaviors as risk factors for future suicide ideation, attempts, and death: a meta-analysis of longitudinal studies. Psychological medicine. 2016;46(2):225–36. pmid:26370729
  2. Large M, Kaneson M, Myles N, Myles H, Gunaratne P, Ryan C. Meta-analysis of longitudinal cohort studies of suicide risk assessment among psychiatric patients: heterogeneity in results and lack of improvement over time. PloS one. 2016;11(6):e0156322. pmid:27285387
  3. Franklin JC, Ribeiro JD, Fox KR, Bentley KH, Kleiman EM, Huang X, et al. Risk factors for suicidal thoughts and behaviors: A meta-analysis of 50 years of research. Psychological Bulletin. 2017;143(2):187. pmid:27841450
  4. Kapur N, Cooper J, O'connor RC, Hawton K. Non-suicidal self-injury v. attempted suicide: new diagnosis or false dichotomy? The British Journal of Psychiatry. 2013;202(5):326–8. pmid:23637107
  5. Hawton K, Saunders KE, O'Connor RC. Self-harm and suicide in adolescents. The Lancet. 2012;379(9834):2373–82.
  6. Hawton K, Fagg J, Simkin S, Bale E, Bond A. Deliberate self-harm in adolescents in Oxford, 1985–1995. Journal of adolescence. 2000;23(1):47–55. pmid:10700371
  7. Iorfino F, Hermens DF, Cross SPM, Zmicerevska N, Nichles A, Groot J, et al. Prior suicide attempts predict worse clinical and functional outcomes in young people attending a mental health service. Journal of affective disorders. 2018;238:563–9. pmid:29940520
  8. Rickwood DJ, Mazzer KR, Telford NR, Parker AG, Tanti CJ, McGorry PD. Changes in psychological distress and psychosocial functioning in young people visiting headspace centres for mental health problems. The Medical Journal of Australia. 2015;202(10):537–42. pmid:26021366
  9. Nock MK, Green JG, Hwang I, McLaughlin KA, Sampson NA, Zaslavsky AM, et al. Prevalence, correlates, and treatment of lifetime suicidal behavior among adolescents: results from the National Comorbidity Survey Replication Adolescent Supplement. JAMA psychiatry. 2013;70(3):300–10. pmid:23303463
  10. Husky MM, Olfson M, He J-p, Nock MK, Swanson SA, Merikangas KR. Twelve-month suicidal symptoms and use of services among adolescents: results from the National Comorbidity Survey. Psychiatric services. 2012;63(10):989–96. pmid:22910768
  11. Muehlenkamp JJ, Claes L, Havertape L, Plener PL. International prevalence of adolescent non-suicidal self-injury and deliberate self-harm. Child and adolescent psychiatry and mental health. 2012;6(1):10. pmid:22462815
  12. 12. Favril L, Yu R, Hawton K, Fazel S. Risk factors for self-harm in prison: a systematic review and meta-analysis. Lancet Psychiatry. 2020;7(8):682–91. pmid:32711709
  13. 13. Lim KX, Rijsdijk F, Hagenaars SP, Socrates A, Choi SW, Coleman JRI, et al. Studying individual risk factors for self-harm in the UK Biobank: A polygenic scoring and Mendelian randomisation study. PLoS Med. 2020;17(6):e1003137. pmid:32479557
  14. 14. Joiner TE Jr, Conwell Y, Fitzpatrick KK, Witte TK, Schmidt NB, Berlim MT, et al. Four studies on how past and current suicidality relate even when "everything but the kitchen sink" is covaried. Journal of abnormal psychology. 2005;114(2):291. pmid:15869359
  15. 15. Chang B, Franklin J, Ribeiro J, Fox K, Bentley K, Kleiman E, et al. Biological risk factors for suicidal behaviors: a meta-analysis. Translational psychiatry. 2016;6(9):e887. pmid:27622931
  16. 16. Fox KR, Franklin JC, Ribeiro JD, Kleiman EM, Bentley KH, Nock MK. Meta-analysis of risk factors for nonsuicidal self-injury. Clin Psychol Rev. 2015;42:156–67. pmid:26416295
  17. 17. Bentley KH, Franklin JC, Ribeiro JD, Kleiman EM, Fox KR, Nock MK. Anxiety and its disorders as risk factors for suicidal thoughts and behaviors: A meta-analytic review. Clinical psychology review. 2016;43:30–46. pmid:26688478
  18. 18. Huang X, Ribeiro JD, Franklin JC. The differences between individuals engaging in nonsuicidal self-injury and suicide attempt are complex (vs. complicated or simple). Frontiers in psychiatry. 2020;11.
  19. 19. Ribeiro J, Franklin J, Fox K, Bentley K, Kleiman E, Chang B, et al. Letter to the Editor: Suicide as a complex classification problem: machine learning and related techniques can advance suicide prediction-a reply to Roaldset (2016). Psychological medicine. 2016;46(9):2009–10. pmid:27091309
  20. 20. Kessler RC, Warner CH, Ivany C, Petukhova MV, Rose S, Bromet EJ, et al. Predicting suicides after psychiatric hospitalization in US Army soldiers: the Army Study to Assess Risk and Resilience in Servicemembers (Army STARRS). JAMA psychiatry. 2015;72(1):49–57. pmid:25390793
  21. 21. Simon GE, Johnson E, Lawrence JM, Rossom RC, Ahmedani B, Lynch FL, et al. Predicting suicide attempts and suicide deaths following outpatient visits using electronic health records. American Journal of Psychiatry. 2018;175(10):951–60. pmid:29792051
  22. 22. Barak-Corren Y, Castro VM, Javitt S, Hoffnagle AG, Dai Y, Perlis RH, et al. Predicting Suicidal Behavior From Longitudinal Electronic Health Records. American Journal of Psychiatry. 2017;174(2):154–62. pmid:27609239
  23. 23. Walsh CG, Ribeiro JD, Franklin JC. Predicting suicide attempts in adolescents with longitudinal clinical data and machine learning. Journal of child psychology and psychiatry. 2018. pmid:29709069
  24. 24. Walsh CG, Ribeiro JD, Franklin JC. Predicting Risk of Suicide Attempts Over Time Through Machine Learning. Clinical Psychological Science. 2017;5(3):457–69.
  25. 25. Carter G, Spittal MJ. Suicide Risk Assessment. Crisis. 2018;39(4):229–34. pmid:29972324
  26. 26. Belsher BE, Smolenski DJ, Pruitt LD, Bush NE, Beech EH, Workman DE, et al. Prediction models for suicide attempts and deaths: a systematic review and simulation. JAMA psychiatry. 2019. pmid:30865249
  27. 27. Kessler RC. Clinical Epidemiological Research on Suicide-Related Behaviors—Where We Are and Where We Need to Go. JAMA psychiatry. 2019. pmid:31188420
  28. 28. Simon GE, Shortreed SM, Coley RY. Positive predictive values and potential success of suicide prediction models. JAMA psychiatry. 2019;76(8):868–9. pmid:31241735
  29. 29. Vickers AJ, Elkin EB. Decision curve analysis: a novel method for evaluating prediction models. Med Decis Making. 2006;26(6):565–74. pmid:17099194
  30. 30. Kessler RC, Bossarte RM, Luedtke A, Zaslavsky AM, Zubizarreta JR. Suicide prediction models: a critical review of recent research with recommendations for the way forward. Molecular Psychiatry. 2020;25(1):168–79. pmid:31570777
  31. 31. Carpenter JS, Iorfino F, Cross S, Nichles A, Zmicerevska N, Crouse JJ, et al. Cohort profile: the Brain and Mind Centre Optymise cohort: tracking multidimensional outcomes in young people presenting for mental healthcare. BMJ Open. 2020;10(3):e030985. pmid:32229519
  32. 32. McGorry PD, Tanti C, Stokes R, Hickie IB, Carnell K, Littlefield LK, et al. headspace: Australia’s National Youth Mental Health Foundation- where young minds come first. Med J Aust. 2007;187:S68–S70. pmid:17908032
  33. 33. Scott EM, Hermens DF, Glozier N, Naismith SL, Guastella AJ, Hickie IB. Targeted primary care-based mental health services for young Australians. Med J Aust. 2012;196(2):136–40.
  34. 34. Iorfino F, Hermens DF, Cross SP, Zmicerevska N, Nichles A, Badcock C-A, et al. Delineating the trajectories of social and occupational functioning of young people attending early intervention mental health services in Australia: a longitudinal study. BMJ Open. 2018;8(3):e020678. pmid:29588325
  35. 35. Goldman HH, Skodol AE, Lave TR. Revising axis V for DSM-IV: A review of measures of social functioning. Am J Psychiatry. 1992;149:1148–56. pmid:1386964
  36. 36. Kendall T, Taylor C, Bhatti H, Chan M, Kapur N, Group GD. Longer term management of self harm: summary of NICE guidance. BMJ. 2011;343:d7073. pmid:22113565
  37. 37. Carter G, Page A, Large M, Hetrick S, Milner AJ, Bendit N, et al. Royal Australian and New Zealand College of Psychiatrists clinical practice guideline for the management of deliberate self-harm. Australian & New Zealand Journal of Psychiatry. 2016;50(10):939–1000. pmid:27650687
  38. 38. Luo W, Phung D, Tran T, Gupta S, Rana S, Karmakar C, et al. Guidelines for Developing and Reporting Machine Learning Predictive Models in Biomedical Research: A Multidisciplinary View. Journal of medical Internet research. 2016;18(12):e323. pmid:27986644
  39. 39. Demetriou EA, Park SH, Ho N, Pepper KL, Song YJC, Naismith SL, et al. Machine Learning for Differential Diagnosis Between Clinical Conditions With Social Difficulty: Autism Spectrum Disorder, Early Psychosis, and Social Anxiety Disorder. Frontiers in Psychiatry. 2020;11(545).
  40. 40. Chawla NV, Bowyer KW, Hall LO, Kegelmeyer WP. SMOTE: synthetic minority over-sampling technique. J Artif Int Res. 2002;16(1):321–57.
  41. 41. Tomek I. An Experiment with the Edited Nearest-Neighbor Rule. IEEE Transactions on Systems, Man, and Cybernetics. 1976;SMC-6(6):448–52.
  42. 42. Wolpert DH. The Lack of A Priori Distinctions Between Learning Algorithms. Neural Computation. 1996;8(7):1341–90.
  43. 43. Fernández-Delgado M, Cernadas E, Barro S, Amorim D. Do we need hundreds of classifiers to solve real world classification problems? J Mach Learn Res. 2014;15(1):3133–81.
  44. 44. Guyon I, Elisseeff A. An introduction to variable and feature selection. Journal of Machine Learning Research. 2003;3:1157–82.
  45. 45. Walsh CG, Ribeiro JD, Franklin JC. Predicting suicide attempts in adolescents with longitudinal clinical data and machine learning. Journal of Child Psychology and Psychiatry. 2018;0(0). pmid:29709069
  46. 46. Calle ML, Urrea V, Boulesteix AL, Malats N. AUC-RF: A New Strategy for Genomic Profiling with Random Forest. Human Heredity. 2011;72(2):121–32. pmid:21996641
  47. 47. Kursa MB, Rudnicki WR. Feature Selection with the Boruta Package. Journal of Statistical Software. 2010;36(11):13.
  48. 48. Tibshirani R. Regression Shrinkage and Selection via the Lasso. Journal of the Royal Statistical Society Series B (Methodological). 1996;58(1):267–88.
  49. 49. Zou H, Hastie T. Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B (Statistical Methodology). 2005;67(2):301–20.
  50. 50. Chipman HA, George EI, McCulloch RE. BART: Bayesian additive regression trees. Ann Appl Stat. 2010;4(1):266–98.
  51. 51. Saito T, Rehmsmeier M. The precision-recall plot is more informative than the ROC plot when evaluating binary classifiers on imbalanced datasets. PloS one. 2015;10(3):e0118432. pmid:25738806
  52. 52. Brier GW. Verification of forecasts expressed in terms of probability. Monthly Weather Review. 1950;78(1):1–3.
  53. 53. Vickers AJ, van Calster B, Steyerberg EW. A simple, step-by-step guide to interpreting decision curve analysis. Diagnostic and Prognostic Research. 2019;3(1):18. pmid:31592444
  54. 54. R Core Team. R: A language and environment for statistical computing. R Foundation for Statistical Computing; 2017.
  55. 55. Liaw A, Wiener M. Classification and Regression by randomForest. R News. 2002;2(3):18–22.
  56. 56. Kapelner A, Bleich J. bartMachine: Machine Learning with Bayesian Additive Regression Trees. Journal of Statistical Software. 2016;70(4):40.
  57. 57. Friedman J, Hastie T, Tibshirani R. Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of statistical software. 2010;33(1):1–22. pmid:20808728
  58. 58. Kuhn M. caret: Classification and Regression Training. R package version 6.0-77. 2017.
  59. 59. Maechler M, Rousseeuw P, Struyf A, Hubert M, Hornik K. cluster: Cluster Analysis Basics and Extensions. R package version 2.0.7-1. 2018.
  60. 60. Wickham H, François R, Henry L, Müller K. dplyr: A grammar of data manipulation. R package version 0.7.6. 2018.
  61. 61. Wickham H. ggplot2: Elegant graphics for data analysis. New York: Springer-Verlag; 2016.
  62. 62. Brown M. rmda: Risk Model Decision Analysis. R package version 1.6. 2018.
  63. 63. Carter G, Milner A, McGill K, Pirkis J, Kapur N, Spittal MJ. Predicting suicidal behaviours using clinical instruments: systematic review and meta-analysis of positive predictive values for risk scales. The British Journal of Psychiatry. 2017;210(6):387–95. pmid:28302700
  64. 64. Iorfino F, Cross SP, Davenport T, Carpenter JS, Scott E, Shiran S, et al. A Digital Platform Designed for Youth Mental Health Services to Deliver Personalized and Measurement-Based Care. Frontiers in psychiatry. 2019;10(595). pmid:31507465
  65. 65. Iorfino F, Scott EM, Carpenter JS, Cross SP, Hermens DF, Killedar M, et al. Clinical Stage Transitions in Persons Aged 12 to 25 Years Presenting to Early Intervention Mental Health Services With Anxiety, Mood, and Psychotic Disorders. JAMA Psychiatry. 2019.
  66. 66. Cooper J, Steeg S, Bennewith O, Lowe M, Gunnell D, House A, et al. Are hospital services for self-harm getting better? An observational study examining management, service provision and temporal trends in England. BMJ open. 2013;3(11):e003444. pmid:24253029
  67. 67. Quinlivan L, Cooper J, Steeg S, Davies L, Hawton K, Gunnell D, et al. Scales for predicting risk following self-harm: an observational study in 32 hospitals in England. BMJ open. 2014;4(5):e004732. pmid:24793255
  68. 68. Ægisdóttir S, White MJ, Spengler PM, Maugherman AS, Anderson LA, Cook RS, et al. The meta-analysis of clinical judgment project: Fifty-six years of accumulated research on clinical versus statistical prediction. The Counseling Psychologist. 2006;34(3):341–82.
  69. 69. Woodford R, Spittal MJ, Milner A, McGill K, Kapur N, Pirkis J, et al. Accuracy of clinician predictions of future self‐harm: a systematic review and meta‐analysis of predictive studies. Suicide and Life‐Threatening Behavior. 2019;49(1):23–40. pmid:28972271
  70. 70. Stanley B, Mann JJ. The need for innovation in health care systems to improve suicide prevention. JAMA psychiatry. 2020;77(1):96–8. pmid:31577340
  71. 71. Iorfino F, Carpenter JS, Cross SP, Davenport TA, Hermens DF, Guastella AJ, et al. Multidimensional outcomes in youth mental health care: what matters and why? Medical Journal of Australia. 2019;211:S4–S11.
  72. 72. Gordon JA, Avenevoli S, Pearson JL. Suicide Prevention Research Priorities in Health Care. JAMA Psychiatry. 2020. pmid:32432690
  73. 73. Iorfino F, Davenport TA, Ospina-Pinillos L, Hermens DF, Cross S, Burns J, et al. Using New and Emerging Technologies to Identify and Respond to Suicidality Among Help-Seeking Young People: A Cross-Sectional Study (vol 19, e310, 2017). Journal of Medical Internet Research. 2017;19(10).
  74. 74. Comtois KA, Kerbrat AH, DeCou CR, Atkins DC, Majeres JJ, Baker JC, et al. Effect of augmenting standard care for military personnel with brief caring text messages for suicide prevention: a randomized clinical trial. JAMA psychiatry. 2019;76(5):474–83. pmid:30758491
  75. 75. Stanley B, Brown GK, Brenner LA, Galfalvy HC, Currier GW, Knox KL, et al. Comparison of the safety planning intervention with follow-up vs usual care of suicidal patients treated in the emergency department. JAMA psychiatry. 2018;75(9):894–900. pmid:29998307
  76. 76. McCarthy JF, Bossarte RM, Katz IR, Thompson C, Kemp J, Hannemann CM, et al. Predictive modeling and concentration of the risk of suicide: implications for preventive interventions in the US Department of Veterans Affairs. American journal of public health. 2015;105(9):1935–42. pmid:26066914
  77. 77. Strobl C, Malley J, Tutz G. An introduction to recursive partitioning: rationale, application, and characteristics of classification and regression trees, bagging, and random forests. Psychological methods. 2009;14(4):323. pmid:19968396
  78. 78. Aggarwal S, Patton G, Reavley N, Sreenivasan SA, Berk M. Youth self-harm in low- and middle-income countries: Systematic review of the risk and protective factors. International Journal of Social Psychiatry. 2017;63(4):359–75. pmid:28351292
  79. 79. Haw C, Hawton K, Houston K, Townsend E. Psychiatric and personality disorders in deliberate self-harm patients. British Journal of Psychiatry. 2001;178(1):48–54. pmid:11136210
  80. 80. Suominen K, Henriksson M, Suokas J, Isometsä E, Ostamo A, Lönnqvist J. Mental disorders and comorbidity in attempted suicide. Acta Psychiatrica Scandinavica. 1996;94(4):234–40. pmid:8911558
  81. 81. Nelson B, McGorry PD, Wichers M, Wigman JT, Hartmann JA. Moving from static to dynamic models of the onset of mental disorder: a review. JAMA psychiatry. 2017;74(5):528–34. pmid:28355471
  82. 82. Schrijvers DL, Bollen J, Sabbe BG. The gender paradox in suicidal behavior and its impact on the suicidal process. Journal of affective disorders. 2012;138(1–2):19–26. pmid:21529962
  83. 83. Andrewes HE, Hulbert C, Cotton SM, Betts J, Chanen AM. Relationships between the frequency and severity of non-suicidal self-injury and suicide attempts in youth with borderline personality disorder. Early Intervention in Psychiatry. 2019;13(2):194–201. pmid:28718985