
Integrating human services and criminal justice data with claims data to predict risk of opioid overdose among Medicaid beneficiaries: A machine-learning approach

  • Wei-Hsuan Lo-Ciganic,

    Roles Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Resources, Software, Supervision, Validation, Visualization, Writing – original draft, Writing – review & editing

    Affiliations Department of Pharmaceutical Outcomes & Policy, College of Pharmacy, University of Florida, Gainesville, FL, United States of America, Center for Drug Evaluation and Safety (CoDES), University of Florida, Gainesville, FL, United States of America

  • Julie M. Donohue ,

    Roles Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Supervision, Visualization, Writing – review & editing

    jdonohue@pitt.edu

    Affiliations Department of Health Policy and Management, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA, United States of America, Center for Pharmaceutical Policy and Prescribing, Health Policy Institute, University of Pittsburgh, Pittsburgh, PA, United States of America

  • Eric G. Hulsey,

    Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Project administration, Resources, Supervision, Visualization, Writing – review & editing

    Affiliations Vital Strategies, Overdose Prevention Program, Pittsburgh, PA, United States of America, Allegheny County Department of Human Services, Office of Analytics, Technology and Planning, Pittsburgh, PA, United States of America, Department of Behavioral and Community Health Sciences, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA, United States of America

  • Susan Barnes,

    Roles Conceptualization, Formal analysis, Methodology, Writing – review & editing

    Affiliations Department of Health Policy and Management, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA, United States of America, Center for Pharmaceutical Policy and Prescribing, Health Policy Institute, University of Pittsburgh, Pittsburgh, PA, United States of America

  • Yuan Li,

    Roles Conceptualization, Data curation, Formal analysis, Methodology, Project administration, Resources, Writing – review & editing

    Affiliation Allegheny County Department of Human Services, Office of Analytics, Technology and Planning, Pittsburgh, PA, United States of America

  • Courtney C. Kuza,

    Roles Conceptualization, Formal analysis, Methodology, Project administration, Resources, Writing – review & editing

    Affiliation Center for Pharmaceutical Policy and Prescribing, Health Policy Institute, University of Pittsburgh, Pittsburgh, PA, United States of America

  • Qingnan Yang,

    Roles Conceptualization, Formal analysis, Methodology, Writing – review & editing

    Affiliation Center for Pharmaceutical Policy and Prescribing, Health Policy Institute, University of Pittsburgh, Pittsburgh, PA, United States of America

  • Jeanine Buchanich,

    Roles Conceptualization, Formal analysis, Methodology, Writing – review & editing

    Affiliations Department of Biostatistics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA, United States of America, Center for Occupational Biostatistics and Epidemiology, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA, United States of America

  • James L. Huang,

    Roles Conceptualization, Formal analysis, Methodology, Writing – review & editing

    Affiliations Department of Pharmaceutical Outcomes & Policy, College of Pharmacy, University of Florida, Gainesville, FL, United States of America, Center for Drug Evaluation and Safety (CoDES), University of Florida, Gainesville, FL, United States of America

  • Christina Mair,

    Roles Conceptualization, Formal analysis, Methodology, Writing – review & editing

    Affiliation Department of Behavioral and Community Health Sciences, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA, United States of America

  • Debbie L. Wilson,

    Roles Project administration, Resources, Writing – review & editing

    Affiliation Department of Pharmaceutical Outcomes & Policy, College of Pharmacy, University of Florida, Gainesville, FL, United States of America

  • Walid F. Gellad

    Roles Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Supervision, Visualization, Writing – review & editing

    Affiliations Center for Pharmaceutical Policy and Prescribing, Health Policy Institute, University of Pittsburgh, Pittsburgh, PA, United States of America, Division of General Internal Medicine, School of Medicine, University of Pittsburgh, Pittsburgh, PA, United States of America, Center for Health Equity Research Promotion, Veterans Affairs Pittsburgh Healthcare System, Pittsburgh, PA, United States of America

Abstract

Health system data incompletely capture the social risk factors for drug overdose. This study aimed to improve the accuracy of a machine-learning algorithm to predict opioid overdose risk by integrating human services and criminal justice data with health claims data to capture the social determinants of overdose risk. This prognostic study included Medicaid beneficiaries (n = 237,259) in Allegheny County, Pennsylvania, enrolled between 2015 and 2018 and randomly divided into training, testing, and validation samples. We measured 290 potential predictors (239 derived from Medicaid claims data) in 30-day periods, beginning with the first observed Medicaid enrollment date during the study period. Using a gradient boosting machine, we predicted a composite outcome (i.e., fatal or nonfatal opioid overdose constructed using medical examiner and claims data) in the subsequent month. We compared prediction performance between a Medicaid claims only model and a model integrating human services and criminal justice data with Medicaid claims (i.e., integrated model) using several metrics (e.g., C-statistic, number needed to evaluate [NNE] to identify one overdose). Beneficiaries were stratified into risk-score decile subgroups. The samples (training = 79,087, testing = 79,086, validation = 79,086) had similar characteristics (age = 38±18 years, female = 56%, White = 48%, ≥1 overdose during the study period = 1.7%). Using the validation sample, the integrated model slightly improved on the Medicaid claims only model (C-statistic = 0.885; 95%CI = 0.877–0.892 vs. C-statistic = 0.871; 95%CI = 0.863–0.878), with small corresponding improvements in the NNE and positive predictive value. Nine of the top 30 most important predictors in the integrated model were human services and criminal justice variables. Using the integrated model, approximately 70% of individuals with overdoses were members of the top risk decile (overdose rate in the subsequent month = 47/10,000 beneficiaries). Few individuals in the bottom 9 deciles had overdose episodes (0–12/10,000). Machine-learning algorithms integrating claims with social services and criminal justice data modestly improved opioid overdose prediction among Medicaid beneficiaries in a large U.S. county heavily affected by the opioid crisis.

Introduction

In the United States (U.S.), opioid overdose deaths quintupled from 1999 to 2017, with 47,600 opioid overdose deaths in 2017 [1]. The total annual cost for opioid overdose, abuse and dependence is greater than $78 billion, and includes health care costs along with lost productivity and criminal justice system costs [2].

To mitigate the opioid crisis, stakeholders (e.g., payers, health systems, and policy makers) have developed policies and programs, such as increased naloxone distribution and improved access to medications for opioid use disorder (OUD); however, these interventions do not always target those most at risk. Current methods to identify individuals at high risk for overdose use simple criteria (e.g., high opioid dose measured by morphine milligram equivalent) and have significant limitations [3,4]. For example, recent studies suggest that the opioid high-risk measures used by the Centers for Medicare & Medicaid Services (CMS) miss more than 90% of beneficiaries with an overdose or OUD diagnosis [5,6].

Social determinants of health that fundamentally shape individuals’ health, risk behaviors, access to health resources, and social support systems are associated with risk of OUD and opioid overdose [7]. For example, individuals involved with the criminal justice system or who are homeless may be at particularly high risk of OUD or opioid overdose [8]. However, social determinants of health data are managed by different agencies and are often not integrated in ways that facilitate their use with health care data. In 2017, the President’s Commission on Combating Drug Addiction and the Opioid Crisis recommended strengthening data integration across different agencies and systems and using advanced data analytics to better identify individuals at high risk of overdose [9,10]. Linking Medicaid claims with public human services data can account for important social determinants (e.g., incarceration and social services use) of opioid overdose [8]. Death certificate data can capture some overdoses that do not receive medical attention. Furthermore, recent studies indicate the shortcomings of opioid risk prediction tools in current use and recommend the development of more advanced models to better identify individuals who are at risk (or no risk) of an opioid overdose [5,11–14]. Machine-learning techniques may improve opioid-overdose risk prediction because of their capability to handle a large number of variables and complex interactions [6,15–17], as we have recently demonstrated among Medicare beneficiaries.

In this analysis, we used integrated human services and Medicaid claims data from the Allegheny County Data Warehouse in Pennsylvania to develop and validate a machine-learning algorithm to improve prediction of opioid overdose among Medicaid beneficiaries. Pennsylvania ranks second among US states in drug overdose mortality, with an opioid-related overdose death rate of 35 per 100,000 population in Allegheny County in 2018 [1,18]. The county maintains a unique data warehouse that includes criminal justice records from the courts and the county jail, medical examiner’s autopsy data, Medicaid claims data, and other social services data for county residents. In addition to creating the machine-learning algorithm, we stratified beneficiaries into subgroups with similar risks of developing an opioid overdose to inform a range of interventions by health, human services, or criminal justice systems.

Materials and methods

Design and data sources

This prognostic study used a retrospective cohort design. It was approved by the University of Pittsburgh Institutional Review Board. This study followed the Standards for Reporting of Diagnostic Accuracy (STARD) and the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) reporting guidelines (S1 and S2 Appendices) [19,20].

We obtained data from the Allegheny County Department of Human Services (ACDHS) Data Warehouse in Pennsylvania, U.S. The ACDHS Data Warehouse is a central electronic repository of social, human services, and health data on clients and the services financed and/or managed directly by ACDHS and those delivered by a variety of other government entities [21]. As of 2018, the ACDHS Data Warehouse contained population-level data from more than 1.2 million clients residing in Allegheny County from over 30 sources. We limited analyses to Medicaid beneficiaries because of their disproportionately high risk of drug overdose [2224]. Among Medicaid beneficiaries in Allegheny County eligible for this analysis, we merged claims data with encounter data to capture the use of physical and behavioral health services financed by Medicaid, records of county-funded behavioral health services, records on dates of incarceration in Allegheny County Jail, criminal courts’ offense records from the Magisterial District Court or Court of Common Pleas, and several other publicly-funded human services encounter data including receipt of aging, child welfare system, homelessness and housing supports, independent living, intellectual disability services, and other public benefits services [21,25]. We also linked to autopsy reports from the Allegheny County Medical Examiner’s Office to identify persons who died of opioid overdose in Allegheny County [21,25].

Study sample

We identified Medicaid beneficiaries who were Allegheny County residents at any time from 2015 to 2018. The index date was defined as the first observed date of Medicaid enrollment during our study period. We excluded beneficiaries who: (1) were under 12 years of age, because of this group’s very low opioid-related mortality; (2) had invalid dates (e.g., multiple death dates, an index date before the date of birth or after the date of death); or (3) had a fatal opioid overdose in the first 30 days after the index date, because of a lack of predictor information (see S1 Fig). Beneficiaries remained in the cohort once eligible, until censored because of death or Medicaid disenrollment.

Outcome variables: Opioid overdose

The primary outcome was a composite end point of any occurrence of either a fatal opioid overdose recorded in medical examiner data or an opioid overdose event captured in Medicaid claims, both measured in 30-day periods from the index date of Medicaid enrollment [6]. We chose 30-day periods to better measure the immediate risk of individuals recently released from jail. We used International Classification of Diseases codes (ICD-9/ICD-10; S1 Table) to identify overdose events from prescription or other opioids, including heroin, in inpatient or emergency department (ED) settings; these events were overwhelmingly non-fatal, as <1% had a fatal overdose documented within 7 days. Hereafter, we use the term non-fatal opioid overdose to refer to these overdose events. Non-fatal overdose was defined as an opioid overdose code as the primary diagnosis, or another drug overdose or substance use disorder code as the primary diagnosis (S2 Table) with opioid overdose as a non-primary diagnosis, as defined previously [14]. Fatal opioid overdose was measured using the medical examiner’s autopsy data and defined as an accidental overdose death in which an opioid (e.g., oxycodone, fentanyl, heroin) was the primary cause or a contributing factor. In secondary analyses, we examined non-fatal opioid overdoses (measured in claims) and fatal opioid overdoses separately.
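
For concreteness, the sketch below (our illustration, not the study’s code) shows one way such a composite 30-day outcome could be assembled from claims-based nonfatal overdose events and medical examiner death records; the DataFrame and column names (episodes, claims_overdoses, me_deaths, event_date, death_date) are hypothetical.

```python
import pandas as pd

def flag_composite_overdose(episodes, claims_overdoses, me_deaths):
    """episodes: one row per beneficiary x 30-day period (id, period_start, period_end).
    claims_overdoses: nonfatal opioid overdose events from inpatient/ED claims (id, event_date).
    me_deaths: fatal opioid overdoses from medical examiner autopsy data (id, death_date)."""
    events = pd.concat([
        claims_overdoses.rename(columns={"event_date": "date"})[["id", "date"]],
        me_deaths.rename(columns={"death_date": "date"})[["id", "date"]],
    ])
    merged = episodes.merge(events, on="id", how="left")
    # An episode is positive if any fatal or nonfatal overdose falls inside its 30-day window.
    merged["overdose"] = merged["date"].between(merged["period_start"], merged["period_end"])
    return merged.groupby(["id", "period_start"], as_index=False)["overdose"].max()
```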

Candidate predictors

We compiled 290 candidate predictors identified from prior literature [25–48] and our previous work (S3 Table) [6], including socio-demographics, health status, prescription opioid use patterns, and human services and criminal justice records to account for some of the social determinants of health. All of the candidate predictors were measured in 30-day periods. Patient sociodemographic characteristics included age, sex, race, type of Medicaid eligibility, and duration of continuous enrollment in Medicaid each month. Health status factors (e.g., number of ED visits, mental health disorders, history of opioid overdose) were derived from the literature and used in our prior work [6]. We included candidate predictors related to prescription opioid use (e.g., average morphine milligram equivalent [MME]) and other relevant medication use (e.g., benzodiazepines, gabapentinoids) when prescription claims data were available.

From ACDHS, we included monthly indicators of receipt of over a dozen publicly-funded human services programs, including receipt of aging, child welfare system, homelessness and housing supports, independent living, intellectual disability services, and other public benefits services. For criminal justice, we included 18 indicators. Using records from Allegheny County Jail, we included an indicator for any jail release in the 30-day period, along with a continuous variable for the number of jail releases in each 30-day period. Using information from the Magisterial District Court and Court of Common Pleas, we constructed variables for the number and type of criminal offenses in 30-day periods, across 8 offense types and 2 offense levels (e.g., drug misdemeanor, drug felony, property misdemeanor, property felony). We measured and updated candidate predictors in 30-day periods after the index date to account for changes in predictors over time when predicting the risk of opioid overdose in the subsequent 30-day period (S2 Fig). This time-updating approach for predicting opioid-overdose risk in the subsequent 30 days mimics the active surveillance a health system might conduct [6].
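
As a simplified illustration of this time-updated design (our sketch under assumed data structures, not the study’s pipeline), the function below counts dated events, such as jail releases, within each 30-day window and pairs each window’s predictors with the following window’s outcome period; the event_date column and the 30-day spacing are assumptions.

```python
import pandas as pd

def monthly_features(events, index_date, n_periods, col="n_events"):
    """events: DataFrame of dated records (e.g., jail releases) for one beneficiary,
    with an 'event_date' column. Returns one row per 30-day period after index_date."""
    starts = pd.date_range(index_date, periods=n_periods, freq="30D")
    rows = []
    for start in starts:
        end = start + pd.Timedelta(days=30)
        n = int(((events["event_date"] >= start) & (events["event_date"] < end)).sum())
        rows.append({"period_start": start, col: n})
    feat = pd.DataFrame(rows)
    # Predictors measured in period t are used to predict overdose in period t+1.
    feat["target_period_start"] = feat["period_start"] + pd.Timedelta(days=30)
    return feat
```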

Gradient boosting machine approach and prediction performance evaluation

Our analysis comprised two steps: (1) creating an opioid-overdose risk prediction score for each 30-day period for each individual, and then (2) stratifying individuals into subgroups with similar overdose risks. First, we randomly and equally divided the cohort into training (developing algorithms), testing (refining algorithms), and validation (evaluating the algorithm’s prediction performance) samples. We developed and tested prediction algorithms for opioid overdose using a gradient boosting machine (GBM; Appendix Methods in S3 Appendix). Studies consistently show that GBM yields the best prediction results, with an ability to handle complex interactions, which are likely given the multifaceted nature of opioid overdose [6,49]. Beneficiaries could contribute multiple 30-day periods until a censoring event (disenrollment or death). Sensitivity analyses were conducted using iterative patient-level random subsets from the validation data (i.e., using one randomly selected 30-day period of measured predictors per patient to predict that patient’s risk in the subsequent month) to ensure the findings’ robustness.
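
The study fit its GBM in Salford Predictive Modeler; purely as an illustration of the workflow, the sketch below expresses the same idea with scikit-learn’s histogram-based gradient boosting, splitting at the beneficiary level so that all of one person’s 30-day episodes stay in the same sample. The feature matrix X, outcome vector y, beneficiary_ids array, and all hyperparameter values are assumptions, not the study’s settings.

```python
import numpy as np
from sklearn.ensemble import HistGradientBoostingClassifier
from sklearn.model_selection import GroupShuffleSplit

def fit_gbm(X, y, beneficiary_ids, seed=42):
    """X: episode-level predictors; y: overdose in the subsequent 30 days (0/1)."""
    # Hold out one third of beneficiaries (the study used equal training/testing/validation thirds).
    splitter = GroupShuffleSplit(n_splits=1, test_size=1/3, random_state=seed)
    train_idx, holdout_idx = next(splitter.split(X, y, groups=beneficiary_ids))
    model = HistGradientBoostingClassifier(max_iter=500, learning_rate=0.05, random_state=seed)
    model.fit(X[train_idx], y[train_idx])
    risk_scores = model.predict_proba(X[holdout_idx])[:, 1]  # predicted overdose probability
    return model, holdout_idx, risk_scores
```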

To assess whether the integrated data (i.e., health, human services, and criminal justice system data) improved discrimination (i.e., the extent to which predicted high-risk patients exhibit higher opioid-overdose rates than those predicted to be low risk) relative to Medicaid claims data alone, we compared the C-statistics (0.7 to 0.8: good; >0.8: very good) and precision-recall curves between the two GBM models [50]. Given that opioid-overdose events are rare outcomes and C-statistics do not incorporate outcome prevalence information, we also report metrics including sensitivity, specificity, positive predictive value (PPV), negative predictive value, positive likelihood ratio (PLR), negative likelihood ratio (NLR), number needed to evaluate to identify one overdose (NNE), and estimated rate of alerts (S3 Fig) [51,52].
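
To make these alert-oriented metrics concrete, the following sketch (ours; the threshold and variable names are assumptions) computes them from predicted risk scores at a chosen probability cutoff, with NNE defined as 1/PPV and the alert rate as the share of episodes flagged.

```python
import numpy as np

def alert_metrics(y_true, risk_scores, threshold):
    """Classification-style metrics at one probability cutoff for a rare outcome."""
    y_true = np.asarray(y_true).astype(bool)
    flagged = np.asarray(risk_scores) >= threshold
    tp = np.sum(flagged & y_true)
    fp = np.sum(flagged & ~y_true)
    fn = np.sum(~flagged & y_true)
    tn = np.sum(~flagged & ~y_true)
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    npv = tn / (tn + fn) if (tn + fn) else float("nan")
    return {
        "sensitivity": sens, "specificity": spec, "PPV": ppv, "NPV": npv,
        "PLR": sens / (1 - spec), "NLR": (1 - sens) / spec,
        "NNE": 1 / ppv,                          # evaluations needed to find one true overdose
        "alerts_per_100": 100 * flagged.mean(),  # flagged episodes per 100 evaluated
    }
```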

No single prediction probability threshold suits every purpose, so we present these metrics at multiple levels of sensitivity and specificity (e.g., arbitrarily choosing 90% sensitivity as an anchor). We also classified the validation sample’s beneficiaries into subgroups using the decile of their predicted overdose risk score, with the highest decile split into three strata based on the top 1st, 2nd to 5th, and 6th to 10th percentiles to allow closer examination of patients at highest risk of developing opioid overdose. We evaluated calibration plots (extent to which predicted opioid-overdose risk agreed with observed risks) by risk subgroup. Furthermore, we examined age, sex, and race differences by risk subgroup based on the prior literature indicating demographic-specific differences in substance use [5355].
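
As a sketch of the risk-stratification step (our illustration; the percentile cut points follow the description above, but the interface is assumed), beneficiaries can be binned by the percentile rank of their predicted risk score, with the top decile further split into the top 1st, 2nd to 5th, and 6th to 10th percentile strata, and with decile 1 denoting the highest-risk decile as in the results below.

```python
import numpy as np
import pandas as pd

def risk_subgroups(risk_scores):
    """Label each beneficiary by decile of predicted risk (decile 1 = highest risk),
    splitting the top decile into the top 1st, 2nd-5th, and 6th-10th percentiles."""
    s = pd.Series(risk_scores)
    pct = s.rank(pct=True)                                        # percentile rank, in (0, 1]
    decile = np.ceil((1 - pct) * 10).astype(int).clip(lower=1)    # 1 = highest 10% of scores
    labels = decile.astype(str).radd("decile_")
    labels[pct > 0.99] = "top_1st_pctile"
    labels[(pct > 0.95) & (pct <= 0.99)] = "pctile_2_to_5"
    labels[(pct > 0.90) & (pct <= 0.95)] = "pctile_6_to_10"
    return labels
```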

We performed several additional analyses to ensure the algorithm’s practical utility. First, we report the 30 most important predictors, which indicate specific risk or protective factors among individuals. Rather than p values or coefficients, the GBM reports the importance of predictor variables included in a model. Importance is a measure of each variable’s cumulative contribution toward reducing the squared error, or heterogeneity within the subset, after the data set is sequentially split on that variable; thus, it reflects a variable’s impact on prediction. Absolute importance is then scaled to give relative importance, with a maximum importance of 100. Second, we conducted sensitivity analyses using 3-month measurement periods instead of 30-day periods because some prescription drug monitoring programs and health plans update data and evaluate risks quarterly [31,32,56].
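
For illustration only, the rescaling to a relative importance with a maximum of 100 can be sketched as below; here we substitute scikit-learn’s permutation importance for Salford Predictive Modeler’s internal split-based measure, so the numbers would not reproduce Fig 3, and all argument names are assumptions.

```python
import numpy as np
from sklearn.inspection import permutation_importance

def relative_importance(model, X_val, y_val, feature_names, seed=42):
    """Return the top 30 predictors scaled so the most important predictor = 100."""
    result = permutation_importance(model, X_val, y_val, n_repeats=5,
                                    random_state=seed, scoring="roc_auc")
    raw = np.clip(result.importances_mean, 0, None)   # treat negative noise as zero
    scaled = 100 * raw / raw.max()                    # relative importance, max = 100
    order = np.argsort(scaled)[::-1]
    return [(feature_names[i], round(float(scaled[i]), 1)) for i in order[:30]]
```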

Statistical analysis

We compared beneficiaries’ characteristics across the training, testing, and validation samples using two-tailed Student’s t-tests, chi-square tests, and analysis of variance, or corresponding nonparametric tests, as appropriate. Analyses were performed using SAS 9.4 (SAS Institute Inc, Cary, NC) and the Salford Predictive Modeler software suite v8.2 (Minitab LLC, State College, Pennsylvania, USA).

Results

Beneficiary characteristics

Our analysis followed beneficiaries for an average of 34 months and included 8,118,676 30-day episodes. The outcome distributions and characteristics of the samples of beneficiaries (i.e., training = 79,087, testing = 79,086, validation = 79,086) were similar (mean age = 37.9±18.2 years, 132,750 (56.0%) female, 114,345 (48.2%) White and 73,857 (31.1%) Black; Table 1). Among the three samples overall, 3,945 (1.7%) beneficiaries had ≥1 opioid-overdose episode, 951 individuals (0.40%) had fatal opioid overdose, and 3,209 (1.4%) individuals had nonfatal overdose. Among individuals with fatal overdose, 207 (21.8%) had a prior non-fatal overdose.

Table 1. Characteristics of Medicaid beneficiaries (n = 237,259) in Allegheny County, Pennsylvania USA, divided into training, testing, and validation samples.

https://doi.org/10.1371/journal.pone.0248360.t001

Prediction performance using gradient boosting machine (GBM)

The four prediction performance measures for GBM models with Medicaid claims only versus with integrated data are summarized in Fig 1. At the episode level, the integrated model slightly improved on the Medicaid claims only model (C-statistic = 0.885, 95%CI = 0.877–0.892 vs. C-statistic = 0.871, 95%CI = 0.863–0.878; Fig 1A). Based on the area under the curve, the integrated model also had slightly better precision-recall performance (Fig 1B).

Fig 1. Performance matrix for predicting opioid overdose using gradient boosting machine in the integrated model vs. Medicaid claims only model in Allegheny County, Pennsylvania USA.

Fig 1 shows four prediction performance matrices in the validation sample (79,086 beneficiaries with 2,700,425 non-overdose episodes and 1,748 overdose episodes). Fig 1A shows the areas under the ROC curves (or C-statistics); Fig 1B shows the precision-recall curves (precision = PPV and recall = sensitivity): Precision recall curves that are closer to the upper right corner or are above another method have improved performance; Fig 1C shows the number needed to evaluate by different cutoffs of sensitivity; and Fig 1D shows alerts per 100 patients by different cutoffs of sensitivity. Abbreviations: AUC: Area under the curves; GBM: Gradient boosting machine; ROC: Receiver Operating Characteristics.

https://doi.org/10.1371/journal.pone.0248360.g001

S4 Table presents the prediction performance measures by sensitivity and specificity level (90%–100%). At the threshold that balances sensitivity and specificity (based on the Youden index), the integrated model improved modestly on the model with Medicaid claims only (integrated model: 80.8% sensitivity, 79.7% specificity, 0.26% PPV, 99.9% NPV, NNE = 389 to identify one opioid overdose, and 20 positive alerts/100 beneficiaries vs. the Medicaid claims only model: 79.6% sensitivity, 77.7% specificity, 0.23% PPV, 99.9% NPV, NNE = 435, and 23 positive alerts/100 beneficiaries). When sensitivity was set at 90% (i.e., attempting to identify 90% of actual opioid overdoses), the integrated model performed slightly better than the Medicaid claims only model (integrated model: 66.6% specificity, 0.17% PPV, 99.9% NPV, NNE = 574 to identify one opioid overdose, and 34 positive alerts/100 beneficiaries vs. the Medicaid claims only model: 64.6% specificity, 0.16% PPV, 99.9% NPV, NNE = 608, and 36 positive alerts/100 beneficiaries) (Fig 1C and 1D). When specificity was set at 90% (i.e., attempting to identify 90% of actual non-overdoses), the integrated model had 65.2% sensitivity, 0.42% PPV, 99.9% NPV, NNE = 238, and 10 positive alerts/100 beneficiaries. Sensitivity analyses using randomly and iteratively selected patient-level data yielded overall similar results (see S4A–S4D Fig for an example).
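
The balanced threshold referenced above comes from the Youden index (sensitivity + specificity − 1); a minimal sketch of selecting it from predicted risk scores, using scikit-learn’s ROC utilities rather than the study’s software, is shown below.

```python
import numpy as np
from sklearn.metrics import roc_curve

def youden_threshold(y_true, risk_scores):
    """Pick the score cutoff maximizing sensitivity + specificity - 1 (Youden index)."""
    fpr, tpr, thresholds = roc_curve(y_true, risk_scores)
    youden = tpr - fpr                     # equivalent to sensitivity + specificity - 1
    best = np.argmax(youden)
    return thresholds[best], tpr[best], 1 - fpr[best]   # threshold, sensitivity, specificity
```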

As shown in S5 and S6 Figs, similar to the main findings, the integrated model resulted in slight improvement from the models with Medicaid claims only for the two separate secondary outcomes (i.e., fatal vs. nonfatal overdose).

Risk stratification by decile risk subgroup

Fig 2 shows the observed opioid-overdose rate of individuals by decile subgroup, comparing the integrated model to the Medicaid claims only model. In the integrated model, the highest-risk subgroup (risk scores in the top 1st percentile; 1.2% [n = 974] of the validation cohort) had a 1.33% positive predictive value and an NNE of 74. Of the 65 individuals with an overdose in the validation cohort, 45 (69.2%) were in the top decile. Compared to the lower-risk groups, individuals in the top decile had at least a 35-fold higher overdose rate (e.g., observed overdose rate in the subsequent month: decile 1 = 0.47%, decile 2 = 0.12%). Overdose rates were minimal (0 to 3 per 10,000) in the 3rd through 10th decile subgroups. As shown in Table 2, those in the higher-risk groups (e.g., top 1st percentile) were generally more likely to be aged 25–34 or 35–44 years, male, and White.

Fig 2. Opioid overdose identified by decile risk subgroup in the validation sample (n = 79,086) using gradient boosting machine: Integrated vs. Medicaid claims only models a.

a: Based on the individual’s predicted probability of an opioid overdose (fatal/nonfatal) event, we classified beneficiaries in the validation sample into decile risk subgroups, with the highest decile further split into 3 additional strata based on the top 1st, 2nd to 5th, and 6th to 10th percentiles to allow closer examination of patients at highest risk of developing overdose.

https://doi.org/10.1371/journal.pone.0248360.g002

Table 2. Demographic profiles by risk subgroup in the gradient boosting machine integrated model in the validation sample (n = 79,086).

https://doi.org/10.1371/journal.pone.0248360.t002

Important predictors

Fig 3 shows the 30 most important predictors (out of 80 important predictors identified by the integrated model), with the top five predictors being age, OUD diagnosis identified from behavioral health claims, race, receipt of public benefit services, and Medicaid eligibility type. Nine of the 30 most important predictors were human services and criminal justice variables not measurable in Medicaid claims.

Fig 3. Top 30 predictors for opioid overdose (fatal/nonfatal) identified by gradient boosting machine model integrated with Department of Human Services and criminal justice records data (ordered by importance)a.

a Rather than p values or coefficients, the GBM reports the importance of predictor variables included in a model. Importance is a measure of each variable’s cumulative contribution toward reducing the squared error, or heterogeneity within the subset, after the data set is sequentially split based on that variable. Thus, it reflects a variable’s impact on prediction. Absolute importance is then scaled to give relative importance, with a maximum importance of 100. For example, the top 5 important predictors identified from GBM included age, opioid use disorder diagnosis identified from behavioral health claims, race, public benefit services, and Medicaid eligibility group. *: These variables were binary indicators derived from Allegheny County Data Warehouse that may be captured either from state public services data or Medicaid claims. Abbreviations: BH: Behavioral health; CP: Common Plea Court; CYF: Child, Youth, and Family; DHS: Department of Human Services; GBM: Gradient boosting machine; MD: Magisterial District Court; PH: Physical health; SUD: Substance use disorders.

https://doi.org/10.1371/journal.pone.0248360.g003

Sensitivity analyses using 3-month measurement periods performed similarly to the main analyses (S7 Fig).

Discussion

By combining Medicaid claims data with human services and criminal justice records in Allegheny County, Pennsylvania, a county heavily impacted by the opioid crisis, we developed machine-learning models with strong performance for predicting beneficiaries’ risk of opioid overdose. Our findings highlight the potential utility of machine-learning approaches when predicting opioid overdoses. Although the positive predictive value of the model was low, as expected given the low incidence of overdose in a 30-day period [51], the integrated model successfully segmented the sample into different risk groups based on the predicted risk scores. Approximately 90% of the population had a minimal risk of overdose, and the top decile group captured almost 70% of the beneficiaries who had an opioid overdose. The ability to identify these risk groups is important for payers and policy makers whose interventions are presently based on measures of risk that are less accurate [5]. The integrated models slightly improved on the models with Medicaid claims data only, highlighting the potential value of breaking down data silos to incorporate information on social determinants of health in addressing the opioid crisis.

We identified seven previously published opioid prediction models, each focusing on predicting opioid overdose over different time windows (e.g., 6 months), with only one linking to statewide corrections and hospital databases [6,43,46,48,57–59]. Most of these studies had key limitations, including relying on a single data source (such as administrative claims data, electronic health records, or prescription drug monitoring programs); measuring predictors only at baseline rather than over time; capturing only the first overdose episode; being unable to identify complex, non-linear, or non-intuitive relationships (interactions) between the predictors and outcomes; and having suboptimal prediction performance. Our study overcomes these limitations by using machine-learning approaches capable of incorporating non-linear or complex interactions, linking to human services use and criminal justice records data, and updating predictor measures monthly to reflect dynamic changes in conditions. We also used a population-based sample (including Medicaid beneficiaries with and without a filled opioid prescription) to predict beneficiaries’ overdose risk in the subsequent 30-day period instead of over a lengthy time period.

Our models integrating human services and criminal justice data resulted in slight improvements in prediction performance compared to using Medicaid claims data alone, indicating the role of key social determinants of health in opioid overdose prediction [5,8,60]. Although the improvement in performance was not large and varied with the choice of probability threshold, it was consistent across all measures of prediction performance. Creating state- or county-level data warehouses that link individual-level records across multiple public service systems is a promising and immediate strategy to guide public health interventions for those at high risk of overdose [10,59,61]. For example, Chapter 55 of the Acts of 2015 in Massachusetts allows comprehensive linkage across different datasets at the individual level to gain a deeper understanding of the circumstances influencing fatal and non-fatal overdoses [62,63]. Our risk prediction scores and risk stratification approach can be used in health care settings (e.g., the ED), and also by state, county, or other public health stakeholders and agencies (e.g., community behavioral health organizations, the criminal justice system), to more efficiently identify patients at high risk of overdose and target interventions in a timely manner.

Regardless of the probability threshold used to identify high-risk individuals in our study, the low incidence of overdose over 30-day periods resulted in low PPV (<2%). To maximize the clinical utility, and minimize the false positives, of any clinical tool for predicting rare outcomes, identifying subgroups with different risk profiles can provide guidance on how to target interventions and allocate limited resources more efficiently. First, the over 90% of individuals with minimal or no overdose risk can be excluded using our risk stratification strategy. Among the remaining individuals, those in the highest risk subgroup (e.g., top 1st percentile of predicted risk scores) may benefit from interventions offering close monitoring by case managers or other specialists, although these programs can be costly. Individuals in the moderate risk subgroups (e.g., top 2nd–5th percentile of risk scores) may benefit from lower cost or low-risk harm prevention approaches such as naloxone kit distribution [61]. Given the high morbidity and mortality from overdose, and given that interventions are currently being deployed, based on less powerful prediction criteria, to many individuals who are at much lower overdose risk [5,6], the false positive rate may be acceptable. Stakeholders or agencies implementing the prediction algorithm can choose thresholds for identifying individuals at high risk based on their interventions and resource capacities.

Our study has limitations. First, although we used a validated algorithm based on ICD codes to identify opioid-overdose events in medical claims (PPV = 81%–84%) [64] and medical examiner’s autopsy records to identify fatal overdoses, we could not capture nonfatal overdoses that did not receive medical attention or fatal overdoses that occurred outside of the county or state. Second, some potential predictors, such as socio-behavioral information (e.g., family history) and laboratory results, are not captured in our data but could improve the model. At the time of the study, we only had prescription claims for those receiving behavioral health services from the county. Although our algorithm was able to incorporate prescription information when available, its performance may be further improved when we have complete prescription claims data in the future. Third, our study describes the development and validation of the prediction model but does not address challenges to implementation. For example, technical issues related to updating risk scores in real time need to be considered. The demographic differences in risk subgroups that we observed indicate that health services may not be provided in an equitable manner based on race or socioeconomic status. These ethical issues and potential biases point to the importance of performing comprehensive bias assessments and identifying potential approaches to improve algorithm fairness prior to model implementation. Finally, prediction algorithms derived from the Medicaid population in a large county in Pennsylvania may not generalize to other states or populations with different demographic profiles or program benefits. However, the demographic characteristics of opioid-related overdose in Allegheny County were generally similar to those of other U.S. counties heavily impacted by overdose during the study period [65,66].

Conclusions

In conclusion, integrating human services and criminal justice data with Medicaid claims using machine learning showed small but potentially informative improvements in risk prediction for opioid overdose among Medicaid beneficiaries. These findings demonstrate the potential utility of machine-learning approaches for opioid-overdose risk prediction, and highlight the value of breaking down data silos allowing state, county or other public health stakeholders and agencies to provide more timely public health interventions.

Supporting information

S1 Fig. Sample size flow chart of study cohort.

https://doi.org/10.1371/journal.pone.0248360.s001

(DOCX)

S2 Fig. Study design diagram.

Each patient had at least one Medicaid enrollment data point between 2015 and 2018. The index date was defined as the first observed date of Medicaid enrollment during our study period. We followed patients in successive 30-day periods after the index date until they were censored because of death or disenrollment. We measured candidate predictors and opioid overdose episodes in each 30-day period.

https://doi.org/10.1371/journal.pone.0248360.s002

(DOCX)

S3 Fig. Classification matrix and definition of prediction performance metrics.

https://doi.org/10.1371/journal.pone.0248360.s003

(DOCX)

S4 Fig. Performance matrix for predicting opioid overdose between gradient boosting machine models with integrated data vs. Medicaid claims only data in Medicaid beneficiaries (Allegheny County, Pennsylvania): Sensitivity analyses using patient-level data.

Figure shows four prediction performance matrices using an example of using patient-level data (79,021 non-overdose and 65 overdose patients, excluding those who had an overdose from the first 30-day period) from the validation sample. S4A Fig shows the areas under the ROC curves (or C-statistics); S4B Fig shows the precision-recall curves (precision = PPV and recall = sensitivity)—precision recall curves that are closer to the upper right corner or above the other method have improved performance; S4C Fig shows the number needed to evaluate by different cutoffs of sensitivity; and S4D Fig shows alerts per 100 patients by different cutoffs of sensitivity. Abbreviations: AUC: Area under the curves; GBM: Gradient boosting machine; ROC: Receiver Operating Characteristics.

https://doi.org/10.1371/journal.pone.0248360.s004

(DOCX)

S5 Fig. Performance matrix for predicting opioid overdose between gradient boosting machine models with integrated data vs. Medicaid claims only models in Medicaid beneficiaries (Allegheny County, Pennsylvania): Nonfatal opioid overdose.

Figure shows four prediction performance matrices for predicting overdose in the subsequent 30 days at the episode level from the validation sample. S5A Fig shows the areas under ROC curves (or C-statistics); S5B Fig shows the precision-recall curves (precision = PPV and recall = sensitivity)—precision recall curves that are closer to the upper right corner or above the other method have improved performance; S5C Fig shows the number needed to evaluate by different cutoffs of sensitivity; and S5D Fig shows alerts per 100 patients by different cutoffs of sensitivity. Abbreviations: AUC: Area under the curves; GBM: Gradient boosting machine; ROC: Receiver Operating Characteristics.

https://doi.org/10.1371/journal.pone.0248360.s005

(DOCX)

S6 Fig. Performance matrix for predicting opioid overdose between gradient boosting machine models with integrated data vs. Medicaid claims only models in Medicaid beneficiaries (Allegheny County, Pennsylvania): Fatal opioid overdose.

Figure shows four prediction performance matrices for predicting overdose in the subsequent 30 days at the episode level from the validation sample. S6A Fig shows the areas under ROC curves (or C-statistics); S6B Fig shows the precision-recall curves (precision = PPV and recall = sensitivity)—precision recall curves that are closer to the upper right corner or above the other method have improved performance; S6C Fig shows the number needed to evaluate by different cutoffs of sensitivity; and S6D Fig shows alerts per 100 patients by different cutoffs of sensitivity. Abbreviations: AUC: Area under the curves; GBM: Gradient boosting machine; ROC: Receiver Operating Characteristics.

https://doi.org/10.1371/journal.pone.0248360.s006

(DOCX)

S7 Fig. Performance matrix for predicting opioid overdose between gradient boosting machine models with integrated vs. Medicaid claims only data in Medicaid beneficiaries (Allegheny County, Pennsylvania): Sensitivity analyses using 3-month windows.

Figure shows four prediction performance matrices for predicting overdose in the subsequent 3 months at the episode level from the validation sample. S7A Fig shows the areas under ROC curves (or C-statistics); S7B Fig shows the precision-recall curves (precision = PPV and recall = sensitivity)—precision recall curves that are closer to the upper right corner or above the other method have improved performance; S7C Fig shows the number needed to evaluate by different cutoffs of sensitivity; and S7D Fig shows alerts per 100 patients by different cutoffs of sensitivity. Abbreviations: AUC: Area under the curves; GBM: Gradient boosting machine; ROC: Receiver Operating Characteristics.

https://doi.org/10.1371/journal.pone.0248360.s007

(DOCX)

S1 Table. Diagnosis codes for identifying opioid overdose.

https://doi.org/10.1371/journal.pone.0248360.s008

(DOCX)

S2 Table. Other diagnosis codes used to identify the likelihood of opioid overdose.

* Excluding codes for opioid and heroin overdose. a: Based on Dunn KM, Saunders KW, Rutter CM, et al. Opioid prescriptions for chronic pain and overdose: A cohort study. Ann Intern Med. 2010; 152 (2):85–92 but excluding E950-959 (suicide and self-inflicted injury codes).

https://doi.org/10.1371/journal.pone.0248360.s009

(DOCX)

S3 Table. Summary of predictor candidates (n = 290) measured in 30-day windows for predicting subsequent opioid overdosesa.

a: Details of the operational definitions for each variable and the corresponding diagnosis and procedure codes and National Drug Codes can be provided upon request to the corresponding author. b: We used an “as-prescribed” approach that assumes patients took all prescribed opioids on the schedule recommended by their clinicians (Bohnert AS et al. JAMA. 2011;305(13):1315–1321. doi: 10.1001/jama.2011.370). Patients who received refills for the same drug at the same dose and schedule while still having opioid prescriptions within three days of a prior fill were assumed to have taken the medication from the prior fill before taking medication from the second fill (Gellad WF et al. Am J Public Health. 2018;108(2):248–255. doi: 10.2105/AJPH.2017.304174). c: We calculated the morphine milligram equivalent (MME) for each opioid prescription, defined as the quantity dispensed multiplied by the strength in milligrams, multiplied by a conversion factor (Bohnert AS et al. JAMA. 2011;305(13):1315–1321. doi: 10.1001/jama.2011.370). For each person, the average daily MME during the 30-day window was calculated by summing MMEs across all opioids and dividing by the number of days supplied. d: Data sources were obtained from the Allegheny County Department of Human Services Data Warehouse in Pennsylvania, U.S. Abbreviations: BZD: Benzodiazepines; DUI: Driving under the influence; LAO: Long-acting opioids; MME: Morphine milligram equivalent; No: Number of; SAO: Short-acting opioids; SUD: Substance use disorders.
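
A minimal sketch of the MME arithmetic described in notes b and c (our illustration; field names such as quantity, strength_mg, conversion_factor, and days_supply are hypothetical):

```python
def average_daily_mme(rx_fills):
    """rx_fills: list of dicts for one person's opioid fills in a 30-day window,
    each with quantity, strength_mg, conversion_factor, and days_supply."""
    # MME per fill = quantity dispensed x strength in mg x opioid-specific conversion factor.
    total_mme = sum(f["quantity"] * f["strength_mg"] * f["conversion_factor"] for f in rx_fills)
    total_days = sum(f["days_supply"] for f in rx_fills)
    # Average daily MME = total MME across all opioids / total days supplied in the window.
    return total_mme / total_days if total_days else 0.0
```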

https://doi.org/10.1371/journal.pone.0248360.s010

(DOCX)

S4 Table. Prediction performance measures for predicting opioid overdose (fatal/nonfatal) varying sensitivity and specificity using gradient boosting machine: With integrated data vs. Medicaid claims data only models.

a: Scores were calculated by predicted probability multiplied by 100. Score threshold refers to the score used to classify or predict individuals with opioid overdose (i.e., ≥ the threshold) vs. non-overdose (i.e., <threshold). b: Optimized threshold was calculated by the Youden Index to achieve balanced sensitivity and specificity. Abbreviations: GBM: Gradient boosting machine; INF: Infinity; N/A: Not able to calculate; NNE: Number needed to evaluate; NPV: Negative predictive values; PLR: Positive likelihood ratio; PPV: Positive predictive values; RF: Random forest.

https://doi.org/10.1371/journal.pone.0248360.s011

(DOCX)

S1 Appendix. Standards for Reporting of Diagnostic Accuracy (STARD).

https://doi.org/10.1371/journal.pone.0248360.s012

(DOCX)

S2 Appendix. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD).

https://doi.org/10.1371/journal.pone.0248360.s013

(DOCX)

Acknowledgments

Disclosure

The views presented here are those of the authors alone and do not necessarily represent the views of the Department of Veterans Affairs, Allegheny County, or the United States Government.

References

  1. Scholl L, Seth P, Kariisa M, Wilson N, Baldwin G. Drug and Opioid-Involved Overdose Deaths—United States, 2013–2017. MMWR Morb Mortal Wkly Rep. 2018;67(5152):1419–27. Epub 2019/01/04. pmid:30605448; PubMed Central PMCID: PMC6334822.
  2. Florence CS, Zhou C, Luo F, Xu L. The Economic Burden of Prescription Opioid Overdose, Abuse, and Dependence in the United States, 2013. Med Care. 2016;54(10):901–6. pmid:27623005.
  3. Smith SM, Dart RC, Katz NP, Paillard F, Adams EH, Comer SD, et al. Classification and definition of misuse, abuse, and related events in clinical trials: ACTTION systematic review and recommendations. Pain. 2013;154(11):2287–96. pmid:23792283.
  4. Cochran G, Woo B, Lo-Ciganic WH, Gordon AJ, Donohue JM, Gellad WF. Defining Nonmedical Use of Prescription Opioids Within Health Care Claims: A Systematic Review. Substance abuse. 2015;36(2):192–202. pmid:25671499.
  5. Wei YJ, Chen C, Sarayani A, Winterstein AG. Performance of the Centers for Medicare & Medicaid Services’ Opioid Overutilization Criteria for Classifying Opioid Use Disorder or Overdose. JAMA. 2019;321(6):609–11. Epub 2019/02/13. pmid:30747958.
  6. Lo-Ciganic WH, Huang JL, Zhang HH, Weiss JC, Wu Y, Kwoh CK, et al. Evaluation of Machine-Learning Algorithms for Predicting Opioid Overdose Risk Among Medicare Beneficiaries With Opioid Prescriptions. JAMA Netw Open. 2019;2(3):e190968. Epub 2019/03/23. pmid:30901048.
  7. Dasgupta N, Beletsky L, Ciccarone D. Opioid Crisis: No Easy Fix to Its Social and Economic Determinants. American Journal of Public Health. 2018;108(2):182–6. pmid:29267060
  8. Altekruse SF, Cosgrove CM, Altekruse WC, Jenkins RA, Blanco C. Socioeconomic risk factors for fatal opioid overdoses in the United States: Findings from the Mortality Disparities in American Communities Study (MDAC). PLoS One. 2020;15(1):e0227966. Epub 2020/01/18. pmid:31951640; PubMed Central PMCID: PMC6968850.
  9. The President’s Commission on Combating Drug Addiction and the Opioid Crisis. Final Report Draft. 2017 [cited 2019 December 29]. Available from: https://www.whitehouse.gov/sites/whitehouse.gov/files/images/Final_Report_Draft_11-15-2017.pdf.
  10. Smart R, Kase CA, Meyer A, Stein BD. U.S. Department of Health and Human Services Assistant Secretary for Planning and Evaluation (ASPE) Office of Health Policy. Data Sources and Data-Linking Strategies to Support Research to Address the Opioid Crisis. Washington, DC: Office of Health Policy; 2018. Available from: https://aspe.hhs.gov/system/files/pdf/259641/OpioidDataLinkage.pdf.
  11. Rough K, Huybrechts KF, Hernandez-Diaz S, Desai RJ, Patorno E, Bateman BT. Using prescription claims to detect aberrant behaviors with opioids: comparison and validation of 5 algorithms. Pharmacoepidemiol Drug Saf. 2019;28(1):62–9. Epub 2018/04/25. pmid:29687539; PubMed Central PMCID: PMC6200661.
  12. Canan C, Polinski JM, Alexander GC, Kowal MK, Brennan TA, Shrank WH. Automatable algorithms to identify nonmedical opioid use using electronic data: a systematic review. J Am Med Inform Assoc. 2017;24(6):1204–10. Epub 2017/10/11. pmid:29016967.
  13. Goyal H, Singla U, Grimsley EW. Identification of Opioid Abuse or Dependence: No Tool Is Perfect. Am J Med. 2017;130(3):e113. Epub 2017/02/22. pmid:28215952.
  14. Wood E, Simel DL, Klimas J. Pain Management With Opioids in 2019–2020. JAMA. 2019:1–3. Epub 2019/10/11. pmid:31600370.
  15. Amalakuhan B, Kiljanek L, Parvathaneni A, Hester M, Cheriyath P, Fischman D. A prediction model for COPD readmission: catching up, catching our breath, and improving a national problem. J Community Hosp Intern Med Perspect. 2012;2:9915–21.
  16. Chirikov VV, Shaya FT, Onukwugha E, Mullins CD, dosReis S, Howell CD. Tree-based Claims Algorithm for Measuring Pretreatment Quality of Care in Medicare Disabled Hepatitis C Patients. Med Care. 2015. pmid:26225448.
  17. Thottakkara P, Ozrazgat-Baslanti T, Hupf BB, Rashidi P, Pardalos P, Momcilovic P, et al. Application of Machine Learning Techniques to High-Dimensional Clinical Data to Forecast Postoperative Complications. PLoS One. 2016;11(5):e0155705. pmid:27232332; PubMed Central PMCID: PMC4883761.
  18. OverdoseFreePA: Death Data Overview 2019 [cited 2019 December 30]. Available from: https://www.overdosefreepa.pitt.edu/know-the-facts/death-data-overview/.
  19. Bossuyt PM, Reitsma JB, Bruns DE, Gatsonis CA, Glasziou PP, Irwig L, et al. STARD 2015: an updated list of essential items for reporting diagnostic accuracy studies. BMJ. 2015;351:h5527. Epub 2015/10/30. pmid:26511519; PubMed Central PMCID: PMC4623764.
  20. Moons KG, Altman DG, Reitsma JB, Ioannidis JP, Macaskill P, Steyerberg EW, et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD): explanation and elaboration. Ann Intern Med. 2015;162(1):W1–73. Epub 2015/01/07. pmid:25560730.
  21. Allegheny County Department of Human Services. Allegheny County Data Warehouse. [cited 2020 January 1]. Available from: https://www.alleghenycountyanalytics.us/index.php/dhs-data-warehouse/.
  22. Fernandes JC, Campana D, Harwell TS, Helgerson SD. High mortality rate of unintentional poisoning due to prescription opioids in adults enrolled in Medicaid compared to those not enrolled in Medicaid in Montana. Drug Alcohol Depend. 2015;153:346–9. pmid:26077605.
  23. Wachino V. Best Practices for Addressing Prescription Opioid Overdoses, Misuse and Addiction 2016 [cited 2016 July 19]. Available from: https://www.medicaid.gov/federal-policy-guidance/downloads/cib-02-02-16.pdf.
  24. Xing J, Mancuso D, Felver BEM. Overdose Deaths among Medicaid Enrollees in Washington State: Washington State Department of Social & Health Services; 2015 [cited 2016 July 19]. Available from: https://www.dshs.wa.gov/sites/default/files/SESA/rda/documents/research-4-92_0.pdf.
  25. Hacker K, Jones LD, Brink L, Wilson A, Cherna M, Dalton E, et al. Linking Opioid-Overdose Data to Human Services and Criminal Justice Data: Opportunities for Intervention. Public Health Rep. 2018;133(6):658–66. Epub 2018/10/10. pmid:30300555; PubMed Central PMCID: PMC6225885.
  26. Webster LR, Webster RM. Predicting aberrant behaviors in opioid-treated patients: preliminary validation of the Opioid Risk Tool. Pain Med. 2005;6(6):432–42. Epub 2005/12/13. pmid:16336480.
  27. Ives TJ, Chelminski PR, Hammett-Stabler CA, Malone RM, Perhac JS, Potisek NM, et al. Predictors of opioid misuse in patients with chronic pain: a prospective cohort study. BMC Health Serv Res. 2006;6:46. Epub 2006/04/06. pmid:16595013; PubMed Central PMCID: PMC1513222.
  28. Becker WC, Sullivan LE, Tetrault JM, Desai RA, Fiellin DA. Non-medical use, abuse and dependence on prescription opioids among U.S. adults: Psychiatric, medical and substance use correlates. Drug and Alcohol Dependence. 2008;94(1):38–47. pmid:18063321
  29. Hall AJ, Logan JE, Toblin RL, et al. Patterns of abuse among unintentional pharmaceutical overdose fatalities. JAMA. 2008;300(22):2613–20. pmid:19066381
  30. CDC. Overdose deaths involving prescription opioids among Medicaid enrollees—Washington, 2004–2007. MMWR. 2009;58(42):1171–5. pmid:19875978.
  31. White AG, Birnbaum HG, Schiller M, Tang J, Katz NP. Analytic models to identify patients at risk for prescription opioid abuse. Am J Manag Care. 2009;15(12):897–906. pmid:20001171.
  32. Dunn KM, Saunders KW, Rutter CM, Banta-Green CJ, Merrill JO, Sullivan MD, et al. Opioid prescriptions for chronic pain and overdose: a cohort study. Ann Intern Med. 2010;152(2):85–92. pmid:20083827; PubMed Central PMCID: PMC3000551.
  33. Edlund MJ, Martin BC, Devries A, Fan MY, Braden JB, Sullivan MD. Trends in use of opioids for chronic noncancer pain among individuals with mental health and substance use disorders: the TROUP study. The Clinical journal of pain. 2010;26(1):1–8. Epub 2009/12/23. pmid:20026946; PubMed Central PMCID: PMC2917238.
  34. Sullivan MD, Edlund MJ, Fan MY, Devries A, Brennan Braden J, Martin BC. Risks for possible and probable opioid misuse among recipients of chronic opioid therapy in commercial and Medicaid insurance plans: The TROUP Study. Pain. 2010;150(2):332–9. pmid:20554392; PubMed Central PMCID: PMC2897915.
  35. Bohnert AS, Valenstein M, Bair MJ, Ganoczy D, McCarthy JF, Ilgen MA, et al. Association between opioid prescribing patterns and opioid overdose-related deaths. JAMA. 2011;305(13):1315–21. Epub 2011/04/07. pmid:21467284.
  36. Volkow ND, McLellan TA, Cotto JH, Karithanom M, Weiss SR. Characteristics of opioid prescriptions in 2009. JAMA. 2011;305(13):1299–301. pmid:21467282; PubMed Central PMCID: PMC3187622.
  37. Webster LR, Cochella S, Dasgupta N, Fakata KL, Fine PG, Fishman SM, et al. An analysis of the root causes for opioid-related overdose deaths in the United States. Pain Medicine (Malden, Mass). 2011;12 Suppl 2:S26–S35. pmid:21668754.
  38. Cepeda MS, Fife D, Chow W, Mastrogiovanni G, Henderson SC. Assessing opioid shopping behaviour: a large cohort study from a medication dispensing database in the US. Drug Safety. 2012;35(4):325–34. pmid:22339505
  39. Peirce GL, Smith MJ, Abate MA, Halverson J. Doctor and pharmacy shopping for controlled substances. Medical Care. 2012;50(6):494–500. pmid:22410408.
  40. Rice JB, White AG, Birnbaum HG, Schiller M, Brown DA, Roland CL. A Model to Identify Patients at Risk for Prescription Opioid Abuse, Dependence, and Misuse. Pain Medicine. 2012;13(9):1162–73. pmid:22845054
  41. Baumblatt JA, Wiedeman C, Dunn JR, Schaffner W, Paulozzi LJ, Jones TF. High-Risk Use by Patients Prescribed Opioids for Pain and Its Role in Overdose Deaths. JAMA internal medicine. 2014. Epub 2014/03/05. pmid:24589873.
  42. Hylan TR, Von Korff M, Saunders K, Masters E, Palmer RE, Carrell D, et al. Automated prediction of risk for problem opioid use in a primary care setting. J Pain. 2015;16(4):380–7. Epub 2015/02/03. pmid:25640294.
  43. Zedler B, Xie L, Wang L, Joyce A, Vick C, Brigham J, et al. Development of a Risk Index for Serious Prescription Opioid-Induced Respiratory Depression or Overdose in Veterans’ Health Administration Patients. Pain Med. 2015;16(8):1566–79. Epub 2015/06/17. pmid:26077738; PubMed Central PMCID: PMC4744747.
  44. Cochran G, Gordon AJ, Lo-Ciganic WH, Gellad WF, Frazier W, Lobo C, et al. An Examination of Claims-based Predictors of Overdose from a Large Medicaid Program. Med Care. 2017;55(3):291–8. Epub 2016/12/17. pmid:27984346; PubMed Central PMCID: PMC5309160.
  45. Carey CM, Jena AB, Barnett ML. Patterns of Potential Opioid Misuse and Subsequent Adverse Outcomes in Medicare, 2008 to 2012. Ann Intern Med. 2018;168(12):837–45. Epub 2018/05/26. pmid:29800019.
  46. Glanz JM, Narwaney KJ, Mueller SR, Gardner EM, Calcaterra SL, Xu S, et al. Prediction Model for Two-Year Risk of Opioid Overdose Among Patients Prescribed Chronic Opioid Therapy. J Gen Intern Med. 2018. Epub 2018/01/31. pmid:29380216.
  47. Rose AJ, Bernson D, Chui KKH, Land T, Walley AY, LaRochelle MR, et al. Potentially Inappropriate Opioid Prescribing, Overdose, and Mortality in Massachusetts, 2011–2015. J Gen Intern Med. 2018;33(9):1512–9. Epub 2018/06/28. pmid:29948815; PubMed Central PMCID: PMC6109008.
  48. Zedler BK, Saunders WB, Joyce AR, Vick CC, Murrelle EL. Validation of a Screening Risk Index for Serious Prescription Opioid-Induced Respiratory Depression or Overdose in a US Commercial Health Plan Claims Database. Pain Med. 2018;19(1):68–78. Epub 2017/03/25. pmid:28340046; PubMed Central PMCID: PMC5939826.
  49. Hastie T, Tibshirani R, Friedman J. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. 2nd ed. New York, NY: Springer; 2008.
  50. Saito T, Rehmsmeier M. The precision-recall plot is more informative than the ROC plot when evaluating binary classifiers on imbalanced datasets. PLoS One. 2015;10(3):e0118432. Epub 2015/03/05. pmid:25738806; PubMed Central PMCID: PMC4349800.
  51. Romero-Brufau S, Huddleston JM, Escobar GJ, Liebow M. Why the C-statistic is not informative to evaluate early warning scores and what metrics to use. Crit Care. 2015;19:285. Epub 2015/08/14. pmid:26268570; PubMed Central PMCID: PMC4535737.
  52. Tufféry S. Data Mining and Statistics for Decision Making. 1st ed: John Wiley & Sons; 2011.
  53. Cotto JH, Davis E, Dowling GJ, Elcano JC, Staton AB, Weiss SR. Gender effects on drug use, abuse, and dependence: a special analysis of results from the National Survey on Drug Use and Health. Gend Med. 2010;7(5):402–13. pmid:21056867.
  54. SAMHSA. Results from the 2014 National Survey on Drug Use and Health: Mental Health Detailed Tables. Rockville, MD: Substance Abuse and Mental Health Services Administration, 2015.
  55. Liang Y, Goros MW, Turner BJ. Drug Overdose: Differing Risk Models for Women and Men among Opioid Users with Non-Cancer Pain. Pain Med. 2016. pmid:28025361.
  56. Yang Z, Wilsey B, Bohm M, Weyrich M, Roy K, Ritley D, et al. Defining risk of prescription opioid overdose: pharmacy shopping and overlapping prescriptions among long-term opioid users in medicaid. Journal of Pain. 2015;16(5):445–53. pmid:25681095.
  57. Oliva EM, Bowe T, Tavakoli S, Martins S, Lewis ET, Paik M, et al. Development and applications of the Veterans Health Administration’s Stratification Tool for Opioid Risk Mitigation (STORM) to improve opioid safety and prevent overdose and suicide. Psychol Serv. 2017;14(1):34–49. Epub 2017/01/31. pmid:28134555.
  58. Ferris LM, Saloner B, Krawczyk N, Schneider KE, Jarman MP, Jackson K, et al. Predicting Opioid Overdose Deaths Using Prescription Drug Monitoring Program Data. American Journal of Preventive Medicine. 2019;57(6):e211–e7. pmid:31753274
  59. Krawczyk N, Schneider KE, Eisenberg MD, Richards TM, Ferris L, Mojtabai R, et al. Opioid overdose death following criminal justice involvement: Linking statewide corrections and hospital databases to detect individuals at highest risk. Drug and Alcohol Dependence. 2020:107997. pmid:32534407
  60. Sorbero ME, Kranz AM, Bouskill KE, Ross R, Palimaru AI, Meyer A. U.S. Department of Health and Human Services Assistant Secretary for Planning and Evaluation (ASPE) Office of Health Policy Research Report: Addressing Social Determinants of Health Needs of Dually Enrolled Beneficiaries in Medicare Advantage Plans 2018 [cited 2020 February 5]. Available from: https://aspe.hhs.gov/system/files/pdf/259896/MAStudy_Phase2_RR2634-final.pdf.
  61. Saloner B, McGinty EE, Beletsky L, Bluthenthal R, Beyrer C, Botticelli M, et al. A Public Health Strategy for the Opioid Crisis. Public Health Reports. 2018;133(1_suppl):24S–34S. pmid:30426871
  62. Harle CA, Apathy NC, Cook RL, Danielson EC, DiIulio J, Downs SM, et al. Information Needs and Requirements for Decision Support in Primary Care: An Analysis of Chronic Pain Care. AMIA Annu Symp Proc. 2018;2018:527–34. Epub 2019/03/01. pmid:30815093; PubMed Central PMCID: PMC6371256.
  63. Stopka TJ, Amaravadi H, Kaplan AR, Hoh R, Bernson D, Chui KKH, et al. Opioid overdose deaths and potentially inappropriate opioid prescribing practices (PIP): A spatial epidemiological study. International Journal of Drug Policy. 2019;68:37–45. pmid:30981166
  64. Green CA, Perrin NA, Janoff SL, Campbell CI, Chilcoat HD, Coplan PM. Assessing the accuracy of opioid overdose and poisoning codes in diagnostic information from electronic health records, claims data, and death records. Pharmacoepidemiol Drug Saf. 2017;26(5):509–17. Epub 2017/01/12. pmid:28074520.
  65. Hedegaard H, Chen LH, Warner M. Drug-poisoning deaths involving heroin: United States, 2000–2013. NCHS Data Brief. 2015;(190):1–8. Epub 2015/05/02. pmid:25932890.
  66. Haffajee RL, Lin LA, Bohnert ASB, Goldstick JE. Characteristics of US Counties With High Opioid Overdose Mortality and Low Capacity to Deliver Medications for Opioid Use Disorder. JAMA Netw Open. 2019;2(6):e196373. Epub 2019/06/30. pmid:31251376; PubMed Central PMCID: PMC6604101.