A comparison of three methods in categorizing functional status to predict hospital readmission across post-acute care

  • Chih-Ying Li,

    Roles Conceptualization, Formal analysis, Methodology, Writing – original draft, Writing – review & editing

    chili@utmb.edu

    Affiliation Department of Occupational Therapy, University of Texas Medical Branch, Galveston, Texas, United States of America

  • Amol Karmarkar,

    Roles Conceptualization, Data curation, Methodology, Writing – review & editing

    Affiliation Division of Rehabilitation Sciences, University of Texas Medical Branch, Galveston, Texas, United States of America

  • Yong-Fang Kuo,

    Roles Methodology, Supervision, Writing – review & editing

    Affiliations Department of Preventive Medicine & Community Health, University of Texas Medical Branch, Galveston, Texas, United States of America, Sealy Center on Aging, University of Texas Medical Branch, Galveston, Texas, United States of America

  • Hemalkumar B. Mehta,

    Roles Conceptualization, Methodology, Writing – review & editing

    Affiliations Sealy Center on Aging, University of Texas Medical Branch, Galveston, Texas, United States of America, Department of Surgery, University of Texas Medical Branch, Galveston, Texas, United States of America

  • Trudy Mallinson,

    Roles Methodology, Writing – review & editing

    Affiliation Department of Clinical Research and Leadership, The George Washington University, Washington, DC, United States of America

  • Allen Haas,

    Roles Data curation, Methodology, Writing – review & editing

    Affiliation Department of Preventive Medicine & Community Health, University of Texas Medical Branch, Galveston, Texas, United States of America

  • Amit Kumar,

    Roles Conceptualization, Methodology, Writing – review & editing

    Affiliation Department of Physical Therapy, Northern Arizona University, Phoenix, Arizona, United States of America

  • Kenneth J. Ottenbacher

    Roles Funding acquisition, Methodology, Resources, Supervision, Writing – review & editing

    Affiliations Division of Rehabilitation Sciences, University of Texas Medical Branch, Galveston, Texas, United States of America, Sealy Center on Aging, University of Texas Medical Branch, Galveston, Texas, United States of America

Abstract

Background

Methods used to categorize functional status to predict health outcomes across post-acute care settings vary significantly.

Objectives

We compared three methods that categorize functional status to predict 30-day and 90-day hospital readmission across inpatient rehabilitation facilities (IRF), skilled nursing facilities (SNF) and home health agencies (HHA).

Research design

Retrospective analysis of 2013–2014 Medicare claims data (N = 740,530). Data were randomly split into two subsets using a 1:1 ratio. We used half of the cohort (development subset) to develop functional status categories for three methods, and then used the rest (testing subset) to compare outcome prediction. Three methods to generate functional categories were labeled as: Method I, percentile based on proportional distribution; Method II, percentile based on change score distribution; and Method III, functional staging categories based on Rasch person strata. We used six differentiation and classification statistics to determine the optimal method of generating functional categories.

Setting

IRF, SNF and HHA.

Subjects

We included 130,670 (17.7%) Medicare beneficiaries with stroke, 498,576 (67.3%) with lower extremity joint replacement and 111,284 (15.0%) with hip and femur fracture.

Measures

Unplanned 30-day and 90-day hospital readmission.

Results

For all impairment conditions, Method III best predicted 30-day and 90-day hospital readmission. However, we observed overlapping confidence intervals among some comparisons of three methods. The bootstrapping of 30-day and 90-day hospital readmission predictive models showed the area under curve for Method III was statistically significantly higher than both Method I and Method II (all paired-comparisons, p<.001), using the testing sample.

Conclusions

Overall, functional staging was the optimal method to generate functional status categories to predict 30-day and 90-day hospital readmission. To facilitate clinical and scientific use, we suggest the most appropriate method to categorize functional status should be based on the strengths and weaknesses of each method.

Introduction

In many disciplines of medicine, clinical staging refers to hierarchical categories along the continuum of the measured construct [1–3]. The concept of “clinical staging” is also applied in acute and post-acute prospective payment systems, for example, the skilled nursing facility (SNF) resource utilization groups, known as case-mix groups [4,5]. Individuals in the same SNF resource utilization group are expected to share common abilities, respond similarly to assessment items, and likely have analogous needs for resources or equivalent costs of care [4,5]. When applied to functional status, where it is known as “functional staging”, such categorization allows clinicians to plan care accurately and track prognosis, and enables researchers to define and refine case-mix adjustment groups. Functional staging can also be used to examine intervention effectiveness [6–10], and it enables meaningful categorical comparisons within and across groups of persons and settings.

While continuous scores may provide detailed clinical information for clinicians [11,12], categorizing scores facilitates policy discussion and decision-making. Additionally, continuous scoring relies on a summed score, and the same summed score could, in fact, represent different levels of performance [13]. The site-neutral unified payment model, proposed by the Medicare Payment Advisory Commission [14], recommends eliminating payment differences across settings for patients with similar case-mix demographics and severity of impairments. Generating categories based on functional status provides clinical evidence for unified payment models and other health reform measures. Investigators have demonstrated that adding functional status categories in risk-adjustment models (e.g., hospital readmission) reduces differences in population-level case-mix [15,16]. Adding functional status categories in predictive models can therefore improve the equality of resource allocation and care quality, and generate more accurate estimates of care costs [17].

Practitioners and researchers have used functional status categories to present hierarchical levels of patients’ function for decades [6–10]. However, methods used to categorize functional status to predict health outcomes are often arbitrary and vary significantly. To identify the optimal method to categorize functional status, we compared three approaches to developing functional status categories to predict hospital readmission. Method I is a conventional percentile approach: tertile, quartile or quintile based on the summed-score distribution. Method II combines change scores with the percentile method: tertile, quartile or quintile of the change score between admission and discharge. Method III is a functional staging method using person strata categories based on latent trait theory. This paper aims to identify which approach to categorizing functional status best predicts hospital readmission for Medicare beneficiaries. Hospital readmission was chosen as the main outcome in this study because it is an important national quality measure of patient care [4,18].

Materials and methods

Data source

The study included 100% Medicare claims data from 2013–2014. We used the following data files: Inpatient Rehabilitation Facility (IRF) and Inpatient Rehabilitation Facility-Patient Assessment Instrument (IRF-PAI) [19]; Skilled Nursing Facility (SNF) and Minimum Data Set (MDS 3.0) [20]; Home Health Agency (HHA) and Outcome and Assessment Information Set (OASIS-C) [21]; the Medicare Provider Analysis and Review and the Master Beneficiary Summary files.

Ethical assurances

This study was approved by the University Institutional Review Board (IRB # 16–0014). Additionally, a Data Use Agreement was established with the Centers for Medicare and Medicaid Services prior to all data analyses.

Cohort selection

We identified 2,953,006 eligible cases using a combination of Medical Severity Diagnosis Related Group codes and ICD-9 procedure codes for three impairment conditions: stroke (061–066), lower extremity joint replacement (469–470, 81.51 and 81.54) and hip/femur fractures (480–482). Using a combination of claims and assessment data, we included only those beneficiaries discharged from a hospital to one of the three post-acute care (PAC) settings: IRF, SNF and HHA. After applying exclusion criteria (S1 Table), the final analytical sample included 740,530 cases: 17.7% with stroke (n = 130,670), 67.3% with lower extremity joint replacement (n = 498,576), and 15.0% (n = 111,284) with hip and femur fracture (Table 1).

Table 1. Demographics and person-level characteristics (N = 740,530).

https://doi.org/10.1371/journal.pone.0232017.t001

To develop and validate the three proposed methods, we used a 1:1 ratio to randomly split the study cohort into a development subset (n = 370,265) and a testing subset (n = 370,265). The development subset was used to develop functional status categories for the three methods. The testing subset was used to compare outcome prediction across the three methods.
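The 1:1 random split described above can be sketched in a few lines. This is only an illustration under stated assumptions: the analyses in this study were run in SAS, and the function name `split_cohort`, the integer case identifiers and the seed value are hypothetical.

```python
import random

def split_cohort(case_ids, seed=2020):
    """Randomly split case identifiers into development and testing halves (1:1).

    The seed is a hypothetical value, used only to make the sketch reproducible.
    """
    shuffled = list(case_ids)
    random.Random(seed).shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Stand-in identifiers for the 740,530 cases in the analytical sample.
development, testing = split_cohort(range(740_530))
print(len(development), len(testing))  # 370265 370265
```

Shuffling once and cutting at the midpoint guarantees that the two subsets are disjoint and of equal size.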

We also conducted a sensitivity analysis to examine differences in demographics and person-level characteristics before and after excluding 23% of potential patients (step 12 vs. step 15 in S1 Table). The cohort in step 12 included patients who did not receive PAC. The cohort that included these 23% of patients (generated by step 12) had a shorter total SNF stay within 90 days at IRF compared to the cohort used in this study (generated by step 15). However, we did not find other variables that differed significantly between the step-12 cohort and our study cohort (S6 Table).

Study outcome

The primary outcome was unplanned all-cause 30-day and 90-day hospital readmission (yes/no) after index hospital discharge [18,22]. We chose the 30-day window to reflect the current reimbursement system. Additionally, we included a longer follow-up period (90-day) to be consistent with the episode-based payment initiatives [23,24].

Primary variable

The primary variable was functional status categories for two domains (Self-Care and Mobility) generated from three methods (details below). The Self-Care and Mobility domains were chosen because they are consistently measured across the PAC settings. Additionally, these two domains are potentially modifiable factors relevant to hospital readmission.

Functional status categories

Comparable items of the Self-Care and Mobility domains from each assessment were selected based on their conceptual meanings (e.g., eating items were selected from IRF-PAI, MDS and OASIS as the three items measure the same activity: eating). The number of selected items by assessment was 11 in IRF-PAI (6 Self-Care and 5 Mobility items), 11 in MDS (5 Self-Care and 6 Mobility) and 8 in OASIS (5 Self-Care and 3 Mobility) (S2 Table). We used co-calibration tables [25] to co-calibrate Self-Care and Mobility scores separately into a 0–100 scale, for the following three methods.

Method I: Percentile based on proportional distribution.

For each impairment condition, we created tertile, quartile and quintile categories based on the co-calibrated summed score distribution for each assessment. Self-Care and Mobility had the same number of categories. We first generated percentiles for each assessment, then used c-statistics to determine whether to choose tertile, quartile or quintile for each impairment condition at each setting. Based on the c-statistics, quartile was chosen for stroke and lower extremity joint replacement, and quintile was chosen for hip and femur fracture. S1 Fig demonstrates an example of using Method I to generate functional categories of IRF-PAI Self-Care in Stroke. The same procedure was repeated for MDS and OASIS across impairment conditions. Detailed categories are provided in S3 Table.
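As a rough illustration of the percentile step in Method I, the sketch below derives cut points from a score distribution and assigns each score to a category. The scores are made-up values on the 0–100 scale, not study data, and the function names and the tie-handling rule (a score equal to a cut point goes to the higher group) are assumptions of this sketch.

```python
from bisect import bisect_right
from statistics import quantiles

def percentile_cutoffs(scores, n_groups):
    """Cut points dividing `scores` into `n_groups` percentile groups
    (3 = tertile, 4 = quartile, 5 = quintile)."""
    return quantiles(scores, n=n_groups, method="inclusive")

def assign_category(score, cutoffs):
    """Return a 1-based category; a score equal to a cut point
    falls in the higher group (an assumption of this sketch)."""
    return bisect_right(cutoffs, score) + 1

# Hypothetical co-calibrated scores: every integer from 0 to 100.
scores = list(range(0, 101))
cutoffs = percentile_cutoffs(scores, 4)
print(cutoffs)                       # [25.0, 50.0, 75.0]
print(assign_category(10, cutoffs))  # 1
print(assign_category(80, cutoffs))  # 4
```

Running the same function with `n_groups` set to 3 or 5 gives the tertile and quintile alternatives, which could then be compared by c-statistic as described above.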

Method II: Change score with percentile distribution.

We first calculated the change score between admission and discharge for each assessment (Self-Care and Mobility were calculated separately). Secondly, we calculated percentile (tertile, quartile and quintile) based on the change score distribution. Lastly, to increase clinical meaningfulness when interpreting negative, zero and positive change scores, we combined the percentile change score distribution with the following operational definitions: tertiles (small, medium and large change), quartiles (negative and zero change, small positive change, medium positive change and large positive change) and quintiles (negative change, zero change, small positive change, medium positive change and large positive change).

As with Method I, Self-Care and Mobility of each assessment had the same number of categories due to the nature of the percentile method. Using c-statistics, quartile was selected for stroke and lower extremity joint replacement; quintile was selected for hip and femur fracture. The quintile proportion was found inapplicable for stroke and lower extremity joint replacement because the same functional score fell in more than one category. S2 Fig demonstrates an example of using Method II to generate functional categories of IRF-PAI Self-Care in Stroke. The same procedure was repeated for MDS and OASIS across impairment conditions. Detailed categories are provided in S4 Table.
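One possible reading of the quartile version of Method II is sketched below: negative and zero change scores form one category, and positive change scores are split into three percentile groups (small, medium and large). The exact rule used to combine the percentiles with the operational definitions is not spelled out in the text, so this grouping rule, the function name and the example scores are all assumptions.

```python
from bisect import bisect_right
from statistics import quantiles

def change_quartile_labels(admission, discharge):
    """Label each patient's change score (discharge minus admission).

    Assumed rule: non-positive changes form one category and positive
    changes are split into tertiles. Needs at least two positive changes.
    """
    changes = [d - a for a, d in zip(admission, discharge)]
    positive = [c for c in changes if c > 0]
    cutoffs = quantiles(positive, n=3, method="inclusive")
    names = ["small positive", "medium positive", "large positive"]
    labels = []
    for change in changes:
        if change <= 0:
            labels.append("negative or zero")
        else:
            labels.append(names[bisect_right(cutoffs, change)])
    return labels

# Hypothetical admission and discharge scores on the 0-100 scale.
admission = [40, 50, 30, 20, 60, 45, 35]
discharge = [35, 50, 40, 50, 75, 90, 41]
labels = change_quartile_labels(admission, discharge)
print(labels)
```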

Method III: Functional staging.

Fig 1 provides the detailed procedures demonstrating how we generated functional staging categories for IRF-PAI Self-Care in Stroke. We generated a person separation index (Gp) and calculated person strata, to statistically distinguish different ability levels, using the Rasch person strata formula (4*Gp+1)/3 [26–31]. We followed this existing formula to calculate the number of person strata for each assessment by impairment condition [26–36]. Person strata are a norm-referenced concept based on the distribution of person measures, centered on the mean of the person distribution. Each stratum must be separated from the next by at least three measurement errors to be statistically distinct [26–31]. We then identified the corresponding cutoff raw score from the 0–100 scale co-calibration table [25].
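The person strata formula is simple enough to check by hand. The sketch below applies (4*Gp+1)/3 to two made-up separation index values. The function name is hypothetical, and how a fractional result is turned into a whole number of categories is not specified here, so the sketch leaves the result unrounded.

```python
def person_strata(separation_index):
    """Rasch person strata from the person separation index Gp: (4*Gp + 1) / 3."""
    return (4 * separation_index + 1) / 3

# Hypothetical separation indices, not values from this study.
print(person_strata(2.0))   # 3.0
print(person_strata(2.75))  # 4.0
```

Under this formula, a separation index of 2.0 corresponds to three statistically distinct strata and 2.75 to four.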

Fig 1. Method III: Use functional staging approach to generate functional score categories (Example of IRF-PAI Self-Care in Stroke).

https://doi.org/10.1371/journal.pone.0232017.g001

Using the development subset, for stroke, we generated four categories for Self-Care and three categories for Mobility for all three instruments. For lower extremity joint replacement, we generated three Self-Care and two Mobility categories for IRF-PAI and OASIS; and three Self-Care and three Mobility categories for MDS. For hip and femur fracture, we generated three Self-Care and two Mobility categories for IRF-PAI; two Self-Care and three Mobility categories for MDS; and four Self-Care and three Mobility categories for OASIS (S5 Table).

Model comparisons

Six indices were used to compare the outcome prediction of the three methods:

C-statistics/Area under the Curve (AUC).

The c-statistic measures the discrimination ability of the model. We compared logistic model discrimination using c-statistics with asymptotic 95% confidence intervals. The c-statistic is also known as the AUC, the area under the receiver operating characteristic curve. The AUC is the most commonly used measure of model performance for binary outcomes, with higher values indicating better model performance [37–42].
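For a binary outcome, the c-statistic can be read as the proportion of event/non-event pairs in which the patient with the event received the higher predicted probability, with ties counted as half. The following minimal sketch computes it that way. The function name and the example probabilities are illustrative assumptions, not study output.

```python
def c_statistic(probs, outcomes):
    """Concordance (AUC) for a binary outcome: share of event/non-event
    pairs ranked correctly, counting tied predictions as one half."""
    events = [p for p, y in zip(probs, outcomes) if y == 1]
    nonevents = [p for p, y in zip(probs, outcomes) if y == 0]
    score = sum(
        1.0 if e > n else 0.5 if e == n else 0.0
        for e in events
        for n in nonevents
    )
    return score / (len(events) * len(nonevents))

# Hypothetical predicted readmission probabilities and observed outcomes.
probs = [0.9, 0.4, 0.35, 0.8, 0.2]
outcomes = [1, 0, 1, 1, 0]
auc = c_statistic(probs, outcomes)
print(round(auc, 4))  # 0.8333
```

Here five of the six event/non-event pairs are ranked correctly, giving 5/6.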

Somers’ Delta (Somers’ D).

Somers’ D is a nonparametric measure of the strength and direction of the association between an ordinal dependent variable and an ordinal independent variable. Somers’ D is based on the assumption of a monotonic relationship between the independent and the outcome variables. Higher Somers’ D indicates better model fit [43].
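For a binary outcome, this measure and the c-statistic carry the same information. Writing n_c for the number of concordant event/non-event pairs, n_d for the number of discordant pairs and t for the total number of such pairs (these symbols are introduced here for illustration and do not appear elsewhere in the paper), the usual definitions give:

```latex
D = \frac{n_c - n_d}{t}, \qquad
c = \frac{n_c + \tfrac{1}{2}\,(t - n_c - n_d)}{t}
\quad\Longrightarrow\quad D = 2c - 1
```

Under this relation, a c-statistic of 0.8340 would correspond to a D of about 0.668, so the two indices necessarily rank competing models in the same order.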

Akaike information criterion (AIC)/Bayesian information criterion (BIC).

Both AIC and BIC [44] evaluate goodness-of-fit (model fit) using the log-likelihood function and penalize an excessive number of estimated parameters. AIC/BIC provide a standard for balancing model fit against model parsimony, guarding against overfitting [45,46]. Lower AIC/BIC values indicate better model fit [44,45].
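The standard definitions make the penalty explicit: AIC adds 2 per estimated parameter, while BIC adds the natural log of the sample size per parameter. A small sketch follows, with made-up inputs. The function names and example values are assumptions and are not taken from Table 2.

```python
import math

def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood

def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k ln(n) - 2 ln(L)."""
    return n_params * math.log(n_obs) - 2 * log_likelihood

# Hypothetical model: log-likelihood -100, 5 parameters, 1,000 observations.
print(aic(-100.0, 5))                  # 210.0
print(round(bic(-100.0, 5, 1000), 2))  # 234.54
```

Because ln(n) exceeds 2 once n is larger than about 7, BIC penalizes additional categories more heavily than AIC in a sample of this size.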

Integrated Discrimination Improvement (IDI).

The IDI indicates the difference in discrimination slopes between two models. The IDI measures whether the new model improves the average sensitivity without sacrificing its average specificity [47]. Higher (positive) values of IDI indicate that the new model is better than the reference model.
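The discrimination slope of a model is the mean predicted probability among patients with the event minus the mean among those without, and the IDI is the difference in that slope between two models. A minimal sketch under those definitions is given below. The function names and the four-patient example are hypothetical.

```python
from statistics import mean

def discrimination_slope(probs, outcomes):
    """Mean predicted probability for events minus that for non-events."""
    events = [p for p, y in zip(probs, outcomes) if y == 1]
    nonevents = [p for p, y in zip(probs, outcomes) if y == 0]
    return mean(events) - mean(nonevents)

def idi(ref_probs, new_probs, outcomes):
    """Integrated discrimination improvement of the new model over the reference."""
    return (discrimination_slope(new_probs, outcomes)
            - discrimination_slope(ref_probs, outcomes))

# Hypothetical predictions from a reference model and a new model.
outcomes = [1, 1, 0, 0]
ref_probs = [0.6, 0.4, 0.3, 0.5]
new_probs = [0.7, 0.5, 0.2, 0.4]
improvement = idi(ref_probs, new_probs, outcomes)
print(round(improvement, 3))  # 0.2
```

In this toy example the slope rises from 0.1 to 0.3, so the IDI is 0.2 in favor of the new model.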

Net Reclassification Improvement (NRI).

The NRI is a reclassification measure using reclassification tables constructed separately for respondents with and without events (i.e., outcome occurs or not) between two models [48]. Higher (positive) values of NRI (percent) indicate reclassification by the new model had higher sensitivity compared to the reference model.
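A category-based NRI can be sketched as follows: among patients with the event, count net upward moves in risk category under the new model, and among patients without the event, count net downward moves. The risk thresholds (10% and 20%), the function name and the example predictions are hypothetical. The text does not state which risk categories were used.

```python
from bisect import bisect_right

def nri(ref_probs, new_probs, outcomes, thresholds):
    """Category-based net reclassification improvement.

    `thresholds` are ascending risk cut points that define the categories.
    """
    def net_upward(rows):
        up = sum(1 for r, n in rows
                 if bisect_right(thresholds, n) > bisect_right(thresholds, r))
        down = sum(1 for r, n in rows
                   if bisect_right(thresholds, n) < bisect_right(thresholds, r))
        return (up - down) / len(rows)

    rows = list(zip(ref_probs, new_probs, outcomes))
    events = [(r, n) for r, n, y in rows if y == 1]
    nonevents = [(r, n) for r, n, y in rows if y == 0]
    # Upward moves help for events; downward moves help for non-events.
    return net_upward(events) - net_upward(nonevents)

# Hypothetical thresholds and predictions.
thresholds = [0.10, 0.20]
outcomes = [1, 1, 0, 0]
ref_probs = [0.15, 0.25, 0.15, 0.05]
new_probs = [0.25, 0.25, 0.05, 0.05]
result = nri(ref_probs, new_probs, outcomes, thresholds)
print(result)  # 1.0
```

In the example, one of two events moves up a category and one of two non-events moves down, so each group contributes 0.5.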

Statistical analyses

We stratified all analyses by impairment condition for both development and testing subsets. First, we constructed a baseline logistic regression model which included sociodemographic variables (age, sex, race/ethnicity, disability entitlement and Medicare-Medicaid dual eligibility), health status (Hierarchical Condition Category composite score, Elixhauser comorbidity categories, condition-specific severity, hospital length of stay, intensive care days and coronary care days) and post-acute length of stay. Then, we added the functional status categories from each of the three methods to the baseline logistic regression model. We used the baseline model to (a) ensure a fair comparison of the functional status categories from the three methods, and (b) examine the magnitude of change in outcome prediction from adding functional status variables. The predictive models with three methods of generating functional status categories were examined by AUC, Somers’ D, AIC, BIC, IDI and NRI using the testing sample. To validate the stability of the estimates, a bootstrap procedure with 1000 re-samples was used to statistically compare c-statistics of the three methods using the testing sample. Bootstrapped c-statistics are a standard approach to model comparison. If a significant difference existed among methods, the three methods were then compared pairwise using paired t-tests. We used SAS version 9.4 (SAS Institute, Inc., Cary, NC) to perform all analyses.
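The bootstrap comparison can be illustrated with a small simulation. The sketch below resamples patients with replacement, computes the c-statistic for two sets of predictions on each resample, and summarizes the paired differences with a t statistic. It is an assumption-laden illustration, not the SAS procedure used in this study: the data are simulated, the function names and seeds are hypothetical, and treating the bootstrap replicates as the paired observations is one reading of the procedure described above.

```python
import math
import random
from statistics import mean, stdev

def c_statistic(probs, outcomes):
    """Concordance (AUC) for a binary outcome, ties counted as one half."""
    events = [p for p, y in zip(probs, outcomes) if y == 1]
    nonevents = [p for p, y in zip(probs, outcomes) if y == 0]
    score = sum(1.0 if e > n else 0.5 if e == n else 0.0
                for e in events for n in nonevents)
    return score / (len(events) * len(nonevents))

def bootstrap_auc_differences(probs_a, probs_b, outcomes,
                              n_resamples=1000, seed=0):
    """AUC(A) minus AUC(B) on each bootstrap resample of the same patients."""
    rng = random.Random(seed)
    n = len(outcomes)
    diffs = []
    while len(diffs) < n_resamples:
        idx = [rng.randrange(n) for _ in range(n)]
        y = [outcomes[i] for i in idx]
        if len(set(y)) < 2:
            continue  # AUC is undefined without both events and non-events
        auc_a = c_statistic([probs_a[i] for i in idx], y)
        auc_b = c_statistic([probs_b[i] for i in idx], y)
        diffs.append(auc_a - auc_b)
    return diffs

def paired_t(diffs):
    """Paired t statistic for the mean of the differences."""
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))

# Simulated data: predictions A track the outcome, predictions B are noise.
rng = random.Random(1)
outcomes = [rng.randint(0, 1) for _ in range(60)]
probs_a = [0.3 + 0.4 * y + rng.uniform(-0.3, 0.3) for y in outcomes]
probs_b = [rng.random() for _ in outcomes]

diffs = bootstrap_auc_differences(probs_a, probs_b, outcomes)
print(len(diffs), mean(diffs), paired_t(diffs))
```

Resampling the same patients for both sets of predictions keeps the comparison paired, which is what allows the differences to be summarized with a paired test.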

Results

Demographics

The largest share of patients was discharged to SNF (n = 325,708; 44.0%), followed by HHA (n = 277,295; 37.4%) and IRF (n = 137,527; 18.6%) (Table 1). The mean ages were 79.0 (7.6), 79.5 (7.7) and 74.3 (6.2) at IRF, SNF and HHA, respectively. The majority were female (63.9%, 72.5% and 58.8% at IRF, SNF and HHA, respectively) and non-Hispanic White (84.6%, 88.5% and 89.4% at IRF, SNF and HHA, respectively). The most common impairment conditions across PAC settings were ischemic stroke at IRF (n = 50,549; 36.8%) and knee replacement at both SNF (n = 113,345; 34.8%) and HHA (n = 154,205; 55.6%). Patients at IRF had slightly more comorbidities, 3.4 (1.9), compared to SNF, 3.0 (1.9), and HHA, 2.2 (1.6). Patients discharged to IRF had the highest rates of 30-day (10.9%) and 90-day hospital readmission (19.1%), compared to SNF (9.5%, 17.9%) and HHA (3.5%, 8.0%) (Table 1).

Model comparisons of three methods

Table 2 reports performance metrics of the different methods in predicting 30-day and 90-day readmission. For all three impairment conditions, c-statistics and Somers’ D were the highest and AIC/BIC were the lowest for Method III in predicting 30-day and 90-day readmission. For example, among patients with stroke, Method III had the highest c-statistic (0.8340) compared to Methods I (0.8319) and II (0.8271) and the lowest AIC (Method III: 38442, Method II: 28958, Method I: 28615) and BIC (Method III: 38642, Method II: 39167, Method I: 38824) for 30-day hospital readmission. All three impairment conditions showed the same result: Method III better predicted both 30-day and 90-day hospital readmission compared to Methods I and II (Table 2). Method III also had positive IDI and NRI values (better slope discrimination) compared to Method I and Method II (Table 2). However, the confidence intervals of Method III (both 30- and 90-day) overlapped with those of Method I and Method II for stroke, lower extremity joint replacement and hip/femur fracture.

Table 2. Comparisons of outcome predictions with three functional category methods (N = 370,265).

https://doi.org/10.1371/journal.pone.0232017.t002

Bootstrapping

In both the 30-day and 90-day hospital readmission models, bootstrapping with the testing sample showed that the AUC for Method III was higher than that of both Method I and Method II for the three impairment conditions (all paired comparisons, p<.001).

Clinical application

We provide functional status categories generated from Methods I–III (S3–S5 Tables). We also provide the estimated risk of 30-day and 90-day hospital readmission using the self-care and mobility combinations based on Method III functional staging categories (Fig 2). For example, among patients with stroke who had a self-care score of 6–11, those with a mobility score of 5–15 have a 22.8% probability of 30-day readmission and a 33.4% probability of 90-day readmission (Fig 2).

Fig 2. Using Method III to estimate 30-day and 90-day hospital readmission rates.

https://doi.org/10.1371/journal.pone.0232017.g002

Discussion

Generating meaningful categories allows for functional status comparisons and optimal outcome prediction across post-acute settings. This study compared three functional category methods and found that the functional staging approach (Method III) generated the relatively optimal prediction for 30-day and 90-day hospital readmission. While the study findings imply that using the functional staging approach can be relatively optimal for outcome prediction, it is unclear whether this improvement can also produce superior clinically meaningful levels. To facilitate clinical and scientific use, we suggest the most appropriate method to categorize functional status should be based on the strengths and weaknesses of each approach. For example, Method I may have the advantage of convenience (quick to calculate), Method II may have the advantage when reporting functional change and Method III may have the advantage in outcome prediction (i.e., hospital readmission). The choice of method requires careful judgment and a balance among available resources, time demands and study purpose. This study provides preliminary data to guide future healthcare policy reforms (e.g., bundled payment) when classifying patients’ self-care and mobility function. We also generated tables of functional categories based on the three methods and plots of function-based readmission risks using functional staging for clinicians and researchers to use.

Policymakers are beginning to explore the impact of functional status on classification systems in post-acute risk-adjusted capitation payments [15,49,50]. Researchers and the Medicare Payment Advisory Commission reported that adding functional status improved prediction of resource use and cost of care [15,16,49–51]. Categorizing patients into clusters would be clinically and administratively useful (e.g., patients in the same cluster may experience comparable care costs or require similar resources). By its nature, functional staging is hierarchical and thus may provide gradients of functional recovery (or loss) that can help case-mix adjustment in service use and outcome comparisons, aiding in care provision, resource allocation decisions and eventually quality of care evaluation.

We acknowledge that patients with varying clinical characteristics and disease severity may benefit differently from various levels of care provided at different types of post-acute settings. However, recent healthcare reform proposals emphasize the need for a unified prospective payment system for post-acute settings [14]. Thus, comparisons of effectiveness and efficiency of care for patients with similar case-mix demographics across post-acute settings are imminent and inevitable. Identifying standardized and consistent approaches to measure functional status across post-acute settings could inform future policy decisions and improve quality of patient care after hospitalization. Based on the Improving Medicare Post-Acute Care Transformation Act of 2014, Centers for Medicare and Medicaid Services Section GG data elements were implemented to collect unified functional data across PAC settings [52,53]. While Section GG data elements could potentially resolve functional assessment issues related to uniformity across PAC settings, using a standardized functional categorization method based on co-calibrated functional scores provides firsthand comparisons of functional status across PAC settings. This study serves as a basis for Section GG data elements to develop hierarchical functional categorizations across settings in the future.

The study findings also indicate that generating more categories is not associated with better outcome prediction. Our results support the notion that the number of functional status categories varies by impairment condition, and that using distinct functional levels may be more appropriate than arbitrary percentile cutoff criteria, where a predefined fixed number of distribution-based categories dictates the categorization. Functional staging considers hierarchical functional levels; thus, this empirical approach can classify patients into distinct functional levels.

Current evidence regarding the advantages and limitations of different functional category methods remains unclear and largely unexplored. In the emerging environment of value-based care and precision medicine, it is reasonable to ask: are percentile proportional distribution and change score too insensitive to provide accurate functional categories necessary to assess and predict quality outcomes? If the answer is yes, then what are the appropriate approaches? Our study and findings address this question and provide a potential solution for improving rigor in comparative effectiveness studies across post-acute settings.

Ongoing demonstration projects of uniform functional assessment, episode-based payment models, and a unified payment system across post-acute settings signify the growing need to conduct rigorous post-acute health services and health policy research. This study is the first we are aware of to examine the impact of different categorization methods of functional status on the prediction of a quality measure. Future studies should examine whether different categorization methods of functional status are associated with different provision of care services. It is also important to explore other variables in addition to functional status to optimize outcome prediction accuracy for individual patients. In addition, future studies should validate whether our findings can be applied to other quality outcomes, such as successful community discharge for Medicare beneficiaries.

Study limitations

This study has limitations related to using Medicare files [54]. For example, our findings may not be applicable to persons < 66 years old or those enrolled in insurance plans other than Fee-For-Service. In addition, this study focused on the physical aspects of functional status, while cognitive function is also an essential element of functional performance. We suggest future studies of this kind include cognitive function items. We are aware of the importance of stability of functional staging for both clinical application and policy decision-making, and recognize that co-calibration methodologies may introduce conversion measurement errors. We are also aware that categorization may introduce discontinuity at the boundaries of cut-off scores, and thus limit statistical power and precision and obscure the ‘functionality’ of individual differences. Future studies also need to identify whether the improvement from the functional staging approach is clinically meaningful compared to alternative methods. We also suggest future studies investigate whether different clinically meaningful change levels can and/or should be included within each category, or whether items should be weighted to enhance accuracy for both clinical utility and policy decision-making.

Conclusions

Current measures and methods examining functional status across post-acute settings vary significantly. To compare effectiveness and quality of care across post-acute settings, identifying an optimal functional category method is imperative. While our study found that the functional staging approach generated functional categories that explained the largest variance in both 30-day and 90-day hospital readmission prediction, we are uncertain whether the functional staging approach can provide clinically meaningful improvement compared to alternative methods. We suggest clinicians, researchers and policy makers exercise their best judgment to balance the strengths and weaknesses of each method when categorizing functional status. Additional research is needed to better understand the advantages and limitations of using functional staging categories to assess and predict other important national quality measures across post-acute settings.

Supporting information

S2 Table. Selected functional items in IRF-PAI, MDS and OASIS (Self-Care, Mobility).

https://doi.org/10.1371/journal.pone.0232017.s002

(DOCX)

S3 Table. Method I (Percentile Admission Score): Raw scores for IRF-PAI, MDS and OASIS (Self-Care & Mobility) in stroke, lower extremity joint replacement and hip/femur fracture (Quartile).

https://doi.org/10.1371/journal.pone.0232017.s003

(DOCX)

S4 Table. Method II (Percentile Change Score): Raw scores for IRF-PAI, MDS and OASIS (Self-Care & Mobility) in stroke, lower extremity joint replacement and hip/femur fracture (0–100 Co-calibrated Score).

https://doi.org/10.1371/journal.pone.0232017.s004

(DOCX)

S5 Table. Method III (Functional Staging based on Rasch Model): Raw scores for IRF-PAI, MDS and OASIS (Self-Care & Mobility) in stroke, lower extremity joint replacement and hip/femur fracture.

https://doi.org/10.1371/journal.pone.0232017.s005

(DOCX)

S6 Table. Demographics and person-level characteristics of inclusion and exclusion samples.

https://doi.org/10.1371/journal.pone.0232017.s006

(DOCX)

S1 Fig. Method I: Use percentile admission score to generate functional score categories (Example of IRF-PAI Self-Care in Stroke).

https://doi.org/10.1371/journal.pone.0232017.s007

(DOCX)

S2 Fig. Method II: Use percentile change score to generate functional score categories (Example of IRF-PAI Self-Care in Stroke).

https://doi.org/10.1371/journal.pone.0232017.s008

(DOCX)

Acknowledgments

The authors would like to thank Sarah Toombs Smith, PhD, a board-certified Editor in the Life Sciences (bels.org), at the Sealy Center on Aging, University of Texas Medical Branch, for her assistance in reviewing and editing the manuscript prior to our submission.

References

  1. Narayan K, Lin MY. Staging for cervix cancer: Role of radiology, surgery and clinical assessment. Best Pract Res Clin Obstet Gynaecol. 2015; 29(6): 833–44. pmid:25898789
  2. Scott J, Leboyer M, Hickie I, et al. Clinical staging in psychiatry: a cross-cutting model of diagnosis with heuristic and practical value. Br J Psychiatry. 2013; 202: 243–5. pmid:23549937
  3. Edge SB. American Joint Committee on Cancer: AJCC Cancer Staging Manual. 7th ed. New York: Springer, 2009.
  4. Medicare Payment Advisory Commission. Skilled nursing facility services payment system. http://medpac.gov/docs/default-source/payment-basics/medpac_payment_basics_16_snf_final.pdf?sfvrsn=0. Revised October 2016. Accessed January 12, 2018.
  5. Centers for Medicare and Medicaid Service: Resident Assessment Instrument (RAI) Manual. Chapter 6: Medicare Skilled Nursing Facility Prospective Payment System (SNF PPS). https://www.ahcancal.org/facility_operations/Documents/UpdatedFilesRAI3.0/MDS%203.0%20Chapter%206%20V1.02%20July%202010.pdf Published July 2017. Accessed June 16, 2018.
  6. Jette AM, Tao W, Norweg A, Haley S. Interpreting rehabilitation outcome measurements. J Rehabil Med. 2007 Oct;39(8):585–90. pmid:17896048
  7. Wang YC, Hart DL, Stratford PW, et al. Clinical interpretation of a lower-extremity functional scale-derived computerized adaptive test. Phys Ther 2009; 89:957–68. pmid:19628577
  8. Jette DU, Warren RL, Wirtalla C. Validity of functional independence staging in patients receiving rehabilitation in skilled nursing facilities. Arch Phys Med Rehabil. 2005; 86(6):1095–101. pmid:15954046
  9. Tao W, Haley SM, Coster WJ, et al. An exploratory analysis of functional staging using an item response theory approach. Arch Phys Med Rehabil 2008; 89:1046–53. pmid:18503798
  10. Woodbury ML, Velozo CA, Richards LG, et al. Rasch analysis staging methodology to classify upper extremity movement impairment after stroke. Arch Phys Med Rehabil. 2013; 94(8):1527–33. pmid:23529144
  11. Altman DG. Categorizing Continuous Variables. Encyclopedia of Biostatistics. 2005; https://doi.org/10.1002/0470011815.b2a10012.
  12. Naggara O, Raymond J, Guilbert F, et al. Analysis by categorizing or dichotomizing continuous variables is inadvisable: an example from the natural history of unruptured aneurysms. AJNR Am J Neuroradiol. 2011 Mar; 32(3):437–40. pmid:21330400
  13. Fisher SR, Middleton A, Graham JE, et al. Same but different: FIM summary scores may mask variability in physical functioning profiles. Arch Phys Med Rehabil. 2018 Aug;99(8):1479–1482.e1. pmid:29428342
  14. Medicare Payment Advisory Commission. Chapter 6: Mandated report: Site-neutral payments for select conditions treated in inpatient rehabilitation facilities and skilled nursing facilities. http://www.medpac.gov/docs/default-source/reports/chapter-6-site-neutral-payments-for-select-conditions-treated-in-inpatient-rehabilitation-facilities.pdf?sfvrsn=0 Published June 2014. Accessed November 16, 2016.
  15. Fuller RL, Hughes JS, Goldfield NI. Adjusting population risk for functional health status. Popul Health Manag. 2016 Apr;19(2):136–44. pmid:26348621
  16. Pope GC, Kautter J, Ellis RP, et al. Risk adjustment of Medicare capitation payments using the CMS-HCC model. Health Care Financ Rev. 2004;25(4):119–141. pmid:15493448
  17. Kumar A, Karmarkar AM, Graham JE, et al. Comorbidity indices versus function as potential predictors of 30-day readmission in older patients following postacute rehabilitation. J Gerontol A Biol Sci Med Sci. 2017 Feb;72(2):223–228. pmid:27492451
  18. Simoes J., Grady J., DeBuhr J., et al. All-Cause Hospital-Wide Measure updates and specification report: hospital-level 30-day risk-standardized readmission measure–version 6.0. Prepared for the Centers for Medicare and Medicaid Services. New Haven, CT: Yale New Haven Health Services Corporation/Center for Outcomes Research & Evaluation, 2017.
  19. Medicare Payment Advisory Commission. Chapter 10: Inpatient Rehabilitation Facility Services: Assessing Payment Adequacy and Updating Payments. http://www.medpac.gov/docs/default-source/reports/chapter-10-inpatient-rehabilitation-facility-services-march-2015-report-.pdf?sfvrsn=0. Published March 2015. Accessed March 11, 2017.
  20. Medicare Payment Advisory Commission. Chapter 8: Skilled Nursing Facility Services: Assessing Payment Adequacy and Updating Payments http://www.medpac.gov/docs/default-source/reports/mar18_medpac_ch8_sec.pdf?sfvrsn=0. Published March 2018. Accessed June 13, 2018.
  21. Centers for Medicare and Medicaid. Outcome and Assessment Information Set (OASIS). OASIS Field Test Summary Report https://www.cms.gov/Medicare/Quality-Initiatives-Patient-Assessment-Instruments/HomeHealthQualityInits/Downloads/OASIS-Field-Test-Summary-Report_02-2018.pdf Published February 2018. Accessed June 13, 2018.
  22. Horwitz L., Partovian C., Lin Z., et al. Hospital-Wide All-Cause Risk-Standardized Readmission Measure: Measure Methodology Report. Prepared for the Centers for Medicare and Medicaid Services. New Haven, CT: Yale New Haven Health Services Corporation/Center for Outcomes Research & Evaluation, 2011.
  23. Centers for Medicare and Medicaid Services. Bundled Payments for Care Improvement (BPCI) Initiative: General Information. https://innovation.cms.gov/initiatives/bundled-payments/. Last updated and accessed September 2018.
  24. Leland NE, Gozalo P, Christian TJ, et al. An examination of the first 30 days after patients are discharged to the community from hip fracture postacute care. Med Care. 2015; 53(10):879–87. pmid:26340664
  25. Mallinson T, Deutsch A, Heinemann, A, et al. Comparing Function Across Post-Acute Rehabilitation Settings after Co-calibration of Self-Care and Mobility Items. ACRM-ASNR Annual Conference, October 13, 2012; Vancouver, Canada.
  26. Andrich D. (1982). An index of person separation in latent trait theory, the traditional KR20 index, and the Guttman scale response pattern. Educational Psychology Research, 9, 95–104.
  27. Wright BD. Number of person or item strata: (4*separation+1)/3. Rasch Meas Trans 2002; 16:888.
  28. Wright BD, Masters GN. Rating scale analysis. Rasch measurement. Chicago: MESA Pr; 1982.
  29. Fisher WJ. Reliability, separation, strata statistics. Rasch Meas Trans. 1992;6:328.
  30. Wright B. Separation, reliability and skewed distributions: Statistically different levels of performance. Rasch Meas Trans. 2001;14, 786.
  31. Smith EV Jr (2001): Evidence for the reliability of measures and validity of measure interpretation: a Rasch measurement perspective. J Appl Meas 2: 281–311. pmid:12011511
  32. Velozo CA, Woodbury ML. Translating measurement findings into rehabilitation practice: an example using Fugl-Meyer Assessment-Upper Extremity with patients following stroke. J Rehabil Res Dev. 2011;48(10):1211–22. pmid:22234665
  33. Souza MAP, Coster WJ, Mancini MC, et al. Rasch analysis of the participation scale (P-scale): usefulness of the P-scale to a rehabilitation services network. BMC Public Health. 2017, 17(1):934. pmid:29216914
  34. Anselmi P, Vidotto G, Bettinardi O, et al. Measurement of change in health status with Rasch models. Health Qual Life Outcomes. 2015, 13:16. pmid:25889854
  35. Li CY, Romero R, Bonilha H, et al. Linking existing instruments to develop an activity of daily living item bank. Eval Health Prof. 2016, 1–19. pmid:27856680
  36. Moeini S, Rasmussen JV, Klausen TW, et al. Rasch analysis of the Western Ontario Osteoarthritis of the Shoulder index—the Danish version. Patient Relat Outcome Meas. 2016; 7: 173–181. Published online 2016, 14. pmid:27881929
  37. D’Agostino RB, Griffith JL, Schmidt CH, et al. Measures for evaluating model performance. Proceedings of the Biometrics Section. American Statistical Association, Biometrics Section: Alexandria, VA. 1997; 253–258.
  38. Bamber D. The area above the ordinal dominance graph and the area below the receiver operating characteristic graph. Journal of Mathematical Psychology 1975; 12:387–415.
  39. Hanley JA, McNeil BJ. The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology 1982; 143(1):29–36. pmid:7063747
  40. Harrell FE, Lee KL, Mark DB. Tutorial in biostatistics: multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors. Statistics in Medicine 1996; 15:361–387. pmid:8668867
  41. Pencina MJ, D’Agostino RB. Overall C as a measure of discrimination in survival analysis: model specific population value and confidence interval estimation. Statistics in Medicine 2004; 23:2109–2123. pmid:15211606
  42. Cook NR. Use and Misuse of the receiver operating characteristics curve in risk prediction. Circulation 2007; 115:928–935. pmid:17309939
  43. Newson, R. B. Interpretation of Somers’ D under four simple models. Retrieved on 25 June 2017 from: http://www.imperial.ac.uk/nhli/r.newson/papers.htm#miscellaneous_documents.
  44. Akaike H. Information theory and an extension of the maximum likelihood principle. In Petrov B. N & Csaki F (Eds.). Second international symposium on information theory (p. 267–281). Budapest, Hungary: Akademiai Kiado, 1973.
  45. Schwarz G. Estimating the dimension of a model. Annals of Statistics, 1978, 6, 461–464.
  46. Yang Y. Can the strengths of AIC and BIC be shared? Biometrika 2005, 92; 4: 937–950.
  47. Dziak JJ, Coffman DL, Lanza ST, Li R. Sensitivity and specificity of information criteria. The Methodology Center. The Pennsylvania State University. Technical Report Series #12–119. 2012. Retrieved on 12/29/2019 from https://www.methodology.psu.edu/files/2019/03/12-119-2e90hc6.pdf
  48. Pencina MJ, D’Agostino RB Sr, D’Agostino RB Jr. Evaluating the added predictive ability of a new marker: From area under the ROC curve to reclassification and beyond. Stat Med 2008; 27:157–72. pmid:17569110
  49. Noyes K, Liu H, Temkin-Greener H. Medicare capitation model, functional status, and multiple comorbidities: model accuracy. Am J Manag Care. 2008 Oct;14 (10):679–90. pmid:18837646
  50. Iezzoni LI. Risk adjustment for measuring health care outcomes. Chicago, IL: Health Administration Press; 2013.
  51. Vertrees J, Averill R, Eisenhandler J, et al. The ability of event-based episodes to explain variation in charges and Medicare payments for various post acute service bundles. Contractor report for Medicare Payment Advisory Commission. Washington, DC, 2013.
  52. Centers for Medicare and Medicaid. (2018a). Functional Measures. Retrieved on August 15, 2018 from: https://www.cms.gov/Medicare/Quality-Initiatives-Patient-Assessment-Instruments/Post-Acute-Care-Quality-Initiatives/Functional-Measures-.html
  53. Centers for Medicare and Medicaid. (2018b). IMPACT Act of 2014 data standardization & cross setting measures. Retrieved on August 11, 2018 from: https://www.cms.gov/Medicare/Quality-Initiatives-Patient-Assessment-Instruments/Post-Acute-Care-Quality-Initiatives/IMPACT-Act-of-2014/IMPACT-Act-of-2014-Data-Standardization-and-Cross-Setting-Measures.html
  54. Cheng HG, Phillips MR. Secondary analysis of existing data: opportunities and implementation. Shanghai Arch Psychiatry. 2014; 26(6):371–5. pmid:25642115