Development and psychometric properties of short form of central sensitization inventory in participants with musculoskeletal pain: A cross-sectional study

Tomohiko Nishigami; Katsuyoshi Tanaka; Akira Mibu; Masahiro Manfuku; Satoko Yono; Akihito Tanabe

doi:10.1371/journal.pone.0200152

Abstract

Background

The central sensitization inventory (CSI) comprises 25 items and is commonly used to measure somatic and emotional symptoms related to central sensitization. CSI was developed as an easy-to-administer screening instrument for patients at high risk of developing central sensitization in whom it was essential to quickly evaluate the condition. The purpose of the present study was to develop a short form of CSI and evaluate its psychometric properties using a contemporary approach called Rasch analysis.

Methods

A total of 505 patients with musculoskeletal disorders were recruited in this study. The CSI, pain intensity, pain interference, and the health-related quality of life (QOL) were evaluated for each participant. The original CSI items were consecutively analyzed using the Rasch model. Successive Rasch analyses were performed until a final set of items satisfied the model fit requirements. We also analyzed the psychometric properties of the original and short forms of CSI.

Results

Four consecutive Rasch analyses identified the removable items. Finally, the shortest questionnaire obtained that maintained the correct psychometric properties based on the Rasch model contained only 9 items (CSI-9). Rasch analysis showed that the CSI-9 had acceptable internal consistency, exhibited unidimensionality, had no notable differential item functioning, and was functional on the category rating scale.

Conclusions

The nine-item short form of CSI has acceptable psychometric properties and is suitable for use for patients with musculoskeletal pain. Thus, the CSI-9 can be used as a brief instrument to evaluate central sensitization.

Citation: Nishigami T, Tanaka K, Mibu A, Manfuku M, Yono S, Tanabe A (2018) Development and psychometric properties of short form of central sensitization inventory in participants with musculoskeletal pain: A cross-sectional study. PLoS ONE 13(7): e0200152. https://doi.org/10.1371/journal.pone.0200152

Editor: Juan V. Luciano, Parc Sanitari Sant Joan de Déu, SPAIN

Received: January 30, 2018; Accepted: June 20, 2018; Published: July 5, 2018

Copyright: © 2018 Nishigami et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: Data sharing is restricted by the Institutional Ethics Committee of Konan Women's University because the data contains information that could be used to identify study participants. Qualified researchers may request data access from the Institutional Ethics Committee of Konan Women's University at gakuken@konan-wu.ac.jp.

Funding: This work was supported by Health Labour Sciences Research Grant Number 17948489, https://mhlw-grants.niph.go.jp, to TN. The funders had a role in the Publication Fees.

Competing interests: The authors have declared that no competing interests exist.

Introduction

The International Association for the Study of Pain defines central sensitization (CS) as an increased responsiveness of nociceptive neurons in the central nervous system to normal or subthreshold afferent input [1]. Several systematic reviews demonstrate that CS plays a significant role in the treatment of patients with osteoarthritis [2], shoulder pain [3], whiplash [4], fibromyalgia [5], and tendinopathy [6]. The Central Sensitization Inventory (CSI) was developed as a screening instrument for CS-related symptoms [7]. The scale comprises 25 items related to the assessment of health-related symptoms that are common to central sensitivity syndromes (CSS), such as restless leg syndrome, chronic fatigue syndrome, fibromyalgia, and temporomandibular joint disorder. Confirmatory factor analysis revealed that a bifactor model containing one general factor and four orthogonal factors (physical symptoms, emotional distress, headache/jaw symptoms, and urological symptoms) fits the CSI structure better than the unidimensional and the 4-factor models [8].

The CSI total scores have been shown to distinguish between patients with chronic pain and control subjects [9, 10]. A higher CSI score is associated with pain-related outcomes [10, 11] and with increased levels of brain-derived neurotrophic factor [12], which contributes to both induction and maintenance of CS [13], and predicts poor long-term postoperative outcomes [14, 15].

CSI has been translated into many languages and has been validated by a variety of practitioners [12, 16–21]. A systematic review revealed that CSI has strong psychometric properties and therefore could be a clinically useful measurement instrument [22]. While CSI comprises 25 items and can be used as a psychometric instrument, it would be easier to use in clinical settings if it were shorter. Shorter versions of other questionnaires, such as the pain catastrophizing scale [23–25], pain beliefs and coping strategies [26–28], and pain-related self-efficacy [29], were developed to facilitate quick screening and to reduce the participant’s burden. A short version of the questionnaire is also preferable because many health professionals have limited time with patients in a clinical setting. CSI was developed as an easy-to-administer screening instrument for patients at higher risk of developing CS [7] in whom it was essential to quickly evaluate the condition. Thus, a brief questionnaire on CSI has the potential to be clinically useful, to be time efficient, and to reduce patient burden in both clinical and research environments. However, it has been noted that shorter questionnaires risk sacrificing reliability. Therefore, short-form measures must be shown to have an acceptable level of reliability and validity.

The Rasch model [30], which is included in the item response theory, can be used to estimate the item or ability parameters and is a way to analyze responses to questionnaires with the goal of improving measurement accuracy and reliability. It is often used to reduce the items covered in questionnaires [31–36]. This model constructs a line of measurement with the items placed hierarchically [37, 38], which permits identification of redundant items to be omitted from the original questionnaire. The present study aimed to develop a short form of CSI and evaluate its psychometric properties.

Methods

Participants

Individuals aged between 20 and 80 years were consecutively recruited and screened by orthopedists to receive physical therapy from an orthopedic clinic. Those with acute (pain that lasts less than 3 months) or chronic (pain that lasts for at least 3 months) musculoskeletal pain disorders (lower back pain, neck pain, shoulder pain, knee pain, ankle pain, and/or hand pain) were included. All participants underwent X-ray examination before receiving physical therapy. At the screening stage by orthopedists, participants with fracture, sprain, cancer, multiple sclerosis, brain or spinal cord injury, history of stroke, or history of psychiatric disease (e.g., schizophrenia, bipolar disorder, or somatoform disorder) as diagnosed by a psychiatrist were excluded.

All evaluations were performed before physical therapy. Of the initial sample of 510 participants, 5 participants were excluded because they did not complete all the items in the questionnaires. A total of 505 individuals were included; all these patients were Japanese. Their characteristics are summarized in Table 1. Patients with musculoskeletal disorders were distributed as follows: 187 patients (37.0%) with lower back pain, 89 (17.6%) with neck pain, 84 (16.6%) with shoulder pain, 82 (16.2%) with knee pain, 42 (8.3%) with ankle pain, and 21 (4.1%) with hand pain. Of all the participants, 333 (65.9%) were women; the mean ± standard deviation (SD) age was 52.4 ± 15.1 years and the mean ± SD pain duration was 22.6 ± 57.4 weeks.

Table 1. Demographic information.

https://doi.org/10.1371/journal.pone.0200152.t001

Ethical approval was obtained from the Institutional Ethics Committee of Konan Women's University. Written informed consent was obtained from all the participants before the study, and the study was conducted in accordance with the tenets of the Declaration of Helsinki.

Procedures

All participants were assessed for demographic data (age, gender, height, and weight), pain duration, CSI, health-related quality of life (QOL), pain intensity, and pain interference. CSI-J consists of Parts A and B. Part A consists of a 25-item self-report questionnaire designed to assess health-related symptoms that are common to CSS. Each item is rated on a 5-point Likert-type scale, with total scores ranging from 0 to 100. Part B (which is not scored) asks the participants whether one or more specific disorders, including seven separate CSSs, have been diagnosed previously (restless leg syndrome, chronic fatigue syndrome, fibromyalgia, temporomandibular joint disorder, migraine or tension headaches, irritable bowel syndrome, multiple chemical sensitivities, neck injury (including whiplash), anxiety or panic attacks, and depression).
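The Part A scoring rule described above can be sketched in a few lines. This is only an illustration: the helper name `csi_total` and its input checks are assumptions made here, not part of the published instrument.

```python
from typing import Sequence


def csi_total(responses: Sequence[int], n_items: int = 25) -> int:
    """Sum Part A responses (each coded 0-4) into a total score.

    With 25 items the total ranges from 0 to 100, as stated in the text.
    `csi_total` is a hypothetical helper name used only for illustration.
    """
    if len(responses) != n_items:
        raise ValueError(f"expected {n_items} responses, got {len(responses)}")
    if any(r not in (0, 1, 2, 3, 4) for r in responses):
        raise ValueError("each response must be an integer from 0 to 4")
    return sum(responses)
```

Passing `n_items=9` would apply the same summation to a nine-item form, giving a range of 0 to 36.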

Health-related QOL was measured using the EuroQol 5-dimension (EQ-5D) instrument [39]. EQ-5D was developed as a non-disease-specific standardized instrument, which could be used to complement existing health-related QOL measures [40, 41]. It comprises five dimensions: mobility, self-care, usual activities, pain/discomfort, and anxiety/depression. Each dimension has three grades (no problems, some problems, and extreme problems), and the combination of grades generates a single index value for each health state. These values are numbers on a scale where 1 refers to full health and 0 refers to death.

Pain intensity and interference were measured using the Brief Pain Inventory (BPI) [42, 43]. BPI comprises four pain intensity and seven pain interference items. These items are rated using a scale of 0–10, where 0 = no pain and 10 = worst possible pain. Based on the values obtained, individual pain intensity and interference scores were evaluated by calculating the mean. The validation and clinical utility of BPI has been evaluated for several disorders [44–46].
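As a minimal sketch of the BPI scoring described above, the two subscale scores are the means of the four intensity items and the seven interference items. The function name `bpi_scores` and the validation are illustrative assumptions.

```python
from typing import Sequence, Tuple


def bpi_scores(intensity: Sequence[float],
               interference: Sequence[float]) -> Tuple[float, float]:
    """Return (mean pain intensity, mean pain interference).

    Each item is rated 0-10; `bpi_scores` is a hypothetical helper name.
    """
    if len(intensity) != 4:
        raise ValueError("expected 4 pain intensity items")
    if len(interference) != 7:
        raise ValueError("expected 7 pain interference items")
    for value in list(intensity) + list(interference):
        if not 0 <= value <= 10:
            raise ValueError("each item must be between 0 and 10")
    return sum(intensity) / 4, sum(interference) / 7
```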

Development of a short version of CSI

Rasch analysis.

The original CSI was consecutively analyzed using the Rasch model by Winsteps software (v3.90.2, Chicago, Illinois). Chi-square fit statistics were used to determine how well each CSI item contributed to total CSI scores [47]. The most commonly used chi-squares are known as outfit and infit, which are reported as mean squares (in logits) [48]. Rasch analysis examines the predictability of the data when assessing item quality directly using the mean-square statistics. Excessively large-fit residuals (>1.3 logits) indicate a large difference between the expected and observed performance of an item [47], and may indicate that the item is assessing a construct other than what it was intended to assess. In contrast, excessively small-fit residuals (<0.7 logits) indicate items that behave too predictably [48]. Successive Rasch analyses were performed until a final set of items satisfied the model fit requirements.
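To make the infit and outfit statistics concrete, the following sketch computes both mean squares for a single item. It assumes the dichotomous Rasch model for simplicity, whereas the CSI items are polytomous and were analyzed in Winsteps; the function names are hypothetical, and the 0.7 and 1.3 bounds are the ones given in the text.

```python
import math
from typing import Sequence, Tuple


def rasch_probability(theta: float, difficulty: float) -> float:
    """Probability of endorsing a dichotomous item under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


def item_fit(responses: Sequence[int], thetas: Sequence[float],
             difficulty: float) -> Tuple[float, float]:
    """Return (infit, outfit) mean squares for one item.

    Outfit is the unweighted mean of squared standardized residuals;
    infit weights the squared residuals by the model variance.
    """
    squared_residuals, variances, z_squared = [], [], []
    for x, theta in zip(responses, thetas):
        p = rasch_probability(theta, difficulty)
        var = p * (1.0 - p)              # model variance of the response
        squared_residuals.append((x - p) ** 2)
        variances.append(var)
        z_squared.append((x - p) ** 2 / var)
    infit = sum(squared_residuals) / sum(variances)
    outfit = sum(z_squared) / len(z_squared)
    return infit, outfit


def classify_mean_square(value: float, low: float = 0.7,
                         high: float = 1.3) -> str:
    """Label a mean square using the bounds quoted in the text."""
    if value > high:
        return "too unpredictable"
    if value < low:
        return "too predictable"
    return "acceptable"
```

An item flagged in either direction would be a candidate for removal in the next round of analysis.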

Psychometric properties of a short version of CSI.

We investigated the psychometric properties of the original and shortened questionnaires, and compared the following components:

Category order.

We assessed category order to ensure the Likert scale functioned as expected. The CSI has five response categories (0 = Never, 1 = Rarely, 2 = Sometimes, 3 = Often, 4 = Always). Category probability curves, and average measure and category fit statistics (infit and outfit) were used to explore rating scale functioning. Fit statistics are recommended to be between 0.6 and 1.4 [49, 50]. Moreover, in a well-functioning rating scale, each curve has a distinct peak and 4 clear thresholds that represent the point at which the likelihood of endorsing one category is equal to that of endorsing the next. Disordered thresholds can occur if a category is underutilized or respondents use the categories in an unexpected manner (e.g., respondents cannot differentiate between the categories). In addition, the item characteristic curve (ICC) was plotted. The item is more discriminating where the ICC is steeper, and less discriminating where the ICC is flatter.
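The threshold idea can be illustrated with category probabilities for a five-category item. This sketch assumes the Andrich rating scale formulation, which the text does not name explicitly; the function names and the threshold values in the usage note are hypothetical.

```python
import math
from typing import List, Sequence


def category_probabilities(theta: float, item_location: float,
                           thresholds: Sequence[float]) -> List[float]:
    """Probabilities of categories 0..m given m thresholds.

    The exponent for category k is the sum over j <= k of
    (theta - item_location - threshold_j); category 0 has exponent 0.
    """
    exponents = [0.0]
    for tau in thresholds:
        exponents.append(exponents[-1] + (theta - item_location - tau))
    top = max(exponents)                 # subtract max for numerical stability
    weights = [math.exp(e - top) for e in exponents]
    total = sum(weights)
    return [w / total for w in weights]


def thresholds_ordered(thresholds: Sequence[float]) -> bool:
    """True when each threshold is higher than the one before it."""
    return all(a < b for a, b in zip(thresholds, thresholds[1:]))
```

For example, with thresholds `[-2.0, -1.0, 1.0, 2.0]` and a person located exactly at the first threshold, categories 0 and 1 are equally likely, which is the defining property of a threshold described above.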

Targeting.

Scale-to-sample targeting was assessed by comparing person locations from the sample with item locations of the scale. Their means should be close to 0 logits. In addition, we investigated targeting by visualizing the person-item map and comparing the means of the items and person measures. This was analyzed by plotting the person-item location threshold distribution graph with distributions of persons on the top half of the graph and item thresholds at the bottom half of the graph. Good targeting means that the distributions of persons and distributions of item thresholds match visually.

Internal consistency.

Winsteps provides a person reliability index and Cronbach’s alpha as indicators of internal consistency [51] and both should exceed 0.7 [52]. McDonald’s Omega was also calculated using Mplus 8 (Los Angeles, CA, United States).
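For reference, Cronbach's alpha can be computed directly from a persons-by-items score matrix. This is a generic sketch of the standard formula, not the Winsteps routine; the function name `cronbach_alpha` is an assumption.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(scores: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha; scores[p][i] is person p's response to item i.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score))
    """
    n_items = len(scores[0])
    if n_items < 2 or len(scores) < 2:
        raise ValueError("need at least 2 items and 2 persons")
    item_variances = [variance([row[i] for row in scores])
                      for i in range(n_items)]
    total_variance = variance([sum(row) for row in scores])
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)
```

A value above 0.7 would meet the criterion quoted in the text.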

Unidimensionality.

We intended to summate the CSI to provide an overall measure of CS-related symptomology. Individual items should share this unidimensional construct yet be sufficiently different to warrant their inclusion. Assessment of fit evaluates unidimensionality by identifying items that function unexpectedly. The principal component analysis of residuals (PCA) identifies unexpected clusters of items suggestive of a secondary dimension that could threaten measurement of the primary dimension. The PCA residual correlation matrix was visually inspected to identify clusters of items that would be suggestive of a second dimension. An eigenvalue greater than 2.0 for the PCA of residuals suggests a second dimension [53]. Response dependency between the items was examined by inspecting the residual correlation matrix [52] for pairs of items with correlations exceeding 0.4 [54, 55].
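The response-dependency check amounts to scanning the residual correlation matrix for item pairs above the 0.4 cutoff. The sketch below assumes the standardized residuals have already been exported per item; the function names and item labels are hypothetical.

```python
import math
from itertools import combinations
from typing import Dict, List, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def dependent_pairs(residuals: Dict[str, Sequence[float]],
                    cutoff: float = 0.4) -> List[Tuple[str, str, float]]:
    """Return item pairs whose residual correlation exceeds the cutoff."""
    flagged = []
    for item_a, item_b in combinations(residuals, 2):
        r = pearson(residuals[item_a], residuals[item_b])
        if r > cutoff:
            flagged.append((item_a, item_b, r))
    return flagged
```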

The dimensionality of both versions was also explored using exploratory structural equation modeling (ESEM) with geomin oblique rotation in Mplus 8. We tested various models: from a one-factor model to a five-factor model in the CSI-25 and from a one-factor model to a two-factor model in the CSI-9. A minimum cutoff of 0.95 for Comparative Fit Index (CFI), a maximum cutoff of 0.08 for Root Mean Square Error of Approximation (RMSEA), and a maximum cutoff of 0.08 for Standardized Root Mean Square Residual (SRMR) were considered as indicative of acceptable fit [56–58]. Models with smaller values of Akaike information criterion (AIC) and Bayesian information criterion (BIC) are preferred to those with higher AIC and BIC values.

Person fit.

Participants with outfit residuals greater than 1.5 logits were examined to determine the reason for poor fit. They were compared with those who fit the model using Fisher’s exact test [59] of significance (for gender) or the Mann-Whitney U test (for age, pain intensity, pain duration, and CSI). Response strings of misfitting participants were analyzed to identify patterns in their responses.

Differential item functioning (DIF).

Items should function in a similar manner for all people of similar levels of agreeability. We assessed for DIF across 5 subgroups: gender, age (≤60, >60 years), pain intensity from BPI (≤5, >5), pain duration (≤12, >12 months), and pain interference from BPI (median split: ≤2.75, >2.75). DIF was tested using a Mantel-Haenszel chi-square test with significance level set at p = 0.01 for each item. Item bias was explored if an item yielded a significant difference of greater than 0.5 logits between subgroups [60].

Test-retest reliability.

CSI reliability was assessed using scores obtained from a second round of the questionnaire administered to participants within 2 weeks of the first questionnaire completion. An intraclass correlation coefficient (ICC) 2-way mixed model with absolute agreement was used to determine measurement reliability. ICC_3,1 values 0–0.40 were considered to indicate fair reliability, 0.41–0.60 moderate reliability, 0.61–0.80 substantial reliability, and 0.81–1.00 almost perfect reliability [61].
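A minimal sketch of an ICC(3,1) calculation from test and retest scores follows. It uses the Shrout and Fleiss consistency formula based on two-way ANOVA mean squares, which is an assumption: the text mentions absolute agreement, and statistical packages offer several ICC variants. The function names are hypothetical; the interpretation bands are those quoted above.

```python
from typing import Sequence


def icc_3_1(scores: Sequence[Sequence[float]]) -> float:
    """ICC(3,1), consistency form; scores[s][t] is subject s at occasion t.

    ICC = (MS_rows - MS_error) / (MS_rows + (k - 1) * MS_error)
    """
    n, k = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)


def reliability_label(icc: float) -> str:
    """Map an ICC value to the interpretation bands quoted in the text."""
    if icc <= 0.40:
        return "fair"
    if icc <= 0.60:
        return "moderate"
    if icc <= 0.80:
        return "substantial"
    return "almost perfect"
```

Because this form ignores a constant shift between occasions, a retest that is uniformly one point higher still yields a coefficient of 1; an absolute-agreement variant would penalize that shift.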

Relationship to clinical status.

Correlation analysis was performed using the Statistical Package for Social Sciences Version 25 (IBM SPSS Statistics for Windows, Version 25.0. Armonk, NY: IBM Corp.). Data distribution was tested for normality using the Kolmogorov-Smirnov test. A series of correlations was performed examining the relationships between the original and the short form of CSI total score and pain intensity, interference, and QOL. These correlations were investigated with Spearman’s correlation coefficient.

In addition, the correlation between the original and the short form of CSI total score was investigated by calculating the ICC_3,1 and the 95% confidence interval for agreement. ICC_3,1 values 0–0.40 were considered to indicate fair reliability, 0.41–0.60 moderate reliability, 0.61–0.80 substantial reliability, and 0.81–1.00 almost perfect reliability [61].

Short form of CSI total score groupings by CSS diagnoses.

Five severity levels of the original CSI (subclinical = 0–29, mild = 30–39, moderate = 40–49, severe = 50–59, and extreme = 60–100) were recommended to evaluate symptom severity or assess clinical changes in response to treatment [58]. To determine the level of CSI severity in the shorter version of CSI, the difference in shorter version of CSI scores was compared with the number of CSS (0 vs. 1 vs. 2 vs. 3+) using one-way analysis of variance (ANOVA) and Tukey post-hoc analysis, as was done for the original CSI [62]. CSS diagnoses were determined from subject self-report on CSI part B. P values of <0.05 were considered statistically significant.
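The banding of the original CSI and the grouping by number of CSS diagnoses can be sketched as simple lookups. The function names are hypothetical, and the top band (60–100) is labeled `extreme` here so that the five levels have distinct names.

```python
def csi25_severity(total: int) -> str:
    """Map an original (25-item) CSI total score to a severity level."""
    if not 0 <= total <= 100:
        raise ValueError("CSI-25 total must be between 0 and 100")
    if total <= 29:
        return "subclinical"
    if total <= 39:
        return "mild"
    if total <= 49:
        return "moderate"
    if total <= 59:
        return "severe"
    return "extreme"  # label for the 60-100 band, assumed for this sketch


def css_group(n_diagnoses: int) -> str:
    """Group participants by number of self-reported CSS diagnoses."""
    if n_diagnoses < 0:
        raise ValueError("number of diagnoses cannot be negative")
    return "3+" if n_diagnoses >= 3 else str(n_diagnoses)
```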

The differences in age, pain duration, pain intensity, pain interference, and health-related QOL were also compared with the shortened CSI severity level group using ANOVA and Tukey post-hoc analysis. Gender was compared with the shortened CSI severity level group using the chi-squared test. We adjusted alpha to 0.008 because we undertook six separate analyses.
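The adjusted alpha is consistent with dividing 0.05 by the six comparisons, in the manner of a Bonferroni correction. The text does not name the correction, so that reading is an assumption, as is the helper name.

```python
def bonferroni_alpha(alpha: float, n_tests: int) -> float:
    """Divide the nominal alpha by the number of tests (Bonferroni-style)."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


# Six comparisons: age, pain duration, pain intensity,
# pain interference, health-related QOL, and gender.
adjusted = bonferroni_alpha(0.05, 6)
```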

Results

Table 2 provides a summary of the clinical profile of all participants. Of 505 participants, 377 (74.6%) participants had acute pain and 128 (25.4%) participants had chronic pain.

Table 2. Clinical profiles.

https://doi.org/10.1371/journal.pone.0200152.t002

Development of the shortened questionnaire

Four consecutive Rasch analyses identified the items that could be removed from the questionnaire (Table 3). Finally, the shortest questionnaire that maintained correct psychometric properties based on the Rasch model was obtained. It contained only 9 items (CSI-9). The CSI-9 items include: 1. Unrefreshed in morning; 2. Muscles stiff/achy; 3. Pain all over body; 4. Headaches; 5. Do not sleep well; 6. Difficulty concentrating; 7. Stress makes symptoms worse; 8. Tension in neck and shoulders; 9. Poor memory (S1 Table). Table 4 summarizes the fit statistics for the original CSI (CSI-25) and CSI-9.

Table 3. Item selection from Rasch analysis.

https://doi.org/10.1371/journal.pone.0200152.t003

Table 4. Average item endorsability thresholds, including fit statistics.

https://doi.org/10.1371/journal.pone.0200152.t004

Psychometric assessment of CSI

Targeting.

The sample was not well targeted by either version of CSI. The average person endorsability of CSI-25 and CSI-9 (mean ± SD logits) was -1.42 ± 0.91 and -1.09 ± 1.11, respectively. The item endorsability averages of CSI-25 and CSI-9 (mean ± SD logits) were 0 ± 0.72 and 0 ± 0.79, respectively. Visually, the shifting of person agreeability of CSI-25 and CSI-9 to the left when compared with item endorsability indicated that participants with low levels of central sensitization were not well targeted by the scale (Fig 1).

Fig 1. Item–person threshold map.

A. Original version. B. Nine-item short form. Persons of lesser central sensitization and easier items are located on the left side of the logit scale (i.e. < 0 logits); persons of higher central sensitization and greater difficulty items are located to the right of the logit scale (i.e. > 0 logits). Item endorsability mean is set at 0 logits by default.

https://doi.org/10.1371/journal.pone.0200152.g001

In CSI-25 and CSI-9, 4 (0.7%) and 7 participants (1.3%), respectively, scored zero for all the items. None of the participants for CSI-25 and CSI-9 scored full points on all items.