Skip to main content
Advertisement
Browse Subject Areas
?

Click through the PLOS taxonomy to find articles in your field.

For more information about PLOS Subject Areas, click here.

  • Loading metrics

Predicting all-cause and lung cancer mortality using emphysema score progression rate between baseline and follow-up chest CT images: A comparison of risk model performances

  • Anton Schreuder ,

    Roles Conceptualization, Formal analysis, Methodology, Visualization, Writing – original draft, Writing – review & editing

    antonius.schreuder@gmail.com

    Affiliation Diagnostic Image Analysis Group, Department of Radiology and Nuclear Medicine, Radboudumc, Nijmegen, the Netherlands

  • Colin Jacobs,

    Roles Conceptualization, Data curation, Software, Supervision, Visualization, Writing – review & editing

    Affiliation Diagnostic Image Analysis Group, Department of Radiology and Nuclear Medicine, Radboudumc, Nijmegen, the Netherlands

  • Leticia Gallardo-Estrella,

    Roles Software, Writing – review & editing

    Affiliations Diagnostic Image Analysis Group, Department of Radiology and Nuclear Medicine, Radboudumc, Nijmegen, the Netherlands, Thirona, Nijmegen, the Netherlands

  • Mathias Prokop,

    Roles Investigation, Supervision, Writing – review & editing

    Affiliation Diagnostic Image Analysis Group, Department of Radiology and Nuclear Medicine, Radboudumc, Nijmegen, the Netherlands

  • Cornelia M. Schaefer-Prokop,

    Roles Conceptualization, Supervision, Writing – review & editing

    Affiliations Diagnostic Image Analysis Group, Department of Radiology and Nuclear Medicine, Radboudumc, Nijmegen, the Netherlands, Department of Radiology, Meander Medisch Centrum, Amersfoort, the Netherlands

  • Bram van Ginneken

    Roles Conceptualization, Investigation, Supervision, Writing – review & editing

    Affiliations Diagnostic Image Analysis Group, Department of Radiology and Nuclear Medicine, Radboudumc, Nijmegen, the Netherlands, Fraunhofer MEVIS, Bremen, Germany

Abstract

Purpose

Normalized emphysema score is a protocol-robust CT biomarker of mortality. We aimed to improve mortality prediction by including the emphysema score progression rate–its change over time–into the models.

Method and materials

CT scans from 6000 National Lung Screening Trial CT arm participants were included. Of these, 1810 died (445 lung cancer-specific). The remaining 4190 survivors were sampled with replacement up to 24432 to approximate the full cohort. Three overlapping subcohorts were formed which required participants to have images from specific screening rounds. Emphysema scores were obtained after resampling, normalization, and bullae cluster analysis of the original images. Base models contained solely the latest emphysema score. Progression models included emphysema score progression rate. Models were adjusted by including baseline age, sex, BMI, smoking status, smoking intensity, smoking duration, and previous COPD diagnosis. Cox proportional hazard models predicting all-cause and lung cancer mortality were compared by calculating the area under the curve per year follow-up.

Results

In the subcohort of participants with baseline and first annual follow-up scans, the analysis was performed on 4940 participants (23227 after resampling). Area under the curve for all-cause mortality predictions of the base and progression models 6 years after baseline were 0.564 (0.564 to 0.565) and 0.569 (0.568 to 0.569) when unadjusted, and 0.704 (0.703 to 0.704) to 0.705 (0.704 to 0.705) when adjusted. The respective performances predicting lung cancer mortality were 0.638 (0.637 to 0.639) and 0.643 (0.642 to 0.644) when unadjusted, and 0.724 (0.723 to 0.725) and 0.725 (0.725 to 0.726) when adjusted.

Conclusion

Including emphysema score progression rate into risk models shows no clinically relevant improvement in mortality risk prediction. This is because scan normalization does not adjust for an overall change in lung density. Adjusting for changes in smoking behavior is likely required to make this a clinically useful measure of emphysema progression.

Introduction

Visual assessment of emphysema extent in CT images of the lungs is a time-consuming process that requires training and is prone to subjectivity [1]. To solve these issues, a quantitative measure of emphysema was introduced [2]. Emphysema score (ES) is defined as the percentage voxels below -950 HU in the lungs [3, 4]. However, its value as a clinical risk predictor is limited due to its dependence on different imaging protocols and reconstruction algorithms [5, 6]; ES from different centers cannot be compared. Recently, normalized ES was introduced to overcome this issue by applying a normalization algorithm before calculating the ES [7]. It was subsequently shown using images from the National Lung Screening Trial (NLST) CT cohort that mortality risk groups based on normalized ES could better distinguish high- from low-risk individuals compared to those based on the original ES [8, 9]. From this point on in this paper, “normalized ES” is referred to as “ES” unless stated otherwise.

Having shown that ES is a protocol-robust CT biomarker, we wanted to investigate whether mortality predictions would improve if a previous CT image was available, enabling the simple calculation of ES change per year (ES progression rate, EPR). This would be especially relevant in a lung cancer CT screening setting where a large, high-risk group is scanned on a yearly basis; even small improvements in a risk model’s accuracy can have affect many people. Compared to having measurements at only one time point, disease prognosis should become more accurate when the disease progression rate is known. We hereby hypothesized that EPR could be used as an additional variable to improve the prediction accuracy of all-cause mortality and lung cancer mortality. In the situation where more than one prior CT image is available, it was assumed that a larger absolute difference in ES resulting from more time between the scans would decrease the proportion associated with within-participant random measurement errors. Our second hypothesis was that measuring EPR between scans separated by a larger time interval would improve the risk prediction accuracy.

Materials and methods

Scans and data

Permission to use anonymized scans and data from the NLST study was obtained by the National Cancer Institute Cancer Data Access System under project ID NLST-422. In summary, the NLST was a multicenter randomized controlled trial which enrolled 53454 heavy smokers in the USA with the aim to investigate whether annual low-dose chest CT screening significantly reduces the lung cancer mortality compared to chest radiography [8]. The trial took place between 2002 and 2010; participants were screened in three annual rounds (T0, T1, and T2) and followed for another five years for outcome measures. In the CT arm, 26309, 24715, and 24102 participants were screened in T0, T1, and T2, respectively; 1810 participants with at least one CT image available were confirmed dead, of which 445 died specifically of lung cancer (if not endpoint verification certified, is stated as cause of death on the death certificate). Note that published results may differ slightly from those in the database due to updates and corrections, and that incomplete or corrupted images were excluded from the study.

The NLST also kept track of the time between each screening round and the event, enabling survival analysis to be performed (see “Statistical analysis” section). Seven patient characteristics as collected from baseline questionnaires were selected to be included in the prediction model as adjustment factors to test whether ES and EPR are independent risk factors. These known predictors of emphysema were: age (years), sex, BMI (five categories: <18.5, 18.5–25, 25–30, 30–35, and ≥35), smoking status, cigarette smoking intensity (pack years), cigarette smoking duration (years), and previous diagnosis of COPD.

Data selection and subcohorts

The same 6000 participants used by Gallardo-Estrella and colleagues [9] were considered, which consisted of all deceased participants and a random selection of non-deceased participants from the NLST CT arm. Three overlapping study subcohorts were formed based on which screening round images per participant were available: the T0-T1, T0-T2, and T0-T1-T2 subcohorts. The T0-T1 and T0-T2 subcohorts were used to test the first hypothesis: that EPR is of added value for mortality risk prediction regardless of whether EPR was calculated between T0 and T1 or between T0 and T2 scans. The T0-T1-T2 subcohort excluded the largest number of available participants and therefore had a lower statistical power than the prior two cohorts, but a common data set was required to formally test the second hypothesis: that a two-year EPR would improve the prediction accuracy compared to a one-year EPR.

Emphysema quantification

ES was calculated using CIRRUS Lung Quantification (Diagnostic Image Analysis Group; Fraunhofer MEVIS). This process starts with the segmentation of the lungs using an algorithm based on region growing and morphological smoothing [10]. Subsequently, the images are resampled to 3mm slice thickness, normalized [7], and bullae analysis [11] is performed. The percentage of lung voxels less than -950 HU is calculated from the final image as ES [7, 12]. EPR was defined as the difference between the latest available ES and the ES from the prior scan, adjusted by the number of years between the two scans, given by the following equation:

Statistical analysis

Data was collected and organized in Microsoft Excel; statistical analysis was performed using R statistical analysis package V.3.4.3. As we did not have the complete NLST non-deceased CT cohort, this subcohort was resampled to simulate the full NLST CT cohort (n = 24432) before excluding participants to form this study’s three subcohorts (see “Data selection and subcohorts”).

Cox proportional hazards regression analysis was performed (R package “survival”) to predict the risk of two outcomes: all-cause mortality and lung cancer mortality. For each subcohort, a base model and a progression model were created; two progression models were created using the T0-T1-T2 subcohort. The base model contained one variable: the latest ES (at T1 or T2). The progression models additionally included EPR (average ES change per year between T0 and T1, T0 and T2, or T1 and T2).

The analysis was subsequently performed on the adjusted version of the above-mentioned models, which included seven clinical patient characteristics (see section “Scans and data”). In total, 28 models were created; the ES and EPR beta coefficients and hazard ratios including 95% confidence intervals and Wald statistic p-values were recorded for each model. Beta coefficients represent the change in the natural log of the hazard ratio for each unit change in the associated risk factor; the estimated hazard ratio is calculated by multiplying each risk factor with its beta coefficient, summing them, and solving the exponential function of that value. Kaplan-Meier curves were computed with 95% confidence intervals using R package “ggfortify” to visualize the ability of each model to distinguish between different risk group percentiles of participants. The same risk group percentiles were used as Gallardo-Estrella et al. [9] (percentiles 0 to 60, 60 to 80, and 80 to 100), but were further subdivided into more groups of no smaller than 10% of the study population when there was a good visual separation.

The proportional hazards assumption was tested using Schoenfeld residuals (R package “survival”); a p-value below 0.05 is interpretable as a violation of this assumption. Martingale residual plots were used to visually assess whether each continuous variable had a linear relationship with the log hazard (R package “survminer”). The likelihood ratio test, Wald test, and logrank test p-values were obtained for each Cox model (R package “survival”) to test goodness of fit.

To compare the predictive accuracy of each model, two time-dependent statistical methods were applied at each year follow-up. Area under the receiver operating characteristic curve (AUC) was used with the support of R package “ROCR” [13]; estimates were from applying the models to the bootstrap data. A value of 0.5 indicates that the model has no predictive ability. When comparing the AUCs of two models, a value closer to 1 indicates a higher accuracy (granted that the 95% confidence intervals do not encompass the other model’s AUC). Average continuous net reclassification improvement (NRI)–defined as “1/2 NRI(>0)” by Pencina and colleagues [14]–was calculated using R package “survIDINRI” [15, 16]. To calculate NRI, the net percentages of subjects with and without the event of interest correctly reclassified with a higher and lower risk score, respectively, are summed and halved (maximum range: -100% to 100%); positive scores indicate that the new model is more accurate (granted that the 95% confidence intervals do not encompass 0). 95% confidence intervals were obtained from 1000 bootstrap iterations with replacement for both methods.

To simulate the models’ application in a screening setting, we calculated the sensitivity, specificity, positive predictive value, negative predictive value, and the ratio of non-deceased to deceased at several pragmatic percentile cut-off points. 95% confidence intervals were calculating using the Wilson procedure with continuity correction [17].

Results

Study participants

Compared to non-deceased participants, more of the deceased participants were older, male, had a higher BMI, were active smokers with a more extensive smoking history, and were diagnosed with COPD (Table 1). Regarding adjustment factors, 17 participants were missing values for the variable “BMI” and 14 were missing “previous diagnosis of COPD”; these participants were excluded from the analysis.

thumbnail
Table 1. Baseline characteristics of participants with a baseline scan.

https://doi.org/10.1371/journal.pone.0212756.t001

Out of the 6000 available participants, 4940 (82.3%) were included in the T0-T1 subcohort, 4614 (76.9%) were included in the T0-T2 subcohort, and 4420 (73.7%) were included in the T0-T1-T2 subcohort (Fig 1). The respective percentage of deaths per subcohort were 26.4% (1302/4940), 22.8% (1050/4614), and 22.5% (994/4420); 5.7% (281/4940), 4.7% (215/4614), and 4.6% (202/4420) lung cancer-specific deaths, respectively. The non-deceased participants were first resampled to the full cohort equivalent of 24432 non-deceased participants, then excluded based on each subcohort’s criteria. With both deceased and resampled non-deceased, the final subcohort sizes used for regression analysis were 23227, 22592, and 21679 participants, respectively.

thumbnail
Fig 1. Participant selection flow chart.

NLST, National Lung Screening Trial; T0, baseline screening round; T1, first annual follow-up screening round; T2, second annual follow-up screening round. a CT images may be missing due to the participant skipping a screening round, corrupted images, or incomplete images. Each subsequent exclusion count does not overlap with the previous criteria.

https://doi.org/10.1371/journal.pone.0212756.g001

Among all participants with a baseline scan, the derived ESs for each screening round as well as the EPRs between T0 and T1 and T0 and T2 can be found in the supplementary dataset (S1 Dataset). Figs 2 and 3 show the normalized T0 and T1 CT images with and without ES overlays of two participants, the former with an EPR of 21.6 and the latter -10.3.

thumbnail
Fig 2. Screenshots of a positive emphysema score progression case.

Four screenshots with emphysema score overlays from one participant are displayed. The left-side screenshots are from the baseline scan; the right-side screenshots are from the first annual follow-up scan. The top screenshots are without the emphysema score overlay. This participant had an emphysema score progression rate of 21.6.

https://doi.org/10.1371/journal.pone.0212756.g002

thumbnail
Fig 3. Screenshots of a negative emphysema score progression case.

Four screenshots with emphysema score overlays from one participant are displayed. The left-side screenshots are from the baseline scan; the right-side screenshots are from the first annual follow-up scan. The top screenshots are without the emphysema score overlay. This participant had an emphysema score progression rate of -10.3.

https://doi.org/10.1371/journal.pone.0212756.g003

Cox model assumptions

All variables in all models had a p-value greater than 0.05 when testing for the proportional hazard assumption. All models had p<0.0001 for the goodness of fit tests (likelihood ratio test, Wald test, and logrank test). The distribution of ES was right skewed, but attempts to transform this variable (i.e., log transformation, absolute value transformation, square root transformation, and polynomial transformation) did not maintain linearity. Similarly, transforming EPR had little effect due to the small spread. The other continuous adjustment variables (age, pack years, and smoking duration) remained slightly nonlinear regardless of transformation and were ultimately also left untransformed.

Base versus progression models

Table 2 displays the beta coefficients and hazard ratios of the ES and EPR for all models developed from the resampled T0-T1 and T0-T2 subcohorts. ES was a statistically significant variable in all models (p<0.001); this also applied to EPR in the all-cause mortality prediction models. EPR was less accurate for lung cancer mortality in the unadjusted models, and not of added benefit when adjusting for patient characteristics (p>0.05). Beta coefficients and hazard ratios of the adjustment factors are displayed in Tables 3 and 4. The model performances are visualized as Kaplan-Meier curves in Figs 4 through 7.

thumbnail
Fig 4. Kaplan-Meier curves of all-cause mortality survival predictions including 95% confidence intervals (resampled T0-T1 subcohort).

The base models only contain the variable emphysema score; the progression models include both emphysema score and average change of emphysema score per year. Risk groups are divided by risk percentiles and their corresponding hazard ratios as displayed in the key. The resampled T0-T1 subcohort contains participants with both T0 and T1 scans resampled to the original cohort size. Upper-left: unadjusted base model; Upper-right: unadjusted one-year progression model; Lower-left: adjusted base model; Lower-right: adjusted one-year progression model.

https://doi.org/10.1371/journal.pone.0212756.g004

thumbnail
Fig 5. Kaplan-Meier curves of lung cancer mortality survival predictions including 95% confidence intervals (resampled T0-T1 subcohort).

The base models only contain the variable emphysema score; the progression models include both emphysema score and average change of emphysema score per year. Risk groups are divided by risk percentiles and their corresponding hazard ratios as displayed in the key. The resampled T0-T1 subcohort contains participants with both T0 and T1 scans resampled to the original cohort size. Upper-left: unadjusted base model; Upper-right: unadjusted one-year progression model; Lower-left: adjusted base model; Lower-right: adjusted one-year progression model.

https://doi.org/10.1371/journal.pone.0212756.g005

thumbnail
Fig 6. Kaplan-Meier curves of all-cause mortality survival predictions including 95% confidence intervals (resampled T0-T2 subcohort).

The base models only contain the variable emphysema score; the progression models include both emphysema score and average change of emphysema score per year. Risk groups are divided by risk percentiles and their corresponding hazard ratios as displayed in the key. The resampled T0-T2 subcohort contains participants with both T0 and T2 scans resampled to the original cohort size. Upper-left: unadjusted base model; Upper-right: unadjusted two-year progression model; Lower-left: adjusted base model; Lower-right: adjusted two-year progression model.

https://doi.org/10.1371/journal.pone.0212756.g006

thumbnail
Fig 7. Kaplan-Meier curves of lung cancer mortality survival predictions including 95% confidence intervals (resampled T0-T2 subcohort).

The base models only contain the variable emphysema score; the progression models include both emphysema score and average change of emphysema score per year. Risk groups are divided by risk percentiles and their corresponding hazard ratios as displayed in the key. The resampled T0-T2 subcohort contains participants with both T0 and T2 scans resampled to the original cohort size. Upper-left: unadjusted base model; Upper-right: unadjusted two-year progression model; Lower-left: adjusted base model; Lower-right: adjusted two-year progression model.

https://doi.org/10.1371/journal.pone.0212756.g007

thumbnail
Table 2. Unadjusted and adjusted Cox regression beta coefficients and hazard ratios (resampled T0-T1 and T0-T2 subcohorts).

https://doi.org/10.1371/journal.pone.0212756.t002

thumbnail
Table 3. Adjusted Cox regression two-year progression models predicting all-cause and lung cancer mortality (resampled T0-T1 subcohort).

https://doi.org/10.1371/journal.pone.0212756.t003

thumbnail
Table 4. Adjusted Cox regression two-year progression models predicting all-cause and lung cancer mortality (resampled T0-T2 subcohort).

https://doi.org/10.1371/journal.pone.0212756.t004

Time-dependent AUCs and NRIs of the base and progression models are displayed in Tables 5 and 6. For both outcomes and across most time points, the resulting AUCs and NRIs indicated that the progression models had significantly higher accuracies than the base models when not adjusted for patient characteristics. When adjusted for patient characteristics, progression models were not statistically superior to the base models at most time points.

thumbnail
Table 5. Time-dependent receiver operating characteristic area under the curve comparing the emphysema score base models to the progression models predicting all-cause mortality and lung cancer mortality (resampled T0-T1 and T0-T2 subcohorts).

https://doi.org/10.1371/journal.pone.0212756.t005

thumbnail
Table 6. Time-dependent average continuous net reclassification improvement comparing the emphysema score base models to the progression models (resampled T0-T1 and T0-T2 subcohorts).

https://doi.org/10.1371/journal.pone.0212756.t006

Measures of clinical outcomes at several hazard ratio cut-off points (based on percentiles) are displayed in Table 7. Without information on patient characteristics, 80% of the participants had an estimated hazard ratio above 1.00 using both the base and one-year progression models. At this cut-off point, 29.1% (95% confidence interval: 26.7 to 31.7%) of all the deaths within six years were correctly predicted using solely T1 ES; per death, there were 11.3 (10.1 to 12.6) surviving “overdiagnosed” participants. If only 2% of those with the highest risk scores were selected, 4.7% (3.6 to 6.0%) of all deaths would be observed; however, the ratio of non-deceased to deceased would also decrease to 6.6 (4.8 to 9.4). With the addition of EPR, this ratio decreases to 5.4 (4.0 to 7.5); the ratios for the adjusted base and the adjusted one-year progression models are 3.5 (2.6 to 4.6) and 3.3 (2.5 to 4.4), respectively.

thumbnail
Table 7. Models’ clinical outcomes predicting 6-year all-cause mortality at several hazard cut-off points (T0-T1 subcohort).

https://doi.org/10.1371/journal.pone.0212756.t007

One- versus two-year progression models

Two ES progression models were developed from the resampled T0-T1-T2 subcohort: the one- and two-year EPR calculated from T1 and T2 scans and T0 and T2 scans, respectively. With or without adjustment factors, the absolute value of the two-year EPR beta coefficients (unadjusted: -0.08460 [-0.12219 to -0.04702]; adjusted: -0.06288 [-0.10910 to -0.01665]) were more than four times larger than those of one-year EPR (unadjusted: -0.02025 [-0.03142 to -0.00908]; adjusted: -0.01323 [-0.02776 to 0.00130]) when predicting all-cause mortality (Table 8). At most time points, the AUCs of the unadjusted progression models indicated a small gain in accuracy when using the two-year progression model compared to the one-year progression model (Table 9); the difference in AUCs ranged from 0.002 to 0.013. The differences in beta coefficients for predicting lung cancer mortality were smaller, had a larger variance, and did not lead to differences in model performance. No performance differences were seen among the adjusted models’ AUCs. Adjusted or not, the time-dependent NRI comparisons did not show a statistically significant difference between the progression models (Table 10).

thumbnail
Table 8. Unadjusted and adjusted Cox regression beta coefficients and hazard ratios (resampled T0-T1-T2 subcohort).

https://doi.org/10.1371/journal.pone.0212756.t008

thumbnail
Table 9. Time-dependent receiver operating characteristic area under the curve comparing the base models against the progression models (resampled T0-T1-T2 subcohort).

https://doi.org/10.1371/journal.pone.0212756.t009

thumbnail
Table 10. Time-dependent average continuous net reclassification improvement comparing the one-year to the two-year emphysema score progression models (resampled T0-T1-T2 subcohort).

https://doi.org/10.1371/journal.pone.0212756.t010

Discussion

To our best knowledge, no other study has yet attempted to measure pulmonary emphysema progression in CT scan using quantitative methods for potential use as a measure of disease progression over time. Our study demonstrated that when a prior chest CT scan is available, calculating the EPR in addition to the latest ES showed a very minor improvement to the all-cause and lung cancer mortality prediction in a lung cancer screening population. On the other hand, when baseline patient information was available and included in the model–more specifically age, sex, weight, height, smoking history, and disease history–the value of EPR is negated completely. These results are also reflected in Table 7, which shows the predicted clinical outcomes when the population is separated into high versus low risk groups.

Though not an objective of the study, a positive result was that ES remained a highly significant predictor in all adjusted models developed (p < .001); this had not yet been demonstrated in the previous study by Gallardo-Estrella et al. [9]. ES should therefore continue to be considered for inclusion in future multivariable risk prediction models when available.

Among most progression models, it was observed that the absolute value of the models’ two-year EPR beta coefficients were considerably larger than that of one-year EPR. A theory is that that calculating EPR using a larger time gap between two images would reduce the within-participant random measurement errors relative to the actual change in ES. We therefore hypothesized that two-year EPR is a superior mortality biomarker than one-year EPR and developed both models in a common data set where each participant had images from all screening rounds. Indeed, in the resampled T0-T1-T2 subcohort, we found a statistically significant higher AUC for the unadjusted two-year progression model than that of one-year; on the other hand, the magnitude was very small and again completely negated when adjusted for patient characteristics.

It is worth noting that most participants do not have an EPR large enough to have any influence on their risk score. Using a non-parametric test, there was no statistically significant difference in EPR between the deceased and non-deceased groups (Table 1). However, even among heavy smokers (at least 30 pack years, and no more than 15 years since quit among former smokers [8]), only 6.3% (359/5740) had the diagnosis COPD at baseline. The added value may therefore only apply for the small proportion with more severe emphysema. Careful consideration of the Kaplan-Meier curves reveals that the biggest change generally takes place in the highest risk group (p90-p100) (Figs 47). In most graphs, the five- or six-year survival probability of the p90-p100 risk group increases by about 1% and the p80-p90 group decreases by about 1% accompanied by a narrowing of confidence intervals for better separation from the p60-p80 group. Graphs of adjusted models–which could better distinguish between lower risk strata–additionally showed an improved separation between the p60-p70 and p70-p80 groups. In other words, the seemingly negligible benefit of EPR may in part be due to the small number of people that it applies to.

Another discussion point is that emphysema is currently considered permanent and irreversible, so a moderate decrease in ES which is not explainable by random within-participant measurement errors should not be possible. Regardless, the EPR beta coefficients among all progression models were negative, meaning that a decrease in ES since the last scan increases the mortality risk. This is a paradoxical result which has been observed in previous studies [1823]. The current explanation for this is that active smokers’ lungs are in a constant state of inflammation causing a diffuse increase in density of the lung parenchyma; smoking cessation leads to the clearance of the inflammatory component resulting in a general decrease of lung density and revealing areas of decreased lung density below the threshold of being counted as emphysema, hereby increasing the ES. In a study cohort where about half of the participants were diagnosed with COPD, Grydeland and colleagues [18] reported non-normalized ES medians of 9.72 and 4.86 among former and current smokers, respectively. In our study, 40 out of the 4420 participants (0.9%) in our T0-T1-T2 subcohort had a T0 to T1 EPR less than -5 and 103 (2.3%) met these criteria from T1 to T2. On the other side of the spectrum, the numbers with an EPR greater than 5 were 135 (3.1%) and 93 (2.1%), respectively.

In our study, adjusting for whether the participant was an active smoker or not did not change the sign; this information was taken from a baseline questionnaire, hence it is not known whether the participants’ smoking behavior had changed. It may therefore be conjectured that former smokers relapsed after the screening program failed to diagnose them with lung cancer [24]. Scan normalization does not change the overall scan density and can therefore not automatically adjust for changes in inflammation over time. To solve this issue, different density cut-off points may be used depending on the time since smoking abstinence. Another solution may be to adjust for the mean lung density, which would not require information on smoking behavior.

This study has other limitations. Due to the very small spread, it is still not clear what kind of relationship EPR has with mortality risk; experimenting with various transformation methods did not appear to improve the linearity nor the model accuracy (results not included). Our inclusion criteria required participants to have survived one or two years following the baseline scan, which introduced selection bias.

The value of the current study is that it provides a thorough description and reasoning of the unexpected negative findings, which may contribute to ensuing investigations. A future study on validating EPR should either utilize a data set including patient characteristics reported at each scan time point or include another CT measure to adjust for differences in overall lung density.

Conclusions

It is expected that the prognosis accuracy of a disease should improve considerably when the rate of disease progression is known, which can only be recorded when measurements are made at multiple time points. This was not the case in our study, where we have shown that simply calculating the change in normalized emphysema score between two CT images does not clinically improve the prediction of all-cause and lung cancer mortality. We surmised that recent smoking behavior has a crucial impact on the emphysema score: Though the normalization method remains useful for comparing CT images created using different protocols and algorithms, it does not adjust for changes in the overall lung density which, for each individual, is prone to change over time. It is expected that a future study utilizing follow-up clinical information or adjusting for the mean lung density is necessary before emphysema score progression rate may be clinically useful as a quantitative CT biomarker.

Supporting information

S1 Dataset. Dataset of emphysema scores and emphysema score progression rates.

https://doi.org/10.1371/journal.pone.0212756.s001

(CSV)

Acknowledgments

The authors thank the National Cancer Institute for access to NCI data collected by the National Lung Screening Trial.

References

  1. 1. Bankier AA, De Maertelaer V, Keyzer C, Gevenois PA. Pulmonary emphysema: subjective visual grading versus objective quantification with macroscopic morphometry and thin-section CT densitometry. Radiology. 1999;211:851–858. pmid:10352615
  2. 2. Müller NL, Staples CA, Miller RR, Abboud RT. "Density mask". An objective method to quantitate emphysema using computed tomography. Chest. 1988;94:782–787. pmid:3168574
  3. 3. Gevenois PA, Maertelaer V de, De Vuyst P, Zanen J, Yernault JC. Comparison of computed density and macroscopic morphometry in pulmonary emphysema. Am J Respir Crit Care Med. 1995;152:653–657. pmid:7633722
  4. 4. Gevenois PA, De Vuyst P, Maertelaer V de, Zanen J, Jacobovitz D, Cosio MG, et al. Comparison of computed density and microscopic morphometry in pulmonary emphysema. Am J Respir Crit Care Med. 1996;154:187–192. pmid:8680679
  5. 5. Stolk J, Stoel BC. Lung densitometry to assess progression of emphysema in chronic obstructive pulmonary disease: time to apply in the clinic? Am J Respir Crit Care Med. 2011;183:1578–1580. pmid:21693710
  6. 6. Mets OM, Jong PA de, Ginneken B van, Gietema HA, Lammers JWJ. Quantitative computed tomography in COPD: Possibilities and limitations. Lung. 2012;190:133–145. pmid:22179694
  7. 7. Gallardo-Estrella L, Lynch DA, Prokop M, Stinson D, Zach J, Judy PF, et al. Normalizing computed tomography data reconstructed with different filter kernels: effect on emphysema quantification. Eur Radiol. 2016;26:478–486. pmid:26002132
  8. 8. Team NLSTR, Aberle DR, Adams AM, Berg CD, Black WC, Clapp JD, et al. Reduced lung-cancer mortality with low-dose computed tomographic screening. N Engl J Med. 2011;365:395–409. pmid:21714641
  9. 9. Gallardo-Estrella L, Pompe E, Jong PA de, Jacobs C, Rikxoort EM van, Prokop M, et al. Normalized emphysema scores on low dose CT: Validation as an imaging biomarker for mortality. PLoS One. 2017;12:e0188902. pmid:29227997
  10. 10. Rikxoort EM van, Hoop B de, Viergever MA, Prokop M, Ginneken B van. Automatic lung segmentation from thoracic computed tomography scans using a hybrid approach with error detection. Med Phys. 2009;36:2934–2947. pmid:19673192
  11. 11. Blechschmidt RA, Werthschützky R, Lörcher U. Automated CT image evaluation of the lung: a morphology-based concept. IEEE Trans Med Imaging. 2001;20:434–442. pmid:11403202
  12. 12. Wang Z, Gu S, Leader JK, Kundu S, Tedrow JR, Sciurba FC, et al. Optimal threshold in CT quantification of emphysema. Eur Radiol. 2013;23:975–984. pmid:23111815
  13. 13. Heagerty PJ, Lumley T, Pepe MS. Time-dependent ROC curves for censored survival data and a diagnostic marker. Biometrics. 2000;56:337–344. pmid:10877287
  14. 14. Pencina MJ, D’Agostino RB, Steyerberg EW. Extensions of net reclassification improvement calculations to measure usefulness of new biomarkers. Stat Med. 2011;30:11–21. pmid:21204120
  15. 15. Pencina MJ, D’Agostino RB, D’Agostino RB, Vasan RS. Evaluating the added predictive ability of a new marker: from area under the ROC curve to reclassification and beyond. Stat Med. 2008;27:157–72. pmid:17569110
  16. 16. Uno H, Tian L, Cai T, Kohane IS, Wei LJ. A unified inference procedure for a class of measures to assess improvement in risk prediction systems with survival data. Stat Med. 2013;32:2430–2442. pmid:23037800
  17. 17. Wilson EB. Probable inference, the law of succession, and statistical inference. J Am Stat Assoc. 1927;22:209–12.
  18. 18. Grydeland TB, Dirksen A, Coxson HO, Pillai SG, Sharma S, Eide GE, et al. Quantitative computed tomography: emphysema and airway wall thickness by sex, age and smoking. Eur Respir J. 2009;34:858–865. pmid:19324952
  19. 19. Camiciottoli G, Cavigli E, Grassi L, Diciotti S, Orlandi I, Zappa M, et al. Prevalence and correlates of pulmonary emphysema in smokers and former smokers. A densitometric study of participants in the ITALUNG trial. Eur Radiol. 2009;19:58–66. pmid:18690451
  20. 20. Ashraf H, Lo P, Shaker SB, Bruijne M de, Dirksen A, Tønnesen P, et al. Short-term effect of changes in smoking behaviour on emphysema quantification by CT. Thorax. 2011;66:55–60. pmid:20978026
  21. 21. Shaker SB, Stavngaard T, Laursen LC, Stoel BC, Dirksen A. Rapid fall in lung density following smoking cessation in COPD. COPD. 2011;8:2–7. pmid:21299472
  22. 22. Zach JA, Williams A, Jou S-S, Yagihashi K, Everett D, Hokanson JE, et al. Current smoking status is associated with lower quantitative CT measures of emphysema and gas trapping. J Thorac Imaging. 2016;31:29–36. pmid:26429588
  23. 23. Jobst BJ, Weinheimer O, Trauth M, Becker N, Motsch E, Groß M-L, et al. Effect of smoking cessation on quantitative computed tomography in smokers at risk in a lung cancer screening population. Eur Radiol. 2018;28:807–815. pmid:28884215
  24. 24. Tammemägi MC, Berg CD, Riley TL, Cunningham CR, Taylor KL. Impact of lung cancer screening results on smoking cessation. J Natl Cancer Inst. 2014;106:dju084. pmid:24872540