Disease progression of 213 patients hospitalized with Covid-19 in the Czech Republic in March–October 2020: An exploratory analysis

  • Martin Modrák ,

    Roles Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Software, Supervision, Validation, Visualization, Writing – original draft, Writing – review & editing

    martin.modrak@biomed.cas.cz

    Affiliation Bioinformatics Core Facility, Institute of Microbiology of the Czech Academy of Sciences, Prague, Czech Republic

  • Paul-Christian Bürkner,

    Roles Formal analysis, Investigation, Software, Writing – review & editing

    Affiliation Cluster of Excellence SimTech, University of Stuttgart, Stuttgart, Deutschland, Germany

  • Tomáš Sieger,

    Roles Formal analysis, Investigation, Software, Writing – review & editing

    Affiliation Dept. of Cybernetics, Faculty of Electrical Engineering, Czech Technical University in Prague, Prague, Czech Republic

  • Tomáš Slisz,

    Roles Data curation, Writing – review & editing

    Affiliations Department of Respiratory Medicine, 1st Faculty of Medicine, Charles University in Prague, Prague, Czech Republic, Thomayer University Hospital, Prague, Czech Republic

  • Martina Vašáková,

    Roles Supervision, Writing – review & editing

    Affiliations Department of Respiratory Medicine, 1st Faculty of Medicine, Charles University in Prague, Prague, Czech Republic, Thomayer University Hospital, Prague, Czech Republic

  • Grigorij Mesežnikov,

    Roles Data curation, Writing – review & editing

    Affiliation Motol University Hospital, Prague, Czech Republic

  • Luis Fernando Casas-Mendez,

    Roles Data curation, Writing – review & editing

    Affiliation Motol University Hospital, Prague, Czech Republic

  • Jaromír Vajter,

    Roles Data curation, Writing – review & editing

    Affiliation Motol University Hospital, Prague, Czech Republic

  • Jan Táborský,

    Roles Data curation, Writing – review & editing

    Affiliation AGEL Hospital Nový Jičín, Nový Jičín, Czech Republic

  • Viktor Kubricht,

    Roles Data curation, Writing – review & editing

    Affiliation Kralovské Vinohrady University Hospital, Prague, Czech Republic

  • Daniel Suk,

    Roles Conceptualization, Data curation, Writing – review & editing

    Affiliation General University Hospital in Prague, Prague, Czech Republic

  • Jan Horejsek,

    Roles Data curation, Writing – review & editing

    Affiliation General University Hospital in Prague, Prague, Czech Republic

  • Martin Jedlička,

    Roles Data curation, Writing – review & editing

    Affiliation Military Hospital Olomouc, Olomouc, Czech Republic

  • Adriana Mifková,

    Roles Data curation, Writing – review & editing

    Affiliation Military Hospital Olomouc, Olomouc, Czech Republic

  • Adam Jaroš,

    Roles Conceptualization, Data curation, Methodology, Writing – review & editing

    Affiliation Na Homolce Hospital, Prague, Czech Republic

  • Miroslav Kubiska,

    Roles Data curation, Writing – review & editing

    Affiliation Department of Infectious Diseases and Travel Medicine, Faculty of Medicine in Pilsen, Charles University, University Hospital in Pilsen, Pilsen, Czech Republic

  • Jana Váchalová,

    Roles Data curation, Writing – review & editing

    Affiliation Department of Infectious Diseases and Travel Medicine, Faculty of Medicine in Pilsen, Charles University, University Hospital in Pilsen, Pilsen, Czech Republic

  • Robin Šín,

    Roles Data curation, Writing – review & editing

    Affiliation Department of Infectious Diseases and Travel Medicine, Faculty of Medicine in Pilsen, Charles University, University Hospital in Pilsen, Pilsen, Czech Republic

  • Markéta Veverková,

    Roles Data curation, Writing – review & editing

    Affiliation Hořovice Hospital, Hořovice, Czech Republic

  • Zbyšek Pospíšil,

    Roles Data curation, Writing – review & editing

    Affiliation Třebíč Hospital, Třebíč, Czech Republic

  • Julie Vohryzková,

    Roles Data curation, Writing – review & editing

    Affiliation 2nd Faculty of Medicine, Charles University in Prague, Prague, Czech Republic

  • Rebeka Pokrievková,

    Roles Data curation, Writing – review & editing

    Affiliation 3rd Faculty of Medicine, Charles University in Prague, Prague, Czech Republic

  • Kristián Hrušák,

    Roles Data curation, Writing – review & editing

    Affiliation 2nd Faculty of Medicine, Charles University in Prague, Prague, Czech Republic

  • Kristína Christozova,

    Roles Data curation, Writing – review & editing

    Affiliation 2nd Faculty of Medicine, Charles University in Prague, Prague, Czech Republic

  • Vianey Leos-Barajas,

    Roles Formal analysis, Methodology, Supervision, Writing – review & editing

    Affiliation Department of Statistical Sciences, University of Toronto, Toronto, Canada

  • Karel Fišer,

    Roles Supervision, Writing – review & editing

    Affiliation Department of Bioinformatics, 2nd Faculty of Medicine, Charles University in Prague, Prague, Czech Republic

  • Tomáš Hyánek

    Roles Conceptualization, Methodology, Supervision, Writing – review & editing

    Affiliation Na Homolce Hospital, Prague, Czech Republic


Abstract

We collected a multi-centric retrospective dataset of patients (N = 213) who were admitted to ten hospitals in the Czech Republic and tested positive for SARS-CoV-2 during the early phases of the pandemic in March–October 2020. The dataset contains baseline patient characteristics, breathing support required, pharmacological treatment received and multiple markers at daily resolution. Patients in the dataset were treated with hydroxychloroquine (N = 108), azithromycin (N = 72), favipiravir (N = 9), convalescent plasma (N = 7), dexamethasone (N = 4) and remdesivir (N = 3), often in combination. To explore the association between treatments and patient outcomes we performed a multiverse analysis, observing how the conclusions change between defensible choices of statistical model, predictors included in the model and other analytical degrees of freedom. Weak evidence to constrain the potential efficacy of azithromycin and favipiravir can be extracted from the data. Additionally, we performed external validation of several proposed prognostic models for Covid-19 severity, showing that they mostly perform unsatisfactorily on our dataset.

Introduction

The Covid-19 pandemic caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has, as of June 2021, led to over 172 million cases and over 3.7 million deaths. The present study was designed and conducted during March–October 2020, when the Czech Republic experienced a relatively mild first wave of the pandemic due to early and strict lockdowns. Low numbers of cases continued throughout the summer, but during September and October, after most of the data collection for this study concluded, the number of cases was rising again. On October 1st 2020, the Czech Republic had accumulated 74283 total confirmed Covid-19 cases and 704 confirmed Covid-19 related deaths [1].

At the time the study was conducted, the proposed treatments included antivirals approved for other indications (chloroquine, hydroxychloroquine, lopinavir/ritonavir, remdesivir, favipiravir, umifenovir), azithromycin, corticosteroids, immunoglobulins, tocilizumab and convalescent plasma [2, 3]. Notably, the anti-malarial and anti-rheumatic drug hydroxychloroquine and the macrolide antibiotic azithromycin showed promise in early data, were broadly available and thus were frequently used in the early stages of the pandemic. Remdesivir, previously designed and approved for Ebola, SARS and MERS, also reported good initial results. However, during spring and summer 2020 remdesivir was available in the Czech Republic only in limited amounts via a compassionate use programme. The RECOVERY trial reported positive results of the corticosteroid dexamethasone for severe cases in June 2020 [4], but at this point the number of Covid-19 patients hospitalized in the Czech Republic was low and dexamethasone thus did not see wider use until later in the pandemic.

Our understanding of the efficacy of Covid-19 treatments has improved substantially since the present study was conducted. As of April 2021, the pharmacological treatments that were deemed to be beneficial for at least one outcome in a systematic review of randomized trials were the corticosteroid dexamethasone (mortality, mechanical ventilation), colchicine (mortality, length of hospital stay), the antiviral remdesivir (mechanical ventilation), Janus kinase inhibitors (mechanical ventilation, duration of ventilation), IL-6 inhibitors (mechanical ventilation, length of hospital stay), the antiviral favipiravir (length of hospital stay, resolution of symptoms) and the anti-androgen proxalutamide (admission to hospital). Hydroxychloroquine, interferon beta, lopinavir-ritonavir, azithromycin, vitamin C, vitamin D, anticoagulants and ACE inhibitors were considered to not be better than standard of care and lopinavir-ritonavir showed evidence of harm, although most of the conclusions were considered to be of low certainty [5]. Interestingly, in observational studies, hydroxychloroquine was often found to be associated with better outcomes [6–8]. Likewise, no benefit was observed in a meta-analysis of randomized trials of convalescent plasma treatment [9].

High IL-6 and D-dimer values were observed to be associated with worse outcome and increased disease severity [10]. A large study of electronic health records [11] showed an increase in C-reactive protein in early disease and an increase of D-dimer and white blood cell count in later stages of the disease.

An ongoing challenge in evaluating Covid-19 treatments is that the analysis and interpretation of the data are often inappropriate or misleading, most notably when a lack of evidence due to small sample size is interpreted as evidence of no effect [12, 13].

Additionally, many methods for predicting disease severity of Covid-19 were published, but the methods are at high risk of bias and lack external validation [14].

The present study aims to describe the outcomes and disease course of hospitalized patients with mild to severe clinical presentation in a multicentric Czech cohort during the early stages of the pandemic, explore the association between the outcomes and pharmacological interventions and to provide external validation to previously published prognostic models for Covid-19 severity.

Methods

Patients and study design

A convenience sample of patients from 10 sites was collected. The study sites span the whole spectrum of sizes, from large university hospitals in major cities with multiple dedicated Covid-19 wards (Thomayer University Hospital in Prague, Motol University Hospital in Prague, Kralovské Vinohrady University Hospital in Prague, General University Hospital in Prague, University Hospital in Pilsen) through major regional/specialized hospitals (Na Homolce Hospital in Prague, Military Hospital Olomouc) to smaller hospitals caring for just several Covid-19 patients at a time (AGEL Hospital Nový Jičín, Hořovice Hospital, Třebíč Hospital). The sites were chosen based on availability and willingness of the personnel to participate in data collection. None of the study sites was exclusively dedicated to treating Covid-19 patients. For each site, the dataset contains all patients hospitalized in the participating wards over the data collection period. The data collection started at the onset of the Covid-19 pandemic in March 2020 (except for one site where some older records were inaccessible), but the end date for collection differs between sites due to time constraints of the participating physicians. Three sites included a total of 23 patients who could be considered part of the “second wave” (admitted after September 1st). The last patient included in the dataset was admitted on October 12th. See S1 Fig for per-site data collection periods. Patients over the age of 18 were included if they had PCR-confirmed infection of SARS-CoV-2 and were not participating in a clinical trial of any Covid-19 pharmacotherapy.

Not all patients developed pneumonia or other symptoms of Covid-19. All patients received the standard of care which could include supplemental oxygen and ventilation and antibiotics for bacterial superinfections, as determined by the attending physician. Some patients were not indicated for all treatment modalities (especially mechanical ventilation) based on decision of the attending physician and underlying patient condition. We note that the participating sites were not homogeneous in either patient population or treatment protocols. The choice of pharmacological treatment was based on the decision of the attending clinician and its availability.

The study was approved by the Ethical committees of General University Hospital, Hospital Nový Jíčín, Motol University Hospital, Thomayer Hospital, University Hospital Vinohrady, Military Hospital Olomouc, Na Homolce Hospital, University Hospital in Pilsen, Hořovice Hospital and Jihlava Hospital. All data were collected in fully anonymized form. Data were collected between June and October 2020 for patients who were treated between March and October 2020.

Data collection

We collected data on comorbidities and information about disease progression at daily resolution, including breathing support required, oxygen flow rate, experimental anti-Covid-19 and antimicrobial drugs taken and several laboratory markers (PCR positivity for SARS-CoV-2, C-reactive protein, D-dimer, Interleukin 6, Ferritin, lymphocyte count). The full protocol for data collection is attached in S1 File and the data collection table in S2 File. Due to the very low number of patients using extra-corporeal membrane oxygenation (N = 1) or non-invasive positive pressure ventilation (N = 6) in our sample, we merged those categories with mechanical ventilation.

Statistical analysis

The character of the convenience sample does not allow for a proper assessment of the association between treatments and patient outcomes, because the treatments had not been assigned to patients at random but were only observed retrospectively. This can be partially remedied by adjusting for patient characteristics in the analysis, but such adjustments will always be imperfect and the analysis needs to be treated as exploratory and interpreted cautiously.

Since many details of the analysis may influence the conclusions made, we performed a multiverse analysis [15] and report results for all the hypotheses tested across multiple different models using both frequentist and Bayesian paradigms. For each model class we worked with several possible sets of adjustments. All analyses were performed in the R language [16]; visualization and data cleaning were run with the tidyverse package [17].

The first class of models comprises frequentist survival and multistate models under the proportional hazards assumption, as implemented in the coxph function from the survival package [18]. We primarily use a model with competing risks for death and discharge from hospital (see Fig 1a).

Fig 1. States used in the competing risk model (a) and in the two hidden Markov model variants (b,c).

AA = Ambient air, Oxygen = Nasal oxygen, Ventilated = any form of ventilation (non-invasive positive-pressure ventilation, mechanical ventilation and extra-corporeal membrane oxygenation). In all models the ‘Death’ and ‘Discharged’ states are terminal. In the second hidden Markov model (c), the ‘Improving’ and ‘Worsening’ variants of each non-terminal state are not observable—only the breathing support is observed and improving/worsening is inferred from progression of the disease.

https://doi.org/10.1371/journal.pone.0245103.g001

The second class of models comprises Bayesian hidden Markov models (HMM) of disease progression implemented via a custom extension to the brms package [19]. The parametrization of the HMM is inspired by Williams et al. [20]: the actual process of disease is assumed to be continuous and to allow only for transitions between neighboring states (as shown in Fig 1b and 1c). The total probability of transition between any two states over the period of a day is then computed as the total probability of transition across all possible paths. This class of models does not satisfy the proportional hazards assumption; instead, it is assumed that the process has the Markov property, i.e., that the (potentially unobserved) state and the covariates at a given day contain all the information needed to determine the probabilities of the states on the next day. We use two versions of such models, one working solely with the observed breathing support and one assuming a hidden improving/worsening distinction. All of the hidden Markov models take into account whether best supportive care was initiated and a patient was thus not indicated to progress to more intensive treatment modalities.
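The step from a continuous process with neighbor-only transitions to daily transition probabilities can be sketched as a matrix exponential of a rate matrix. The following is a minimal illustration in Python (the study itself used R and brms); the state ordering, the rate values and all function names are assumptions made for this example and are not the fitted model from S3 File.

```python
# Hypothetical linear chain of states; the actual layout is given in Fig 1b.
STATES = ["Discharged", "AA", "Oxygen", "Ventilated", "Death"]


def rate_matrix(improve, worsen):
    """Build a generator matrix Q with transitions only between neighbours.

    improve[i] is the rate from state i to state i-1, worsen[i] the rate
    from state i to state i+1. Terminal states keep all-zero rows.
    """
    n = len(STATES)
    q = [[0.0] * n for _ in range(n)]
    for i in range(1, n - 1):
        q[i][i - 1] = improve[i]
        q[i][i + 1] = worsen[i]
        q[i][i] = -(improve[i] + worsen[i])
    return q


def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_exp(q, t=1.0, squarings=10, terms=12):
    """Approximate exp(Q * t) by scaling and squaring a truncated Taylor series."""
    n = len(q)
    scale = t / (2 ** squarings)
    a = [[q[i][j] * scale for j in range(n)] for i in range(n)]
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in mat_mul(term, a)]  # a^k / k!
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


# Made-up daily rates, keyed by the index of the non-terminal state.
improve = {1: 0.10, 2: 0.15, 3: 0.05}
worsen = {1: 0.05, 2: 0.08, 3: 0.04}

Q = rate_matrix(improve, worsen)
daily = mat_exp(Q, t=1.0)  # daily[i][j] = P(state j tomorrow | state i today)
```

Although `Q` has no direct rate from ambient air to ventilation, `daily[1][3]` is positive, because the matrix exponential sums over all paths through the intermediate oxygen state within the day.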

Finally, we used a set of Bayesian regression models implemented with the brms package [19]. Those included overall survival, state at day 7 or 28 as either binary or categorical outcome and a Bayesian version of the Cox proportional-hazards model.

Except for age, sex and comorbidities, all covariates are treated as time-varying, e.g., the effect of taking a drug is only included for the days after the drug was taken. More details on the exact model formulations can be found in the supplementary statistical analysis in S3 File.
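As an illustration of the time-varying encoding, the sketch below expands one patient record into one row per hospital day, with the drug indicator switched on only once the drug has been started. It is written in Python rather than the R used in the study; the field names, the example patient and the choice to count the day of the first dose as exposed (`lag=0`) are all assumptions for this example.

```python
def took_drug_by_day(n_days, first_dose_day, lag=0):
    """Return a 0/1 flag per hospital day (0-based).

    The flag is 1 from day first_dose_day + lag onward and 0 before.
    first_dose_day=None means the drug was never given.
    """
    if first_dose_day is None:
        return [0] * n_days
    return [1 if day >= first_dose_day + lag else 0 for day in range(n_days)]


def to_long_format(patient):
    """Expand a patient record into per-day rows.

    Age stays constant across rows, while the drug indicator varies by day.
    """
    flags = took_drug_by_day(len(patient["breathing"]), patient["first_hcq_day"])
    return [
        {
            "patient": patient["id"],
            "day": day,
            "breathing": breathing,
            "took_hcq": flag,
            "age": patient["age"],
        }
        for day, (breathing, flag) in enumerate(zip(patient["breathing"], flags))
    ]


# Entirely hypothetical patient: hydroxychloroquine started on day 2.
example = {
    "id": "P001",
    "age": 64,
    "first_hcq_day": 2,
    "breathing": ["AA", "Oxygen", "Oxygen", "Oxygen", "AA"],
}
rows = to_long_format(example)
```

Setting `lag=1` would instead count only the days strictly after the first dose; which convention a given model uses should be checked against S3 File.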

Evaluating prognostic models

We searched the living systematic review of Covid-19 prognostic models [14] for those that could be applied to our dataset (i.e., where we have gathered all the input features). We primarily focused on the Area Under Receiver Operating Characteristic Curve (AUC), and its bootstrapped 95% confidence intervals which we computed using the pROC package [21]. When there were multiple reasonable ways to evaluate the outcome or a predictor in our dataset, we computed and reported all of those options. We used two simple scores with age or the decade of age as the sole predictor to have a baseline to compare the scores against.
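The evaluation metric can be sketched as follows. This is a minimal Python illustration, not the pROC implementation used in the study: the AUC is computed as the probability that a randomly chosen patient who died has a higher score than a randomly chosen survivor, and the confidence interval is a percentile bootstrap that resamples the two outcome groups separately (stratified resampling is an assumption here). The toy ages and outcomes are invented for the example.

```python
import random


def auc(scores, outcomes):
    """AUC as P(score of a positive > score of a negative), ties counted as 0.5."""
    pos = [s for s, y in zip(scores, outcomes) if y == 1]
    neg = [s for s, y in zip(scores, outcomes) if y == 0]
    if not pos or not neg:
        raise ValueError("both outcome classes are required")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(scores, outcomes, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for the AUC, resampling within each class."""
    rng = random.Random(seed)
    pos = [s for s, y in zip(scores, outcomes) if y == 1]
    neg = [s for s, y in zip(scores, outcomes) if y == 0]
    stats = []
    for _ in range(n_boot):
        boot_pos = rng.choices(pos, k=len(pos))
        boot_neg = rng.choices(neg, k=len(neg))
        stats.append(auc(boot_pos + boot_neg,
                         [1] * len(boot_pos) + [0] * len(boot_neg)))
    stats.sort()
    lower = stats[int((1 - level) / 2 * n_boot)]
    upper = stats[int((1 + level) / 2 * n_boot) - 1]
    return lower, upper


# Invented toy data: age as the sole predictor of death.
ages = [34, 45, 51, 58, 62, 67, 70, 74, 79, 83, 88, 91]
died = [0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 1]

age_auc = auc(ages, died)
age_ci = bootstrap_auc_ci(ages, died)
reversed_auc = auc([-a for a in ages], died)  # equals 1 - age_auc
```

Reversing the sign of a score maps an AUC of x to 1 − x, which is why a score with AUC below 0.5 would predict better if reversed.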

Complete code for all analyses is available at https://github.com/cas-bioinf/covid19retrospective/.

Results

Baseline characteristics

In total, we were able to gather data for 213 patients; see Table 1 for the overall characteristics of the patient sample and several subgroups we used in the analysis, including treatments taken. Counts of all treatment combinations are shown in S2 Fig, and S3 Fig shows outcomes by study site, demonstrating quite large hospital-specific differences. The dataset includes 19 patients already reported in a study of inflammatory signatures of Covid-19 [22].

Table 1. Patient characteristics for the overall sample and treatment subgroups.

Note that the favipiravir subgroup is not exclusive with either the HCQ or No HCQ group.

https://doi.org/10.1371/journal.pone.0245103.t001

Disease progression

In Fig 2 we show the overall disease progression for all patients and in Fig 3 we show the time-course of a subset of the markers we have measured. The data show some interesting patterns: patients with low Interleukin-6 or D-dimer values are overrepresented among patients with better outcomes, most patients had high CRP upon admission and for many the CRP levels stayed elevated over the whole hospitalization. However, the limited nature of the data does not allow for any statistically robust conclusions. We also see that the marker levels were not substantially stratified by study site. Those patterns should however be interpreted with care due to systematic missingness of the data—in particular, patients that fared worse were probably more likely to have the markers measured. However, we believe this kind of patient-level view is useful to appreciate the extent of both between-patient and within-patient variability.

Fig 2. Disease progression for all patients included in the study as determined by breathing support required.

Each vertical strip is a single patient, the ordering on the horizontal axis is by disease severity. Ventilated = any form of ventilation (non-invasive positive-pressure ventilation, mechanical ventilation and extra-corporeal membrane oxygenation).

https://doi.org/10.1371/journal.pone.0245103.g002

Fig 3. Values of selected markers over the course of the disease.

Each line represents a patient, stratified by the worst breathing support required over the hospitalization. Color indicates study sites. The vertical scale is logarithmic. Ventilated = any form of ventilation (non-invasive positive-pressure ventilation, mechanical ventilation and extra-corporeal membrane oxygenation), CRP = C-reactive protein [mg/l], D-dimer [ng/ml DDU], Ly = lymphocyte count [10^9/l], IL-6 = Interleukin 6 [ng/l].

https://doi.org/10.1371/journal.pone.0245103.g003

The between-patient variability is notable even across outcomes—when ordering the patients by the highest CRP levels experienced throughout the hospital stay, the top 20% of patients that breathed ambient air for the whole hospitalization experienced higher levels than the bottom 20% of patients that required ventilation or died. This overlap is even larger when comparing only against the patients that died and D-dimer, Interleukin-6 and lymphocyte count also show a notably larger overlap than CRP (S4 Fig).

Association between patients’ characteristics and treatments

As noted above, the nature of the convenience sample did not enforce random assignment of treatments to patients. In fact, patients with worse baseline characteristics, which lead to worse outcomes, were less likely to receive hydroxychloroquine (see Fig 4). This clearly creates a bias towards a positive effect of hydroxychloroquine on the outcome (and potentially for other treatments as well—most were used in combination with hydroxychloroquine), which, however, could be false.

Fig 4. The choice of treatment with hydroxychloroquine seemed to be associated with the status of patients upon hospitalization.

Comorbidities were associated with both worse outcome (black) and lower chance of treatment with hydroxychloroquine (red). Dots and lines represent the estimates and the 95% confidence intervals of the log odds ratio of the respective outcome. HCQ: hydroxychloroquine, IHD: ischemic heart disease, HD: hypertension drugs, HF: heart failure history, COPD: chronic obstructive pulmonary disease, LungD: other lung disease, Dia: diabetes, RenalD: renal disease, LiverD: liver disease, HighCr: creatinine above 115 for males or above 97 for females, HighInr: Prothrombin time (Quick test) as International Normalized Ratio above 1.2, LowAlb: albumin in serum/plasma below 36 g/l.

https://doi.org/10.1371/journal.pone.0245103.g004

Taken quantitatively, the comorbidities known upon hospitalization were informative with respect to the future hydroxychloroquine treatment: the score representing the cumulative presence of ischemic heart disease, hypertension drugs, former heart failure, COPD, other lung diseases, renal disease, or high creatinine was associated with a lower chance of taking hydroxychloroquine over the course of the hospitalization (the chance was only 79.9%, 95% confidence interval (65.3, 97)%, Chi-square test in the logistic regression model, χ2 = 5.18, df = 1, P = 0.023).

Association between treatments and outcomes

Here, we focus on hydroxychloroquine and azithromycin as those are the only treatments with larger sample size. We also investigate favipiravir as it is less well reported in the literature. Hydroxychloroquine was dosed almost exclusively in a 5-day regime starting with a loading dose of 800 mg on the first day and followed by 400 mg. The majority of patients complemented hydroxychloroquine with azithromycin while azithromycin was rarely used alone (see Table 1). Azithromycin was most frequently dosed at 250 or 500 mg/day, but doses ranging from 100 mg/day to 1500 mg/day were observed. Favipiravir was used only at one site with a loading dose of 3600 mg on the first day, followed by at most 9 days with a 1600 mg dose. All but one of the patients receiving favipiravir also received hydroxychloroquine. Treatment was initiated mostly within two days of admission (see S5 Fig).

The results of the multiverse analysis for the association between hydroxychloroquine, azithromycin and favipiravir usage and death are shown in Fig 5. Here, we only show models that were not found to have immediate problems representing the data well or computational issues (see S3 File for details). Results for all models we tested are reported in S6–S8 Figs, with additional details in S3 File. The results do not change noticeably when only patients from the first wave are included (S6–S8 Figs).

Fig 5. Estimates of model coefficients for association between treatments and main outcomes.

Each row represents a model: Categorical 7/28 = Bayesian categorical regression for state at day 7/28, Bayes Cox = Bayesian version of the Cox proportional hazards model with a binary outcome, Cox (single) = frequentist Cox model with a binary outcome, Cox (competing) = frequentist Cox model using competing risks (as in Fig 1a), HMM A = Bayesian hidden-Markov model as in Fig 1b with predictors for rate groups, HMM B = Bayesian hidden-Markov model as in Fig 1b with predictors for individual rates, HMM C = Bayesian hidden-Markov model as in Fig 1c. For frequentist models, we show the maximum likelihood estimate and 95% confidence interval; for Bayesian models we show the posterior mean and 95% credible interval. The estimands are either log odds-ratio (Categorical, HMM) or log hazard ratio (Cox variants). In all cases a coefficient <0 means better patient outcome in the treatment group. Vertical lines indicate zero (blue) and a substantial increase or decrease with odds or hazard ratio of 3:2 or 2:3 (green). Additionally, the factors the model adjusted for are listed: Site = the study site, admitted = Admitted for Covid-19, Supportive = best supportive care initiated, Comorb. = total number of comorbidities, AZ = took azithromycin, HCQ = took hydroxychloroquine, FPV = took favipiravir, C. plasma = received convalescent plasma.

https://doi.org/10.1371/journal.pone.0245103.g005
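
For orientation, the 3:2 and 2:3 ratios marked in Fig 5 translate to the coefficient scale as follows, assuming the coefficients are reported as natural logarithms (the usual convention for log odds and log hazard ratios, but an assumption here):

```latex
\log\frac{3}{2} \approx 0.405, \qquad \log\frac{2}{3} = -\log\frac{3}{2} \approx -0.405
```

Under that assumption, a coefficient below about −0.405 would correspond to at least a one-third reduction in the odds or hazard.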

Most models report that using hydroxychloroquine is associated with lower risk of death. We must however bear in mind the potential bias noted in the previous section. Also, we see that for the HMM models, as we add adjustments the credible intervals do not widen but instead shift towards zero. This is a weak indication that further adjustments could drive the effect towards zero. We did not attempt to model additional adjustments as the models became computationally unstable. The case of hydroxychloroquine serves as a “control group” for our other results: since randomized trials give us high confidence that hydroxychloroquine does not substantially reduce mortality, we can be quite certain the associations we observe for hydroxychloroquine are just a measure of bias in the data. Additionally, our models either cannot determine the sign of the association between azithromycin and risk of death or even show an increase in risk of death. This serves as weak evidence that a substantial improvement in mortality from azithromycin is unlikely.

Most models exclude very strong association between increased risk of death and using favipiravir, but our data are necessarily quite limited, which is reflected in the very wide uncertainties around estimates. We also cannot put strict bounds on the association between favipiravir and length of hospitalization.

We also examined the association between treatments and length of hospital stay for all the patients that survived. Almost no model can discern the sign of the association for any of the treatments examined (S6–S8 Figs). Similarly, we studied the association between D-dimer and Interleukin 6 and outcomes, with inconclusive results as well (S9 Fig).

Published prognostic models are not better than using age as the sole predictor of outcome

Following Wynants et al. [14] we found five prediction models we were able to recompute: Li et al. report the ACP index [23] combining CRP and age to form 3 grades, Chen & Liu [24] report a continuous score using age, CRP, D-dimer and lymphocyte count, Shi et al. [25] use age, sex and hypertension to form 4 grades, Caramelo et al. use age, sex, hypertension, diabetes, cardiac disease, chronic respiratory disease and cancer to form a continuous score [26], and Bello-Chavolla et al. [27] use age, diabetes, obesity, pneumonia, chronic kidney disease, COPD and immunosuppression to build a score ranging from -6 to 22. For the latter two scores we had to impute some of the predictors as they had no immediate equivalent in our dataset. The outcomes present in the studies were 12-day mortality, 30-day mortality and mortality without any further details; here we report results for both 12-day and 30-day mortality. Full details on the scores and how we used our dataset to compute them are given in S3 File.

All prognostic models we tested performed similarly to or notably worse than using age as the only predictor and also worse than originally reported (Fig 6). Additionally, some publications did not provide enough detail to unambiguously reconstruct how the score and/or outcome was assessed. We thus concur with Wynants et al. [14] that reported prediction scores are at high risk of bias and need additional careful evaluation.

Fig 6. Performance of tested prediction scores as measured by AUC.

AUC = 1 means perfect prediction while AUC ≤ 0.5 means that the score is worse than random guess and a better prediction would be obtained by reversing the score (marked by thick blue line). The line ranges represent the bootstrapped 95% confidence intervals. Red dots show results computed in present study—model variants (horizontal axis) vary in the outcome measured (12-day or 30-day mortality, severe disease) and potentially on how ambiguities in score computation were resolved, although this rarely makes a big difference—see S3 File for details. Cyan triangles show AUC as reported by the original authors or recomputed based on their published data. When the confidence interval or the AUC of the original study is not shown, it means that the value was not reported by the authors and not enough information to recompute it was given.

https://doi.org/10.1371/journal.pone.0245103.g006

Discussion

Our data show the extent of between-patient variability in progression of the disease in terms of both length of hospital stay, duration of various types of breathing support and basic markers. A direct comparison with other studies is hard to perform as almost always only summaries of measurements are reported.

For multiple candidate Covid-19 treatments, observational data repeatedly contradicted results of randomized controlled trials (contrast e.g. Catteau et al. [6] to the RECOVERY trial [28] for hydroxychloroquine and Liu et al. [29] to Agarwal et al. [30] for convalescent plasma). Our results for hydroxychloroquine also fit into this pattern. This should make us wary about over-interpreting the results of this study for azithromycin and favipiravir, although some higher-quality evidence that suggests clinical benefit of favipiravir has been reported [5].

The current (April 2021) Covid-19 treatment guidelines in the Czech Republic recommend monoclonal antibodies and in some cases convalescent plasma or favipiravir as early treatment for high-risk patients with mild or no symptoms. For more severe cases dexamethasone and anticoagulants are recommended, while remdesivir is recommended only for patients who have severe disease but do not require mechanical ventilation [31]. This is similar to recommendations from the National Institutes of Health in the USA, which additionally recommend tocilizumab in some cases while not recommending convalescent plasma and favipiravir [32]. We do not believe our results should directly inform clinical decisions, though we see some potential for inclusion of our results in future meta-analyses.

Regarding methodology, there are multiple approaches that are, at least in principle, capable of deriving causal conclusions from observational data, most notably the DAG framework [33, 34] and the target trial framework [35, 36]. In all approaches, and also in some randomized designs, substantial uncertainty about the best statistical model for the task at hand commonly remains and cannot be eliminated. Nevertheless, most published papers present results from only a single statistical model. We believe that this uncertainty about model choice should not be ignored. Rather, we should embrace it and employ a multiverse analysis or other forms of robustness checks to explore how our conclusions would differ had different assumptions been adopted. In this work we tried to show how such an analysis could be performed and reported in practice. We note that modelling choices that are often made semi-arbitrarily, e.g., logistic regression for survival at 28 days vs. a Cox proportional hazards model for time to event, did in our case lead to substantially different results.
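
The bookkeeping behind a multiverse analysis can be sketched in a few lines of Python. This is only an illustration of the idea, not the pipeline used in the study: the lists of model families, adjustment sets and patient subsets are hypothetical and loosely named after the choices listed in the supporting figures, and `fit` is an assumed user-supplied function that returns an estimate with its interval on a common scale.

```python
from itertools import product

# Hypothetical analysis choices; not the actual grid used in the study.
MODEL_FAMILIES = ["logistic_day28", "cox_single", "cox_competing", "hmm"]
ADJUSTMENT_SETS = [
    ("site",),
    ("site", "supportive_care"),
    ("site", "supportive_care", "comorbidities"),
]
SUBSETS = ["all_patients", "first_wave_only"]

def enumerate_specifications():
    """Return every combination of analysis choices as a dict."""
    return [
        {"family": family, "adjust_for": adjust, "subset": subset}
        for family, adjust, subset in product(MODEL_FAMILIES, ADJUSTMENT_SETS, SUBSETS)
    ]

def run_multiverse(fit, specifications):
    """Fit every specification and keep all results side by side.

    `fit` (an assumption of this sketch) takes one specification and returns
    (estimate, lower, upper), or raises an exception if the model fails.
    """
    results = []
    for spec in specifications:
        try:
            estimate, lower, upper = fit(spec)
            results.append({**spec, "estimate": estimate, "lower": lower,
                            "upper": upper, "status": "ok"})
        except Exception as err:  # keep failures visible instead of dropping them
            results.append({**spec, "status": f"failed: {err}"})
    return results

def conclusions_agree(results, threshold=0.0):
    """True if all fitted intervals lie on the same side of `threshold`."""
    fitted = [r for r in results if r["status"] == "ok"]
    all_below = all(r["upper"] < threshold for r in fitted)
    all_above = all(r["lower"] > threshold for r in fitted)
    return bool(fitted) and (all_below or all_above)

specifications = enumerate_specifications()
print(len(specifications), "specifications to fit")
```

Reporting every row of `run_multiverse`, including the models that failed or fitted poorly, is what keeps the sensitivity to modelling assumptions visible to the reader.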

The hidden-Markov models (HMMs) used in this study are of some interest because they fitted the data well and allowed for the inclusion of a larger number of predictors than the Cox proportional hazards model without making the posterior uncertainty unreasonably large. We believe this is because such models make better use of the available detailed data. Additionally, HMMs can be used even when the outcomes are observed only indirectly or noisily, as in the original study that motivated our models, which concerns the progression of Alzheimer’s disease [20]. Noisily observed outcomes can pose problems for the proportional hazards model and require some special care [37, 38]. We should, however, note that the Markov property assumed in HMMs is likely to be a reasonable approximation in fewer settings than the proportional hazards assumption of the Cox model.
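
To make the role of the Markov property and of noisy observations concrete, the following Python sketch implements the standard forward algorithm for a generic discrete-time HMM. It is not the Bayesian multistate model fitted in this study; the two hidden states, the observation coding and all probabilities below are hypothetical values chosen for illustration.

```python
import math

def forward(initial, transition, emission, observations):
    """Forward algorithm for a discrete-time hidden Markov model.

    initial[i]       : P(state at day 0 = i)
    transition[i][j] : P(next state = j | current state = i), the Markov property
    emission[i][k]   : P(observation = k | state = i), the noisy measurement
    Returns the log-likelihood of the observations and the probability of
    each hidden state after the last observation.
    """
    if not observations:
        raise ValueError("need at least one observation")
    n = len(initial)
    log_likelihood = 0.0
    alpha = None
    for t, obs in enumerate(observations):
        if t == 0:
            alpha = [initial[j] * emission[j][obs] for j in range(n)]
        else:
            alpha = [
                sum(alpha[i] * transition[i][j] for i in range(n)) * emission[j][obs]
                for j in range(n)
            ]
        norm = sum(alpha)
        if norm == 0.0:
            raise ValueError("observations have zero probability under the model")
        log_likelihood += math.log(norm)  # rescaling avoids numerical underflow
        alpha = [a / norm for a in alpha]
    return log_likelihood, alpha

# Hypothetical example: state 0 = stable, state 1 = deteriorating;
# observation 0 = no breathing support recorded, 1 = breathing support recorded.
initial = [0.8, 0.2]
transition = [[0.9, 0.1], [0.3, 0.7]]
emission = [[0.9, 0.1], [0.2, 0.8]]
daily_records = [0, 1, 1, 0]

log_lik, state_probs = forward(initial, transition, emission, daily_records)
print(log_lik, state_probs)
```

The transition matrix encodes the assumption that the next state depends only on the current one, while the emission matrix allows the recorded data to disagree with the true state, which is how an HMM accommodates indirectly observed outcomes.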

Common problems with prognostic models in medicine are small sample sizes used to develop the models, weak or problematic statistical methods, and a lack of external validation on independent datasets [39]. Those problems are also prevalent in prediction models for Covid-19 [14]. We used our dataset to validate several models and observed very poor performance for four out of the five models tested. In fact, the simple assumption that older patients tend to have worse outcomes provides better or similar results to all of the models we were able to implement. This is despite all of the scores including age as a predictor. There seem to be two causes. First, three of the models dichotomize age into just two groups, which is known to lose information [40, 41]. Second, of the other two models, Chen & Liu [24] use stepwise variable selection, which is known to be a problematic approach [42]; the resulting model puts the largest relative weight on laboratory markers and de-emphasizes age. Caramelo et al. [26] take the decade of age as a very strong predictor and perform the best on our data. Still, their results are not distinguishable from just using age. Our findings are almost the same as in a similar but larger validation study using 22 models and 411 patients from the United Kingdom, where no tested model provided better prediction for mortality than age alone [43].
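
The information lost by dichotomizing age can be illustrated on simulated data. The Python sketch below uses a purely synthetic cohort, not the study data: the age range, the logistic relation between age and death, and the cut-off of 65 years are all assumptions chosen for illustration, and the printed AUC values say nothing about real patients.

```python
import math
import random

def auc(scores, outcomes):
    """Probability that a random positive case outscores a random negative
    case (ties count one half)."""
    pos = [s for s, y in zip(scores, outcomes) if y == 1]
    neg = [s for s, y in zip(scores, outcomes) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def simulate_cohort(n=1000, seed=42):
    """Synthetic patients whose risk of death rises smoothly with age."""
    rng = random.Random(seed)
    ages, died = [], []
    for _ in range(n):
        age = rng.uniform(30, 95)
        risk = 1 / (1 + math.exp(-0.08 * (age - 75)))  # made-up risk curve
        ages.append(age)
        died.append(1 if rng.random() < risk else 0)
    return ages, died

ages, died = simulate_cohort()
candidates = {
    "age in years": ages,
    "decade of age": [int(a // 10) for a in ages],
    "age >= 65 (dichotomized)": [1 if a >= 65 else 0 for a in ages],
}
results = {name: auc(score, died) for name, score in candidates.items()}
for name, value in results.items():
    print(f"{name}: AUC = {value:.3f}")
```

In such a simulation the dichotomized score typically discriminates worse than age in years or decade of age, because all patients on the same side of the cut-off become indistinguishable. The same comparison, run on real validation data, is the kind of simple baseline against which a published score can be checked.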

Conclusions

We provide very weak observational evidence against a substantial beneficial effect of using azithromycin (either with or without hydroxychloroquine) and against a substantial negative effect of using favipiravir in hospitalized Covid-19 patients. We also observed better outcomes associated with taking hydroxychloroquine, which is likely linked to substantial confounding by indication. Where our results contradict randomized trials, the most likely explanation is systematic bias in our dataset.

A lesson from our analysis is that the assessment of treatment efficacy from observational data is sensitive to modelling assumptions, while it is usually almost impossible to determine which of the models (if any) is more likely to reflect reality. We believe that multiverse analysis is an appropriate way to explore data in such contexts, as it lets us be transparent about this sensitivity. We further believe that hidden Markov models are a promising complement to the standard Cox proportional hazards analysis when detailed information on disease progression is available, particularly because they let us impose additional structure on the model and thus make inferences with more disease states than would be possible to handle in the Cox framework, making better use of the available data.

Additionally, our experience indicates that a substantial fraction of published prognostic models will perform much worse on new patients than on the datasets they were built for and that external validation is crucial. We suggest that comparing the prognostic models against simple baselines (e.g., decade of age as the single predictor) should be a first step in validation. Furthermore, some of the published scores lack enough information to let others implement the score in the same way.

Supporting information

S1 Fig. Data collection timeline.

Data collection periods at individual sites, showing the range of admission dates of patients included in the study. Note that we cannot provide additional information to link the sites here with data shown elsewhere as that would increase the risk of deanonymization of the patients.

https://doi.org/10.1371/journal.pone.0245103.s001

(TIFF)

S2 Fig. Treatment combinations.

Upset plot of treatment combinations: each vertical bar displays the number of patients that received the combination indicated by filled dots in the matrix. Horizontal bars show the total number of patients receiving the given treatment.

https://doi.org/10.1371/journal.pone.0245103.s002

(TIFF)

S3 Fig. Outcomes per site.

Number of patients and outcomes at the individual sites. The numbers above bars are the exact counts. Hospitalized = still hospitalized at the end of data collection at the site or transferred to another site and lost to follow-up. Sites are anonymized to preserve patient privacy.

https://doi.org/10.1371/journal.pone.0245103.s003

(TIFF)

S4 Fig. Markers and outcomes.

Density plots of worst marker values per patient, stratified by worst condition experienced by the patient. For each patient that had a given marker measured, the worst value was taken. Additionally the patients are classified by the worst condition (regardless of the timing relative to the worst marker levels). For each set of patients and marker an empirical density plot of the worst marker values is shown.

https://doi.org/10.1371/journal.pone.0245103.s004

(TIFF)

S5 Fig. Treatment onset.

Histogram of the timing of first treatment relative to admission into one of the study sites. Two patients initiated treatment before admission, which is shown as negative numbers.

https://doi.org/10.1371/journal.pone.0245103.s005

(TIFF)

S6 Fig. Association of HCQ with outcomes.

Estimates of model coefficients for association between hydroxychloroquine and main outcomes. The “Suspicious” section shows models that were found to not fit the data well or to have computational issues; see the supplementary statistical analysis for details. Each row represents a model: Categorical All/7/28 = Bayesian categorical regression for state at last observed day/day 7/day 28, Binary All/7/28 = Bayesian logistic regression for state at last observed day/day 7/day 28, Bayes Cox = Bayesian version of the Cox proportional hazards model with a binary outcome, Cox (single) = frequentist Cox model with a binary outcome, Cox (competing) = frequentist Cox model using competing risks (as in Fig 1a), HMM A = Bayesian hidden-Markov model as in Fig 1b with predictors for rate groups, HMM B = Bayesian hidden-Markov model as in Fig 1b with predictors for individual rates, HMM C = Bayesian hidden-Markov model as in Fig 1c. For frequentist models, we show the maximum likelihood estimate and 95% confidence interval; for Bayesian models, we show the posterior mean and 95% credible interval. The estimands are either log odds-ratio (Categorical, HMM) or log hazard ratio (Cox variants) or log ratio of mean duration of hospitalization (HMM duration). In all cases, a coefficient <0 means better patient outcome in the treatment group. Vertical lines indicate zero (blue) and a substantial increase or decrease with odds or hazard ratio of 3:2 or 2:3 (green). Additionally, the factors the model adjusted for are listed: Site = the study site, admitted = Admitted for Covid-19, Supportive = best supportive care initiated, Comorb. = total number of comorbidities, AZ = took azithromycin, HCQ = took hydroxychloroquine, FPV = took favipiravir, C. plasma = received convalescent plasma, first wave = only patients admitted before September 1st were included.

https://doi.org/10.1371/journal.pone.0245103.s006

(TIFF)

S7 Fig. Association of azithromycin with outcomes.

Estimates of model coefficients for association between azithromycin and main outcomes. The “Suspicious” section shows models that were found to not fit the data well or to have computational issues; see the supplementary statistical analysis for details. Each row represents a model: Categorical All/7/28 = Bayesian categorical regression for state at last observed day/day 7/day 28, Binary All/7/28 = Bayesian logistic regression for state at last observed day/day 7/day 28, Bayes Cox = Bayesian version of the Cox proportional hazards model with a binary outcome, Cox (single) = frequentist Cox model with a binary outcome, Cox (competing) = frequentist Cox model using competing risks (as in Fig 1a), HMM A = Bayesian hidden-Markov model as in Fig 1b with predictors for rate groups, HMM B = Bayesian hidden-Markov model as in Fig 1b with predictors for individual rates, HMM C = Bayesian hidden-Markov model as in Fig 1c. For frequentist models, we show the maximum likelihood estimate and 95% confidence interval; for Bayesian models, we show the posterior mean and 95% credible interval. The estimands are either log odds-ratio (Categorical, HMM) or log hazard ratio (Cox variants) or log ratio of mean duration of hospitalization (HMM duration). In all cases, a coefficient <0 means better patient outcome in the treatment group. Vertical lines indicate zero (blue) and a substantial increase or decrease with odds or hazard ratio of 3:2 or 2:3 (green). Additionally, the factors the model adjusted for are listed: Site = the study site, admitted = Admitted for Covid-19, Supportive = best supportive care initiated, Comorb. = total number of comorbidities, AZ = took azithromycin, HCQ = took hydroxychloroquine, FPV = took favipiravir, C. plasma = received convalescent plasma, first wave = only patients admitted before September 1st were included.

https://doi.org/10.1371/journal.pone.0245103.s007

(TIFF)

S8 Fig. Association of favipiravir with outcomes.

Estimates of model coefficients for association between favipiravir and main outcomes. The “Suspicious” section shows models that were found to not fit the data well or to have computational issues; see the supplementary statistical analysis for details. Each row represents a model: Categorical All/7/28 = Bayesian categorical regression for state at last observed day/day 7/day 28, Binary All/7/28 = Bayesian logistic regression for state at last observed day/day 7/day 28, Bayes Cox = Bayesian version of the Cox proportional hazards model with a binary outcome, Cox (single) = frequentist Cox model with a binary outcome, Cox (competing) = frequentist Cox model using competing risks (as in Fig 1a), HMM A = Bayesian hidden-Markov model as in Fig 1b with predictors for rate groups, HMM B = Bayesian hidden-Markov model as in Fig 1b with predictors for individual rates, HMM C = Bayesian hidden-Markov model as in Fig 1c. For frequentist models, we show the maximum likelihood estimate and 95% confidence interval; for Bayesian models, we show the posterior mean and 95% credible interval. The estimands are either log odds-ratio (Categorical, HMM) or log hazard ratio (Cox variants) or log ratio of mean duration of hospitalization (HMM duration). In all cases, a coefficient <0 means better patient outcome in the treatment group. Vertical lines indicate zero (blue) and a substantial increase or decrease with odds or hazard ratio of 3:2 or 2:3 (green). Additionally, the factors the model adjusted for are listed: Site = the study site, admitted = Admitted for Covid-19, Supportive = best supportive care initiated, Comorb. = total number of comorbidities, AZ = took azithromycin, HCQ = took hydroxychloroquine, FPV = took favipiravir, C. plasma = received convalescent plasma, first wave = only patients admitted before September 1st were included.

https://doi.org/10.1371/journal.pone.0245103.s008

(TIFF)

S9 Fig. Association of markers with outcomes.

Estimates of model coefficients (log hazard ratios) for association between markers and death. The “Suspicious” section shows models that were found to not fit the data well or to have computational issues, and the “Problematic” section shows models that were completely broken; see the supplementary statistical analysis for details. Each row represents a model: Cox (competing) = frequentist Cox model using competing risks (as in Fig 1a), HMM A = Bayesian hidden-Markov model as in Fig 1b with predictors for rate groups, JM = Bayesian joint longitudinal and time-to-event model. For frequentist models, we show the maximum likelihood estimate and 95% confidence interval; for Bayesian models, we show the posterior mean and 95% credible interval. Additionally, the factors the model adjusted for are listed: Site = the study site, Supportive = best supportive care initiated, HCQ = took hydroxychloroquine.

https://doi.org/10.1371/journal.pone.0245103.s009

(TIFF)

S2 File. MS Excel table used for data collection.

https://doi.org/10.1371/journal.pone.0245103.s011

(XLSX)

S3 File. Supplementary statistical analysis.

Contains details on all statistical models and procedures used.

https://doi.org/10.1371/journal.pone.0245103.s012

(PDF)

References

  1. COVID-19 v ČR: Otevřené datové sady a sady ke stažení [Internet]. Ministry of Health of the Czech Republic; Available: https://onemocneni-aktualne.mzcr.cz
  2. Sanders JM, Monogue ML, Jodlowski TZ, Cutrell JB. Pharmacologic Treatments for Coronavirus Disease 2019 (COVID-19): A Review. JAMA. 2020;323: 1824–1836. pmid:32282022
  3. Tobaiqy M, Qashqary M, Al-Dahery S, Mujallad A, Hershan AA, Kamal MA, et al. Therapeutic management of patients with COVID-19: a systematic review. Infection Prevention in Practice. 2020;2: 100061. pmid:34316558
  4. Horby P, Lim WS, Emberson J, Mafham M, Bell J, Linsell L, et al. Effect of Dexamethasone in Hospitalized Patients with COVID-19—Preliminary Report [Internet]. Infectious Diseases (except HIV/AIDS); 2020 Jun. Available: http://medrxiv.org/lookup/doi/10.1101/2020.06.22.20137273
  5. Siemieniuk RA, Bartoszko JJ, Ge L, Zeraatkar D, Izcovich A, Kum E, et al. Drug treatments for covid-19: living systematic review and network meta-analysis. BMJ. 2020;370. pmid:32732190
  6. Catteau L, Dauby N, Montourcy M, Bottieau E, Hautekiet J, Goetghebeur E, et al. Low-dose hydroxychloroquine therapy and mortality in hospitalised patients with COVID-19: a nationwide observational study of 8075 participants. International Journal of Antimicrobial Agents. 2020; 106144. pmid:32853673
  7. Del Valle DM, Kim-Schulze S, Huang H-H, Beckmann ND, Nirenberg S, Wang B, et al. An inflammatory cytokine signature predicts COVID-19 severity and survival. Nature Medicine. 2020; 1–8. pmid:32839624
  8. Castelnuovo AD, Costanzo S, Antinori A, Berselli N, Blandi L, Bruno R, et al. Use of hydroxychloroquine in hospitalised COVID-19 patients is associated with reduced mortality: Findings from the observational multicentre Italian CORIST study. European Journal of Internal Medicine. 2020;
  9. Janiaud P, Axfors C, Schmitt AM, Gloy V, Ebrahimi F, Hepprich M, et al. Association of Convalescent Plasma Treatment With Clinical Outcomes in Patients With COVID-19: A Systematic Review and Meta-analysis. JAMA. 2021;325: 1185. pmid:33635310
  10. Maeda T, Obata R, Rizk DO D, Kuno T. The association of interleukin-6 value, interleukin inhibitors, and outcomes of patients with COVID-19 in New York City. J Med Virol. 2020; jmv.26365. pmid:32720702
  11. Brat GA, Weber GM, Gehlenborg N, Avillach P, Palmer NP, Chiovato L, et al. International electronic health record-derived COVID-19 clinical course profiles: the 4CE consortium. npj Digital Medicine. Nature Publishing Group; 2020;3: 1–9. pmid:32864472
  12. Hood K, Dahly D, Wilkinson J. Statistical review of Efficacy and safety of lopinavir/ritonavir or arbidol in adult patients with mild/moderate COVID-19: an exploratory randomized controlled trial. 2020;
  13. Hood K, Goulao B, Dahly D, Yap C. Statistical review of Remdesivir in adults with severe COVID-19: a randomised, double-blind, placebo-controlled, multicentre trial [Internet]. Zenodo; 2020 May. Available: https://zenodo.org/record/3819778#.X1yBlotS-Uk
  14. Wynants L, Calster BV, Collins GS, Riley RD, Heinze G, Schuit E, et al. Prediction models for diagnosis and prognosis of covid-19: systematic review and critical appraisal. BMJ. 2020;369. pmid:32265220
  15. Steegen S, Tuerlinckx F, Gelman A, Vanpaemel W. Increasing Transparency Through a Multiverse Analysis. Perspect Psychol Sci. 2016;11: 702–712. pmid:27694465
  16. R Core Team. R: A Language and Environment for Statistical Computing [Internet]. 2020. Available: https://www.R-project.org/
  17. Wickham H, Averick M, Bryan J, Chang W, McGowan L, François R, et al. Welcome to the Tidyverse. JOSS. 2019;4: 1686.
  18. Therneau TM, Grambsch PM. Modeling survival data: extending the Cox model. New York: Springer; 2000.
  19. Bürkner P-C. Advanced Bayesian Multilevel Modeling with the R Package brms. The R Journal. 2018;10: 395.
  20. Williams JP, Storlie CB, Therneau TM, Jack CR Jr, Hannig J. A Bayesian Approach to Multistate Hidden Markov Models: Application to Dementia Progression. Journal of the American Statistical Association. 2020;115: 16–31.
  21. Robin X, Turck N, Hainard A, Tiberti N, Lisacek F, Sanchez J-C, et al. pROC: an open-source package for R and S+ to analyze and compare ROC curves. BMC Bioinformatics. 2011;12: 77. pmid:21414208
  22. Parackova Z, Zentsova I, Bloomfield M, Vrabcova P, Smetanova J, Klocperk A, et al. Disharmonic Inflammatory Signatures in COVID-19: Augmented Neutrophils’ but Impaired Monocytes’ and Dendritic Cells’ Responsiveness. Cells. Multidisciplinary Digital Publishing Institute; 2020;9: 2206. pmid:33003471
  23. Lu J, Hu S, Fan R, Liu Z, Yin X, Wang Q, et al. ACP risk grade: a simple mortality index for patients with confirmed or suspected severe acute respiratory syndrome coronavirus 2 disease (COVID-19) during the early stage of outbreak in Wuhan, China. medRxiv. 2020;
  24. Chen X, Liu Z. Early prediction of mortality risk among severe COVID-19 patients using machine learning. medRxiv. 2020;
  25. Shi Y, Yu X, Zhao H, Wang H, Zhao R, Sheng J. Host susceptibility to severe COVID-19 and establishment of a host risk score: findings of 487 cases outside Wuhan. Critical Care. 2020;24: 108. pmid:32188484
  26. Caramelo F, Ferreira N, Oliveiros B. Estimation of risk factors for COVID-19 mortality—preliminary results. medRxiv. 2020;
  27. Bello-Chavolla OY, Bahena-López JP, Antonio-Villa NE, Vargas-Vázquez A, González-Díaz A, Márquez-Salinas A, et al. Predicting Mortality Due to SARS-CoV-2: A Mechanistic Score Relating Obesity and Diabetes to COVID-19 Outcomes in Mexico. J Clin Endocrinol Metab. 2020;105: 2752–2761. pmid:32474598
  28. Horby P, Mafham M, Linsell L, Bell JL, Staplin N, Emberson JR, et al. Effect of Hydroxychloroquine in Hospitalized Patients with COVID-19: Preliminary results from a multi-centre, randomized, controlled trial. medRxiv. 2020;
  29. Liu STH, Lin H-M, Baine I, Wajnberg A, Gumprecht JP, Rahman F, et al. Convalescent plasma treatment of severe COVID-19: a propensity score–matched control study. Nat Med. 2020;26: 1708–1713. pmid:32934372
  30. Agarwal A, Mukherjee A, Kumar G, Chatterjee P, Bhatnagar T, Malhotra P, et al. Convalescent plasma in the management of moderate COVID-19 in India: An open-label parallel-arm phase II multicentre randomized controlled trial (PLACID Trial). medRxiv. 2020;
  31. Štefan M, Chrdle A, Husa P, Beneš J, Dlouhý P. Covid-19: diagnostika a léčba [Internet]. Společnost infekčního lékařství ČLS JEP; 2021. Available: https://www.infekce.cz/Covid2019/DPcovid-19_SIL_0421.pdf
  32. COVID-19 Treatment Guidelines Panel. Coronavirus Disease 2019 (COVID-19) Treatment Guidelines [Internet]. National Institutes of Health; Available: https://www.covid19treatmentguidelines.nih.gov/
  33. Greenland S, Pearl J, Robins JM. Causal Diagrams for Epidemiologic Research. Epidemiology. 1999;10: 37–48. pmid:9888278
  34. Arah OA. Analyzing Selection Bias for Credible Causal Inference: When in Doubt, DAG It Out. Epidemiology. 2019;30: 517–520. pmid:31033691
  35. Lodi S, Phillips A, Lundgren J, Logan R, Sharma S, Cole SR, et al. Effect Estimates in Randomized Trials and Observational Studies: Comparing Apples With Apples. American Journal of Epidemiology. 2019;188: 1569–1577. pmid:31063192
  36. Hernán MA, Sauer BC, Hernández-Díaz S, Platt R, Shrier I. Specifying a target trial prevents immortal time bias and other self-inflicted injuries in observational analyses. Journal of Clinical Epidemiology. 2016;79: 70–75. pmid:27237061
  37. Meier AS, Richardson BA, Hughes JP. Discrete Proportional Hazards Models for Mismeasured Outcomes. Biometrics. 2003;59: 947–954. pmid:14969473
  38. Chen Y, Lawrence J, Hung HMJ, Stockbridge N. Methods for Employing Information About Uncertainty of Ascertainment of Events in Clinical Trials. Ther Innov Regul Sci. 2021;55: 197–211. pmid:32870460
  39. Steyerberg EW, Moons KGM, van der Windt DA, Hayden JA, Perel P, Schroter S, et al. Prognosis Research Strategy (PROGRESS) 3: Prognostic Model Research. PLoS Med. 2013;10: e1001381. pmid:23393430
  40. Altman DG, Royston P. The cost of dichotomising continuous variables. BMJ. 2006;332: 1080.1. pmid:16675816
  41. Thoresen M. Spurious interaction as a result of categorization. BMC Med Res Methodol. 2019;19: 28. pmid:30732587
  42. Smith G. Step away from stepwise. J Big Data. 2018;5: 32.
  43. Gupta RK, Marks M, Samuels THA, Luintel A, Rampling T, Chowdhury H, et al. Systematic evaluation and external validation of 22 prognostic models among hospitalised adults with COVID-19: an observational cohort study. Eur Respir J. 2020;56: 2003498. pmid:32978307